2856 Commits

Author SHA1 Message Date
Rafael J. Wysocki
3a47fbdd1a Merge branch 'thermal-intel'
Merge updates of Intel thermal drivers for v6.10:

 - Add missing MODULE_DESCRIPTION() to multiple files in the
   int340x_thermal and intel_soc_dts_iosf drivers (Srinivas Pandruvada).

 - Adjust the update delay and capabilities-per-event values in the
   Intel HFI thermal driver to prevent it from missing events and allow
   it to process more data in one go (Ricardo Neri).

* thermal-intel:
  thermal: intel: hfi: Increase the number of CPU capabilities per netlink event
  thermal: intel: hfi: Rename HFI_MAX_THERM_NOTIFY_COUNT
  thermal: intel: hfi: Shorten the thermal netlink event delay to 100ms
  thermal: intel: hfi: Rename HFI_UPDATE_INTERVAL
  thermal: intel: Add missing module description
2024-05-10 21:37:57 +02:00
Ricardo Neri
608fa85235 thermal: intel: hfi: Increase the number of CPU capabilities per netlink event
The number of updated CPU capabilities per netlink event is hard-coded to
16. On systems with more than 16 CPUs (a common case), it takes more than
one thermal netlink event to relay all the new capabilities after an HFI
interrupt. This adds unnecessary overhead to both the kernel and user space
entities.

Increase the number of CPU capabilities updated per event to 64. Any system
with 64 CPUs or less can now update all the capabilities in a single
thermal netlink event.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08 14:02:02 +02:00
Ricardo Neri
07c6f3a7ff thermal: intel: hfi: Rename HFI_MAX_THERM_NOTIFY_COUNT
When processing a hardware update, HFI generates as many thermal netlink
events as needed to relay all the updated CPU capabilities to user space.
The constant HFI_MAX_THERM_NOTIFY_COUNT is the number of CPU capabilities
updated per each of those events.

Give this constant a more descriptive name.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08 14:02:02 +02:00
Ricardo Neri
ba1a587ed6 thermal: intel: hfi: Shorten the thermal netlink event delay to 100ms
The delay between an HFI interrupt and its corresponding thermal netlink
event has so far been hard-coded to CONFIG_HZ jiffies (1 second). This
delay is too long for hardware that generates updates every tens of
milliseconds.

The HFI driver uses a delayed workqueue to send thermal netlink events. No
subsequent events will be sent if there is pending work.

As a result, much of the information of consecutive hardware updates will
be lost if the workqueue delay is too long. User space entities may act on
obsolete data. If the delay is too short, multiple events may overwhelm
listeners.

Set the delay to 100ms to strike a balance between too many and too few
events. Use milliseconds instead of jiffies to improve readability.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08 14:02:02 +02:00
Ricardo Neri
564a88eb7a thermal: intel: hfi: Rename HFI_UPDATE_INTERVAL
The name of the constant HFI_UPDATE_INTERVAL is misleading. It is not a
periodic interval at which HFI updates are processed. It is the delay in
the processing of an HFI update after the arrival of an HFI interrupt.

Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08 14:02:02 +02:00
Rafael J. Wysocki
9396b2a669 Merge branch 'thermal-core'
This includes a major rework of thermal governors and part of the
thermal core interacting with them as well as some fixes and cleanups
of the thermal debug code:

 - Redesign the thermal governor interface to allow the governors to
   work in a more straightforward way.

 - Make thermal governors take the current trip point thresholds into
   account in their computations which allows trip hysteresis to be
   observed more accurately.

 - Clean up thermal governors.

 - Make the thermal core manage passive polling for thermal zones and
   remove passive polling management from thermal governors.

 - Improve the handling of cooling device states and thermal mitigation
   episodes in progress in the thermal debug code.

 - Avoid excessive updates of trip point statistics and clean up the
   printing of thermal mitigation episode information.

* thermal-core: (27 commits)
  thermal: core: Move passive polling management to the core
  thermal: core: Do not call handle_thermal_trip() if zone temperature is invalid
  thermal: trip: Add missing empty code line
  thermal/debugfs: Avoid printing zero duration for mitigation events in progress
  thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()
  thermal/debugfs: Create records for cdev states as they get used
  thermal: core: Introduce thermal_governor_trip_crossed()
  thermal/debugfs: Make tze_seq_show() skip invalid trips and trips with no stats
  thermal/debugfs: Rename thermal_debug_update_temp() to thermal_debug_update_trip_stats()
  thermal/debugfs: Clean up thermal_debug_update_temp()
  thermal/debugfs: Avoid excessive updates of trip point statistics
  thermal: core: Relocate critical and hot trip handling
  thermal: core: Drop the .throttle() governor callback
  thermal: gov_user_space: Use .trip_crossed() instead of .throttle()
  thermal: gov_fair_share: Eliminate unnecessary integer divisions
  thermal: gov_fair_share: Use trip thresholds instead of trip temperatures
  thermal: gov_fair_share: Use .manage() callback instead of .throttle()
  thermal: gov_step_wise: Clean up thermal_zone_trip_update()
  thermal: gov_step_wise: Use trip thresholds instead of trip temperatures
  thermal: gov_step_wise: Use .manage() callback instead of .throttle()
  ...
2024-05-06 12:24:30 +02:00
Rafael J. Wysocki
0021102527 Merge back thermal cotntrol material for v6.10. 2024-05-06 12:23:50 +02:00
Srinivas Pandruvada
48d722fd39 thermal: intel: Add missing module description
Fix warnings displayed by "make W=1" build:

WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/intel_soc_dts_iosf.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_rapl.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_rfim.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_mbox.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_wt_req.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_wt_hint.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_power_floor

Reported-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-02 12:51:48 +02:00
Rafael J. Wysocki
042a3d80f1 thermal: core: Move passive polling management to the core
Passive polling is enabled by setting the 'passive' field in
struct thermal_zone_device to a positive value so long as the
'passive_delay_jiffies' field is greater than zero.  It causes
the thermal core to actively check the thermal zone temperature
periodically which in theory should be done after crossing a
passive trip point on the way up in order to allow governors to
react more rapidly to temperature changes and adjust mitigation
more precisely.

However, the 'passive' field in struct thermal_zone_device is currently
managed by governors which is quite problematic.  First of all, only
two governors, Step-Wise and Power Allocator, update that field at
all, so the other governors do not benefit from passive polling,
although in principle they should.  Moreover, if the zone governor is
changed from, say, Step-Wise to Fair-Share after 'passive' has been
incremented by the former, it is not going to be reset back to zero by
the latter even if the zone temperature falls down below all passive
trip points.

For this reason, make handle_thermal_trip() increment 'passive'
to enable passive polling for the given thermal zone whenever a
passive trip point is crossed on the way up and decrement it
whenever a passive trip point is crossed on the way down.  Also
remove the 'passive' field updates from governors and additionally
clear it in thermal_zone_device_init() to prevent passive polling
from being enabled after a system resume just beacuse it was enabled
before suspending the system.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-30 21:16:13 +02:00
Rafael J. Wysocki
202aa0d4bb thermal: core: Do not call handle_thermal_trip() if zone temperature is invalid
Make __thermal_zone_device_update() bail out if update_temperature()
fails to update the zone temperature because __thermal_zone_get_temp()
has returned an error and the current zone temperature is
THERMAL_TEMP_INVALID (user space receiving netlink thermal messages,
thermal debug code and thermal governors may get confused otherwise).

Fixes: 9ad18043fb35 ("thermal: core: Send trip crossing notifications at init time if needed")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-30 21:16:13 +02:00
Rafael J. Wysocki
1502718a66 thermal: trip: Add missing empty code line
Add missing empty line of code to thermal_zone_trip_id().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-30 13:04:56 +02:00
Rafael J. Wysocki
bd700ba9ad thermal/debugfs: Avoid printing zero duration for mitigation events in progress
If a thermal mitigation event is in progress, its duration value has
not been updated yet, so 0 will be printed as the event duration by
tze_seq_show() which is confusing.

Avoid doing that by marking the beginning of the event with the
KTIME_MIN duration value and making tze_seq_show() compute the current
event duration on the fly, in which case '>' will be printed instead of
'=' in the event duration value field.

Similarly, for trip points that have been crossed on the down, mark
the end of mitigation with the KTIME_MAX timestamp value and make
tze_seq_show() compute the current duration on the fly for the trip
points still involved in the mitigation, in which cases the duration
value printed by it will be prepended with a '>' character.

Fixes: 7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-26 15:01:56 +02:00
Rafael J. Wysocki
31a0fa0019 thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()
If cdev_dt_seq_show() runs before the first state transition of a cooling
device, it will not print any state residency information for it, even
though it might be reasonably expected to print residency information for
the initial state of the cooling device.

For this reason, rearrange the code to get the initial state of a cooling
device at the registration time and pass it to thermal_debug_cdev_add(),
so that the latter can create a duration record for that state which will
allow cdev_dt_seq_show() to print its residency information.

Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
Reported-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-26 15:01:56 +02:00
Rafael J. Wysocki
f4ae18fcb6 thermal/debugfs: Create records for cdev states as they get used
Because thermal_debug_cdev_state_update() only creates a duration record
for the old state of a cooling device, if its new state is used for the
first time, there will be no record for it and cdev_dt_seq_show() will
not print the duration information for it even though it contains code
to compute the duration value in that case.

Address this by making thermal_debug_cdev_state_update() create a
duration record for the new state if there is none.

Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
Reported-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-26 15:01:56 +02:00
Rafael J. Wysocki
8c882f172f Merge back earlier thermal core changes for v6.10. 2024-04-26 15:00:56 +02:00
Rafael J. Wysocki
d351eb0ab0 thermal/debugfs: Prevent use-after-free from occurring after cdev removal
Since thermal_debug_cdev_remove() does not run under cdev->lock, it can
run in parallel with thermal_debug_cdev_state_update() and it may free
the struct thermal_debugfs object used by the latter after it has been
checked against NULL.

If that happens, thermal_debug_cdev_state_update() will access memory
that has been freed already causing the kernel to crash.

Address this by using cdev->lock in thermal_debug_cdev_remove() around
the cdev->debugfs value check (in case the same cdev is removed at the
same time in two different threads) and its reset to NULL.

Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
Cc :6.8+ <stable@vger.kernel.org> # 6.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-26 14:57:50 +02:00
Rafael J. Wysocki
c7f7c37271 thermal/debugfs: Fix two locking issues with thermal zone debug
With the current thermal zone locking arrangement in the debugfs code,
user space can open the "mitigations" file for a thermal zone before
the zone's debugfs pointer is set which will result in a NULL pointer
dereference in tze_seq_start().

Moreover, thermal_debug_tz_remove() is not called under the thermal
zone lock, so it can run in parallel with the other functions accessing
the thermal zone's struct thermal_debugfs object.  Then, it may clear
tz->debugfs after one of those functions has checked it and the
struct thermal_debugfs object may be freed prematurely.

To address the first problem, pass a pointer to the thermal zone's
struct thermal_debugfs object to debugfs_create_file() in
thermal_debug_tz_add() and make tze_seq_start(), tze_seq_next(),
tze_seq_stop(), and tze_seq_show() retrieve it from s->private
instead of a pointer to the thermal zone object.  This will ensure
that tz_debugfs will be valid across the "mitigations" file accesses
until thermal_debugfs_remove_id() called by thermal_debug_tz_remove()
removes that file.

To address the second problem, use tz->lock in thermal_debug_tz_remove()
around the tz->debugfs value check (in case the same thermal zone is
removed at the same time in two different threads) and its reset to NULL.

Fixes: 7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
Cc :6.8+ <stable@vger.kernel.org> # 6.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-26 11:11:30 +02:00
Rafael J. Wysocki
72c1afffa4 thermal/debugfs: Free all thermal zone debug memory on zone removal
Because thermal_debug_tz_remove() does not free all memory allocated for
thermal zone diagnostics, some of that memory becomes unreachable after
freeing the thermal zone's struct thermal_debugfs object.

Address this by making thermal_debug_tz_remove() free all of the memory
in question.

Fixes: 7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
Cc :6.8+ <stable@vger.kernel.org> # 6.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-26 11:11:00 +02:00
Rafael J. Wysocki
f831892e23 thermal: core: Introduce thermal_governor_trip_crossed()
Add a wrapper around the .trip_crossed() governor callback invocation
to reduce code duplications slightly and improve the code layout in
__thermal_zone_device_update().

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-24 20:45:06 +02:00
Rafael J. Wysocki
a6258fde8d thermal/debugfs: Make tze_seq_show() skip invalid trips and trips with no stats
Currently, tze_seq_show() output includes all of the trips in the zone
except for critical ones, including invalid trips and trips with no stats
which is confusing.

Make it skip the trips for which there is not mitigation information.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 20:43:11 +02:00
Rafael J. Wysocki
8dff6e8438 thermal/debugfs: Rename thermal_debug_update_temp() to thermal_debug_update_trip_stats()
Rename thermal_debug_update_temp() to thermal_debug_update_trip_stats()
which is a better match for the purpose of the function.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 20:43:09 +02:00
Rafael J. Wysocki
e271f9974d thermal/debugfs: Clean up thermal_debug_update_temp()
Notice that it is not necessary to compute tze in every iteration of the
for () loop in thermal_debug_update_temp() because it is the same for all
trips, so compute it once before the loop starts.

Also use a trip_stats local variable to make the code in that loop easier
to follow and move the trip_id variable definition into that loop because
it is not used elsewhere in the function.

While at it, change to order of local variable definitions in the function
to follow the reverse-xmas-tree pattern.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 20:43:06 +02:00
Rafael J. Wysocki
0a293c7758 thermal/debugfs: Avoid excessive updates of trip point statistics
Since thermal_debug_update_temp() is called before invoking
thermal_debug_tz_trip_down() for the trips that were crossed by the
zone temperature on the way up, it updates the statistics for them
as though the current zone temperature was above the low temperature
of each of them.  However, if a given trip has just been crossed on the
way down, the zone temperature is in fact below its low temperature,
but this is handled by thermal_debug_tz_trip_down() running after the
update of the trip statistics.

The remedy is to call thermal_debug_update_temp() after
thermal_debug_tz_trip_down() has been invoked for all of the
trips in question, but then thermal_debug_tz_trip_up() needs to
be adjusted, so it does not update the statistics for the trips
that has just been crossed on the way up, as that will be taken
care of by thermal_debug_update_temp() down the road.

Modify the code accordingly.

Fixes: 7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 20:43:02 +02:00
Rafael J. Wysocki
2ae0998c67 thermal: core: Relocate critical and hot trip handling
Modify handle_thermal_trip() to call handle_critical_trips() only after
finding that the trip temperature has been crossed on the way up and
remove the redundant temperature check from the latter.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 20:42:39 +02:00
Rafael J. Wysocki
ad2f8bccd0 thermal: core: Drop the .throttle() governor callback
Since all of the governors in the tree have been switched over to using
the new callbacks, either .trip_crossed() or .manage(), the .throttle()
governor callback is not used any more, so drop it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 20:42:31 +02:00
Rafael J. Wysocki
c1beda1cfc thermal: gov_user_space: Use .trip_crossed() instead of .throttle()
Notifying user space about trip points that have not been crossed is
not particularly useful, so modify the User Space governor to use the
.trip_crossed() callback, which is only invoked for trips that have been
crossed, instead of .throttle() that is invoked for all trips in a
thermal zone every time the zone is updated.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 20:42:10 +02:00
Rafael J. Wysocki
c98e24795e thermal: gov_fair_share: Eliminate unnecessary integer divisions
The computations carried out by fair_share_throttle() for each trip
point include at least one redundant integer division which introduces
superfluous rounding errors.  Also the multiplications by 100 in it are
not really necessary and can be eliminated.

Rearrange fair_share_throttle() to carry out only one integer division per
trip and only as many integer multiplications as necessary and rename one
variable in it (while at it).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-24 10:15:08 +02:00
Rafael J. Wysocki
0292991ce4 thermal: gov_fair_share: Use trip thresholds instead of trip temperatures
In principle, the Fair Share governor should take trip hysteresis
into account.  After all, once a trip has been crossed on the way up,
mitigation is still needed until it is crossed on the way down.

For this reason, make it use trip thresholds that are computed by
the core when trips are crossed, so as to apply mitigations if the
zone temperature is in a hysteresis rage of one or more trips that
were crossed on the way up, but have not been crossed on the way
down yet.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-24 10:15:06 +02:00
Rafael J. Wysocki
bec55332c2 thermal: gov_fair_share: Use .manage() callback instead of .throttle()
The Fair Share governor tries very hard to be stateless and so it
calls get_trip_level() from fair_share_throttle() every time, even
though the number produced by this function for all of the trips
during a given thermal zone update is actually the same.  Since
get_trip_level() walks all of the trips in the thermal zone every
time it is called, doing this may generate quite a bit of completely
useless overhead.

For this reason, make the governor use the new .manage() callback
instead of .throttle() which allows it to call get_trip_level() just
once and use the value computed by it to handle all of the trips.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-24 10:15:04 +02:00
Rafael J. Wysocki
fe03626650 thermal: gov_step_wise: Clean up thermal_zone_trip_update()
Do some assorted cleanups in thermal_zone_trip_update():

 * Compute the trend value upfront.
 * Move old_target definition to the block where it is used.
 * Adjust white space around diagnostic messages and locking.
 * Use suitable field formatting in a message to avoid an explicit
   cast to int.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-24 10:14:58 +02:00
Rafael J. Wysocki
e4065f144f thermal: gov_step_wise: Use trip thresholds instead of trip temperatures
In principle, the Step-Wise governor should take trip hysteresis into
account.  After all, once a trip has been crossed on the way up,
mitigation is still needed until it is crossed on the way down.

For this reason, make it use trip thresholds that are computed by
the core when trips are crossed, so as to apply mitigations in the
hysteresis rages of trips that were crossed on the way up, but have
not been crossed on the way down yet.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 10:14:33 +02:00
Rafael J. Wysocki
a6ce8c7da5 thermal: gov_step_wise: Use .manage() callback instead of .throttle()
Make the Step-Wise governor use the new .manage() callback instead of
.throttle().

Even though using .throttle() is not particularly problematic for the
Step-Wise governor, using .manage() instead still allows it to reduce
overhead by updating all of the cooling devices once after setting
target values for all of the thermal instances.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 10:13:53 +02:00
Rafael J. Wysocki
ca0e9728d3 thermal: gov_power_allocator: Eliminate a redundant variable
Notice that the passive field in struct thermal_zone_device is not
used by the Power Allocator governor itself and so the ordering of
its updates with respect to allow_maximum_power() or allocate_power()
does not matter.

Accordingly, make power_allocator_manage() update that field right
before returning, which allows the current value of it to be passed
directly to allow_maximum_power() without using the additional update
variable that can be dropped.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-23 20:39:50 +02:00
Rafael J. Wysocki
41ddbcc6fd thermal: gov_power_allocator: Use .manage() callback instead of .throttle()
The Power Allocator governor really only wants to be called once per
thermal zone update and it does a special check to skip the extra,
from its perspective, invocations of the .throttle() callback.

Make it use .manage() instead of .throttle().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-23 20:39:50 +02:00
Rafael J. Wysocki
976f44133f thermal: core: Introduce .manage() callback for thermal governors
Introduce a new thermal governor callback called .manage() that will be
invoked once per thermal zone update after processing all of the trip
points in the core.

This will allow governors that look at multiple trip points together
to check all of them in a consistent configuration, so they don't need
to play tricks with skipping .throttle() invocations that they are not
interested in and they can avoid carrying out the same computations for
multiple times in one cycle.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-23 20:38:41 +02:00
Rafael J. Wysocki
0ae204a667 thermal: gov_bang_bang: Fold thermal_zone_trip_update() into its caller
Fold thermal_zone_trip_update() into bang_bang_control() which is the
only caller of it to reduce code size and make it easier to follow.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-23 20:38:26 +02:00
Rafael J. Wysocki
4526c58109 thermal: gov_bang_bang: Clean up thermal_zone_trip_update()
Do the following cleanups in thermal_zone_trip_update():

 * Drop the useless "zero hysteresis" message.
 * Eliminate the trip_index local variable that is redundant.
 * Drop 2 comments that are not useful.
 * Downgrade a diagnostic message from pr_warn() to pr_debug().
 * Use consistent field formatting in diagnostic messages.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-23 20:38:06 +02:00
Rafael J. Wysocki
530c932bdf thermal: gov_bang_bang: Use .trip_crossed() instead of .throttle()
The Bang-Bang governor really is only concerned about trip point
crossing, so it can use the new .trip_crossed() callback instead of
.throttle() that is not particularly suitable for it.

Modify it to do so which also takes trip hysteresis into account, so the
governor does not need to use it directly any more.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-23 20:37:41 +02:00
Binbin Zhou
734b5def91 thermal/drivers/loongson2: Add Loongson-2K2000 support
The Loongson-2K2000 and Loongson-2K1000 have similar thermal sensors,
except that the temperature is read differently.

In particular, the temperature output registers of the Loongson-2K2000
are defined in the chip configuration domain and are read in a different
way.

Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Acked-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/fdbfdcc3231a36a4ee0bcf1377ddcbd6f8c944b5.1713837379.git.zhoubinbin@loongson.cn
2024-04-23 12:40:30 +02:00
Binbin Zhou
6ef6d5ff5e thermal/drivers/loongson2: Trivial code style adjustment
Here are some minor code style adjustment. Such as fix whitespace code
style; align function call arguments to opening parenthesis.

Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Acked-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/ccca50f2ad3fd8c84fcbfcb1f875427ea7f637a0.1713837379.git.zhoubinbin@loongson.cn
2024-04-23 12:40:30 +02:00
Nicolas Pitre
f4745f546e thermal/drivers/mediatek/lvts_thermal: Add MT8188 support
Various values extracted from the vendor's kernel driver.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-15-nico@fluxnic.net
2024-04-23 12:40:30 +02:00
Nicolas Pitre
11e6f4c314 thermal/drivers/mediatek/lvts_thermal: Allow early empty sensor slots
Some systems don't always populate sensor controller slots starting
at slot 0. Use a bitmap instead of a count to indicate valid sensor
slots. Also create a pretty iterator for that.

About that iterator: it causes checkpatch to complain with "ERROR:
Macros with multiple statements should be enclosed in a do - while
loop". However this is not possible here. And many similar iterators
do exist using the same form in the tree already.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-12-nico@fluxnic.net
2024-04-23 12:40:30 +02:00
Nicolas Pitre
684cbb49f9 thermal/drivers/mediatek/lvts_thermal: Provision for gt variable location
The golden temperature calibration value in nvram is not always the
3rd byte. A future commit will prove this assumption wrong.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-11-nico@fluxnic.net
2024-04-23 12:40:30 +02:00
Nicolas Pitre
a4c1ab2f4c thermal/drivers/mediatek/lvts_thermal: Add MT8186 support
Various values extracted from the vendor's kernel driver.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-9-nico@fluxnic.net
2024-04-23 12:40:30 +02:00
Nicolas Pitre
2cc0b1a216 thermal/drivers/mediatek/lvts_thermal: Guard against efuse data buffer overflow
We don't want to silently fetch garbage past the actual buffer.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-6-nico@fluxnic.net
2024-04-23 12:40:29 +02:00
Nicolas Pitre
5b3367e28a thermal/drivers/mediatek/lvts_thermal: Use offsets for every calibration byte
Current code assumes calibration values are always stored contiguously
in host endian order. A future patch will prove this wrong.

Let's specify the offset for each calibration byte instead.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-5-nico@fluxnic.net
2024-04-23 12:40:29 +02:00
Nicolas Pitre
554bca3130 thermal/drivers/mediatek/lvts_thermal: Remove .hw_tshut_temp
All the .hw_tshut_temp instances are initialized with the same value.
Let's remove those and use a common definition instead. If ever a
different value must be used in the future then an override parameter
could be added back.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-4-nico@fluxnic.net
2024-04-23 12:40:29 +02:00
Nicolas Pitre
62194e637d thermal/drivers/mediatek/lvts_thermal: Move comment
Move efuse data interpretation inside lvts_golden_temp_init() alongside
the actual code retrieving wanted value.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-3-nico@fluxnic.net
2024-04-23 12:40:29 +02:00
Nicolas Pitre
8c25958f71 thermal/drivers/mediatek/lvts_thermal: Retrieve all calibration bytes
Calibration values are 24-bit wide. Those values so far appear to span
only 16 bits but let's not push our luck.

Found while looking at the original Mediatek driver code.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-2-nico@fluxnic.net
2024-04-23 12:40:29 +02:00
Christophe JAILLET
61fad0a906 thermal/drivers/k3_bandgap: Remove some unused fields in struct k3_bandgap
In "struct k3_bandgap", the 'conf' field is unused.
Remove it.

Found with cppcheck, unusedStructMember.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/206d39bed9ec6df9b4d80b1fc064e499389fc7fc.1712687420.git.christophe.jaillet@wanadoo.fr
2024-04-23 12:40:29 +02:00