IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
commit 0a5c26712f upstream.
When device_register() return failed, program will goto out_kfree_type
to release 'cdev->device' by put_device(). That will call thermal_release()
to free 'cdev'. But the follow-up processes access 'cdev' continually.
That trggers the UAF bug.
====================================================================
BUG: KASAN: use-after-free in __thermal_cooling_device_register+0x75b/0xa90
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
dump_stack_lvl+0xe2/0x152
print_address_description.constprop.0+0x21/0x140
? __thermal_cooling_device_register+0x75b/0xa90
kasan_report.cold+0x7f/0x11b
? __thermal_cooling_device_register+0x75b/0xa90
__thermal_cooling_device_register+0x75b/0xa90
? memset+0x20/0x40
? __sanitizer_cov_trace_pc+0x1d/0x50
? __devres_alloc_node+0x130/0x180
devm_thermal_of_cooling_device_register+0x67/0xf0
max6650_probe.cold+0x557/0x6aa
......
Freed by task 258:
kasan_save_stack+0x1b/0x40
kasan_set_track+0x1c/0x30
kasan_set_free_info+0x20/0x30
__kasan_slab_free+0x109/0x140
kfree+0x117/0x4c0
thermal_release+0xa0/0x110
device_release+0xa7/0x240
kobject_put+0x1ce/0x540
put_device+0x20/0x30
__thermal_cooling_device_register+0x731/0xa90
devm_thermal_of_cooling_device_register+0x67/0xf0
max6650_probe.cold+0x557/0x6aa [max6650]
Do not use 'cdev' again after put_device() to fix the problem like doing
in thermal_zone_device_register().
[dlezcano]: as requested by Rafael, change the affectation into two statements.
Fixes: 5848376181 ("thermal/drivers/core: Use a char pointer for the cooling device name")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/20211015024504.947520-1-william.xuanziyang@huawei.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5848376181 ]
We want to have any kind of name for the cooling devices as we do no
longer want to rely on auto-numbering. Let's replace the cooling
device's fixed array by a char pointer to be allocated dynamically
when registering the cooling device, so we don't limit the length of
the name.
Rework the error path at the same time as we have to rollback the
allocations in case of error.
Tested with a dummy device having the name:
"Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch"
A village on the island of Anglesey (Wales), known to have the longest
name in Europe.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20210314111333.16551-1-daniel.lezcano@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 99b63316c3 ]
During the suspend is in process, thermal_zone_device_update bails out
thermal zone re-evaluation for any sensor trip violation without
setting next valid trip to that sensor. It assumes during resume
it will re-evaluate same thermal zone and update trip. But when it is
in suspend temperature goes down and on resume path while updating
thermal zone if temperature is less than previously violated trip,
thermal zone set trip function evaluates the same previous high and
previous low trip as new high and low trip. Since there is no change
in high/low trip, it bails out from thermal zone set trip API without
setting any trip. It leads to a case where sensor high trip or low
trip is disabled forever even though thermal zone has a valid high
or low trip.
During thermal zone device init, reset thermal zone previous high
and low trip. It resolves above mentioned scenario.
Signed-off-by: Manaf Meethalavalappu Pallikunhi <manafm@codeaurora.org>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1bb30b20b4 ]
After printing the list of thermal governors, then this function prints
a newline character. The problem is that "size" has not been updated
after printing the last governor. This means that it can write one
character (the NUL terminator) beyond the end of the buffer.
Get rid of the "size" variable and just use "PAGE_SIZE - count" directly.
Fixes: 1b4f48494e ("thermal: core: group functions related to governor handling")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210916131342.GB25094@kili
Signed-off-by: Sasha Levin <sashal@kernel.org>
1. devfreq_cooling.c: The variable *tz is not used in
devfreq_cooling_get_requested_power(), devfreq_cooling_state2power()
and devfreq_cooling_power2state().
2. cpufreq_cooling.c: After 84fe2cab48, the variable *tz is not used
anymore in cpufreq_get_requested_power(), cpufreq_state2power() and
cpufreq_power2state().
Remove the variable *tz.
Signed-off-by: zhuguangqing <zhuguangqing@xiaomi.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200914071101.13575-1-zhuguangqing83@gmail.com
The user-after-free bug in thermal_zone_device_unregister() is reported by
KASAN. It happens because struct thermal_zone_device is released during of
device_unregister() invocation, and hence the "tz" variable shouldn't be
touched by thermal_notify_tz_delete(tz->id).
Fixes: 55cdf0a283 ("thermal: core: Add notifications call in the framework")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200817235854.26816-1-digetx@gmail.com
When a thermal zone is looked up by an ID and no zone is found matching
that ID, the thermal_zone_get_by_id() function will return a pointer to
the thermal zone list head which isn't actually a valid thermal zone.
This can lead to a subsequent crash because a valid pointer is returned
to the called, but dereferencing that pointer as struct thermal_zone is
not safe.
Fixes: 329b064fbd ("thermal: core: Get thermal zone by id")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200724170105.2705467-1-thierry.reding@gmail.com
The generic netlink is initialized at subsys_initcall, so far after
the thermal init routine and the thermal generic netlink family
initialization.
On ŝome platforms, that leads to a memory corruption.
The fix was sent to netdev@ to move the genetlink framework
initialization at core_initcall.
Move the thermal core initialization to postcore level which is very
close to core level.
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Link: https://lore.kernel.org/r/20200717164217.18819-2-daniel.lezcano@linaro.org
The initcalls like to play joke. In our case, the thermal-netlink
initcall is called after the thermal-core initcall but this one sends
a notification before the former is initialized. No issue was spotted,
but it could lead to a memory corruption, so instead of relying on the
core_initcall for the thermal-netlink, let's initialize directly from
the thermal-core init routine, so we have full control of the init
ordering.
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Link: https://lore.kernel.org/r/20200717164217.18819-1-daniel.lezcano@linaro.org
The next patch will introduce the generic netlink protocol to handle
events, sampling and command from the thermal framework. In order to
deal with the thermal zone, it uses its unique identifier to
characterize it in the message. Passing an integer is more efficient
than passing an entire string.
This change provides a function returning back a thermal zone pointer
corresponding to the identifier passed as parameter.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20200706105538.2159-2-daniel.lezcano@linaro.org
The cdev, tz and governor list, as well as their respective locks are
statically defined in the thermal_core.c file.
In order to give a sane access to these list, like browsing all the
thermal zones or all the cooling devices, let's define a set of
helpers where we pass a callback as a parameter to be called for each
thermal entity.
We keep the self-encapsulation and ensure the locks are correctly
taken when looking at the list.
Acked-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200706105538.2159-1-daniel.lezcano@linaro.org
set_mode() is only called when tzd's mode is about to change. Actual
setting is performed in thermal_core, in thermal_zone_device_set_mode().
The meaning of set_mode() callback is actually to notify the driver about
the mode being changed and giving the driver a chance to oppose such
change.
To better reflect the purpose of the method rename it to change_mode()
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
[for acerhdf]
Acked-by: Peter Kaestle <peter@piie.net>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200629122925.21729-12-andrzej.p@collabora.com
Use thermal_zone_device_{en|dis}able() and thermal_zone_device_is_enabled().
Consequently, all set_mode() implementations in drivers:
- can stop modifying tzd's "mode" member,
- shall stop taking tzd's lock, as it is taken in the helpers
- shall stop calling thermal_zone_device_update() as it is called in the
helpers
- can assume they are called when the mode truly changes, so checks to
verify that can be dropped
Not providing set_mode() by a driver no longer prevents the core from
being able to set tzd's mode, so the relevant check in mode_store() is
removed.
Other comments:
- acpi/thermal.c: tz->thermal_zone->mode will be updated only after we
return from set_mode(), so use function parameter in thermal_set_mode()
instead, no need to call acpi_thermal_check() in set_mode()
- thermal/imx_thermal.c: regmap writes and mode assignment are done in
thermal_zone_device_{en|dis}able() and set_mode() callback
- thermal/intel/intel_quark_dts_thermal.c: soc_dts_{en|dis}able() are a
part of set_mode() callback, so they don't need to modify tzd->mode, and
don't need to fall back to the opposite mode if unsuccessful, as the return
value will be propagated to thermal_zone_device_{en|dis}able() and
ultimately tzd's member will not be changed in thermal_zone_device_set_mode().
- thermal/of-thermal.c: no need to set zone->mode to DISABLED in
of_parse_thermal_zones() as a tzd is kzalloc'ed so mode is DISABLED anyway
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
[for acerhdf]
Acked-by: Peter Kaestle <peter@piie.net>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200629122925.21729-8-andrzej.p@collabora.com
When registering a thermal zone device, we currently return -EINVAL in
four cases. This makes it a little hard to debug the real cause of the
failure.
Print some error messages to make it easier for developer to figure out
what happened.
Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Never directly free @dev after calling device_register(), even if it
returned an error! Always use put_device() to give up the reference
initialized. Clean up the rollback block also.
Signed-off-by: Yue Hu <huyue2@yulong.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Now that the governor table is in place and the macro allows to browse the
table, declare the governor so the entry is added in the governor table
in the init section.
The [un]register_thermal_governors function does no longer need to use the
exported [un]register thermal governor's specific function which in turn
call the [un]register_thermal_governor. The governors are fully
self-encapsulated.
The cyclic dependency is no longer needed, remove it.
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Pull thermal management updates from Zhang Rui:
- Remove the 'module' Kconfig option for thermal subsystem framework
because the thermal framework are required to be ready as early as
possible to avoid overheat at boot time (Daniel Lezcano)
- Fix a bug that thermal framework pokes disabled thermal zones upon
resume (Wei Wang)
- A couple of cleanups and trivial fixes on int340x thermal drivers
(Srinivas Pandruvada, Zhang Rui, Sumeet Pawnikar)
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
drivers: thermal: processor_thermal: Downgrade error message
mlxsw: Remove obsolete dependency on THERMAL=m
hwmon/drivers/core: Simplify complex dependency
thermal/drivers/core: Fix typo in the option name
thermal/drivers/core: Remove depends on THERMAL in Kconfig
thermal/drivers/core: Remove module unload code
thermal/drivers/core: Remove the module Kconfig's option
thermal: core: skip update disabled thermal zones after suspend
thermal: make device_register's type argument const
thermal: intel: int340x: processor_thermal_device: simplify to get driver data
thermal/int3403_thermal: favor _TMP instead of PTYP
thermal_of_cooling_device_register() and thermal_cooling_device_register()
are typically called from driver probe functions, and
thermal_cooling_device_unregister() is called from remove functions. This
makes both a perfect candidate for device managed functions.
Introduce devm_thermal_of_cooling_device_register(). This function can
also be used to replace thermal_cooling_device_register() by passing a NULL
pointer as device node. The new function requires both struct device *
and struct device_node * as parameters since the struct device_node *
parameter is not always identical to dev->of_node.
Don't introduce a device managed remove function since it is not needed
at this point.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Now the thermal core is no longer compiled as a module. Remove the
unloading module code and move the unregister function to the __init
section.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
It is unnecessary to update disabled thermal zones post suspend and
sometimes leads error/warning in bad behaved thermal drivers.
Signed-off-by: Wei Wang <wvw@google.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
...because it can be, the buffer is strlcpy'd into a local buffer in a
thermal struct member.
Signed-off-by: Jean-Francois Dagenais <jeff.dagenais@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
commit ff140fea84 ("Thermal: handle thermal zone device properly
during system sleep") added PM hook to call thermal zone reset during
sleep. However resetting thermal zone will also clear the passive state
and thus cancel the polling queue which leads the passive cooling device
state not being cleared properly after sleep.
thermal_pm_notify => thermal_zone_device_reset set passive to 0
thermal_zone_trip_update will skip update passive as `old_target ==
instance->target'.
monitor_thermal_zone => thermal_zone_device_set_polling will cancel
tz->poll_queue, so the cooling device state will not be changed
afterwards.
Reported-by: Kame Wang <kamewang@google.com>
Signed-off-by: Wei Wang <wvw@google.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
For SMP systems, thermal worker should use power_efficient_wq in power
saving mode, that will make scheduler more flexible on selecting an active
core for running work handler to avoid keeping work handler always
running on a single core, that will save some power.
Even if 'power_efficient_wq' relevant configs are disabled
'system_freezable_power_efficient_wq' is identical to system_freezable_wq,
behavior is unchanged.
Signed-off-by: Jeson Gao <jeson.gao@unisoc.com>
Signed-off-by: Chunyan Zhang <chunyan.zhang@unisoc.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
This patch fixes use-after-free that was detected by KASAN. The bug is
triggered on a CPUFreq driver module unload by freeing 'cdev' on device
unregister and then using the freed structure during of the cdev's sysfs
data destruction. The solution is to unregister the sysfs at first, then
destroy sysfs data and finally release the cooling device.
Cc: <stable@vger.kernel.org> # v4.17+
Fixes: 8ea229511e ("thermal: Add cooling device's statistics in sysfs")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
The naming isn't consistent across all sysfs callbacks in the thermal
core, some have a short name like type_show() and others have long names
like thermal_cooling_device_weight_show(). This patch tries to make it
consistent by shortening the name of sysfs callbacks.
Some of the sysfs files are named similarly for both thermal zone and
cooling device (like: type) and to avoid name clash between their
show/store routines, the cooling device specific sysfs callbacks are
prefixed with "cdev_".
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
This extends the sysfs interface for thermal cooling devices and exposes
some pretty useful statistics. These statistics have proven to be quite
useful specially while doing benchmarks related to the task scheduler,
where we want to make sure that nothing has disrupted the test,
specially the cooling device which may have put constraints on the CPUs.
The information exposed here tells us to what extent the CPUs were
constrained by the thermal framework.
The write-only "reset" file is used to reset the statistics.
The read-only "time_in_state_ms" file shows the time (in msec) spent by the
device in the respective cooling states, and it prints one line per
cooling state.
The read-only "total_trans" file shows single positive integer value
showing the total number of cooling state transitions the device has
gone through since the time the cooling device is registered or the time
when statistics were reset last.
The read-only "trans_table" file shows a two dimensional matrix, where
an entry <i,j> (row i, column j) represents the number of transitions
from State_i to State_j.
This is how the directory structure looks like for a single cooling
device:
$ ls -R /sys/class/thermal/cooling_device0/
/sys/class/thermal/cooling_device0/:
cur_state max_state power stats subsystem type uevent
/sys/class/thermal/cooling_device0/power:
autosuspend_delay_ms runtime_active_time runtime_suspended_time
control runtime_status
/sys/class/thermal/cooling_device0/stats:
reset time_in_state_ms total_trans trans_table
This is tested on ARM 64-bit Hisilicon hikey620 board running Ubuntu and
ARM 64-bit Hisilicon hikey960 board running Android.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reorder error handling code in order to fix some resources leaks in some
cases:
- 'tz' would leak if 'thermal_zone_create_device_groups()' fails
- memory allocated by 'thermal_zone_create_device_groups()' would leak
if 'device_register()' fails
With this patch, we now have 2 error handling paths: one before
'device_register()', and one after it.
This is needed because some resources are released in 'thermal_release()'.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Simplify code by using the new 'thermal_zone_destroy_device_groups()'
helper function.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
The critical shutdown notice string used to have some spaces missing,
which makes it not so pretty.
Add the spaces to satisfy usual English space rules.
Reported-by: Mingcong Bai <jeffbai@aosc.io>
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Making thermal_emergency_poweroff static fixes sparse warning:
drivers/thermal/thermal_core.c:6: warning: symbol
'thermal_emergency_poweroff' was not declared. Should it be static?
Fixes: ef1d87e06a ("thermal: core: Add a back up thermal shutdown mechanism")
Acked-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>