IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
commit 02d438f62c05f0d055ceeedf12a2f8796b258c08 upstream.
This error path return success but it should propagate the negative
error code from devm_clk_get().
Fixes: 6c247393cfdd ("thermal: exynos: Add TMU support for Exynos7 SoC")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210810084413.GA23810@kili
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a052b5118f13febac1bd901fe0b7a807b9d6b51c ]
Fix the following make W=1 kernel build warning:
drivers/thermal/thermal_core.c:1376: warning: expecting prototype for thermal_device_unregister(). Prototype was for thermal_zone_device_unregister() instead
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210517051020.3463536-1-yangyingliang@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit fef05776eb02238dcad8d5514e666a42572c3f32 upstream.
The tz->lock must be hold during the looping over the instances in that
thermal zone. This lock was missing in the governor code since the
beginning, so it's hard to point into a particular commit.
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210422153624.6074-2-lukasz.luba@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 30d24faba0532d6972df79a1bf060601994b5873 ]
We can sometimes get bogus thermal shutdowns on omap4430 at least with
droid4 running idle with a battery charger connected:
thermal thermal_zone0: critical temperature reached (143 C), shutting down
Dumping out the register values shows we can occasionally get a 0x7f value
that is outside the TRM listed values in the ADC conversion table. And then
we get a normal value when reading again after that. Reading the register
multiple times does not seem help avoiding the bogus values as they stay
until the next sample is ready.
Looking at the TRM chapter "18.4.10.2.3 ADC Codes Versus Temperature", we
should have values from 13 to 107 listed with a total of 95 values. But
looking at the omap4430_adc_to_temp array, the values are off, and the
end values are missing. And it seems that the 4430 ADC table is similar
to omap3630 rather than omap4460.
Let's fix the issue by using values based on the omap3630 table and just
ignoring invalid values. Compared to the 4430 TRM, the omap3630 table has
the missing values added while the TRM table only shows every second
value.
Note that sometimes the ADC register values within the valid table can
also be way off for about 1 out of 10 values. But it seems that those
just show about 25 C too low values rather than too high values. So those
do not cause a bogus thermal shutdown.
Fixes: 1a31270e54d7 ("staging: omap-thermal: add OMAP4 data structures")
Cc: Merlijn Wajer <merlijn@wizzup.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200706183338.25622-1-tony@atomide.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bf45ac18b78038e43af3c1a273cae4ab5704d2ce ]
The CPU load values passed to the thermal_power_cpu_get_power
tracepoint are zero for all CPUs, unless, unless the
thermal_power_cpu_limit tracepoint is enabled too:
irq/41-rockchip-98 [000] .... 290.972410: thermal_power_cpu_get_power:
cpus=0000000f freq=1800000 load={{0x0,0x0,0x0,0x0}} dynamic_power=4815
vs
irq/41-rockchip-96 [000] .... 95.773585: thermal_power_cpu_get_power:
cpus=0000000f freq=1800000 load={{0x56,0x64,0x64,0x5e}} dynamic_power=4959
irq/41-rockchip-96 [000] .... 95.773596: thermal_power_cpu_limit:
cpus=0000000f freq=408000 cdev_state=10 power=416
There seems to be no good reason for omitting the CPU load information
depending on another tracepoint. My guess is that the intention was to
check whether thermal_power_cpu_get_power is (still) enabled, however
'load_cpu != NULL' already indicates that it was at least enabled when
cpufreq_get_requested_power() was entered, there seems little gain
from omitting the assignment if the tracepoint was just disabled, so
just remove the check.
Fixes: 6828a4711f99 ("thermal: add trace events to the power allocator governor")
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Javi Merino <javi.merino@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit eb9aecd90d1a39601e91cd08b90d5fee51d321a6 ]
The index of msr and adcpnp should match the sensor
which belongs to the selected bank in the for loop.
Fixes: b7cf0053738c ("thermal: Add Mediatek thermal driver for mt2701.")
Signed-off-by: Michael Kao <michael.kao@mediatek.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3a31386217628ffe2491695be2db933c25dde785 ]
On r8a7791/koelsch, sometimes the following message is printed during
system suspend:
rcar_thermal e61f0000.thermal: thermal sensor was broken
This happens if the workqueue runs while the device is already
suspended. Fix this by using the freezable system workqueue instead,
cfr. commit 51e20d0e3a60cf46 ("thermal: Prevent polling from happening
during system suspend").
Fixes: e0a5172e9eec7f0d ("thermal: rcar: add interrupt support")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fc7d18cf6a923cde7f5e7ba2c1105bb106d3e29a ]
We print a calibration failure message on -EPROBE_DEFER from
nvmem/qfprom as follows:
[ 3.003090] qcom-tsens 4a9000.thermal-sensor: version: 1.4
[ 3.005376] qcom-tsens 4a9000.thermal-sensor: tsens calibration failed
[ 3.113248] qcom-tsens 4a9000.thermal-sensor: version: 1.4
This confuses people when, in fact, calibration succeeds later when
nvmem/qfprom device is available. Don't print this message on a
-EPROBE_DEFER.
Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit f2c4db1bd80720cd8cb2a5aa220d9bc9f374f04e upstream.
Going primarily by:
https://en.wikipedia.org/wiki/List_of_Intel_Atom_microprocessors
with additional information gleaned from other related pages; notably:
- Bonnell shrink was called Saltwell
- Moorefield is the Merriefield refresh which makes it Airmont
The general naming scheme is: FAM6_ATOM_UARCH_SOCTYPE
for i in `git grep -l FAM6_ATOM` ; do
sed -i -e 's/ATOM_PINEVIEW/ATOM_BONNELL/g' \
-e 's/ATOM_LINCROFT/ATOM_BONNELL_MID/' \
-e 's/ATOM_PENWELL/ATOM_SALTWELL_MID/g' \
-e 's/ATOM_CLOVERVIEW/ATOM_SALTWELL_TABLET/g' \
-e 's/ATOM_CEDARVIEW/ATOM_SALTWELL/g' \
-e 's/ATOM_SILVERMONT1/ATOM_SILVERMONT/g' \
-e 's/ATOM_SILVERMONT2/ATOM_SILVERMONT_X/g' \
-e 's/ATOM_MERRIFIELD/ATOM_SILVERMONT_MID/g' \
-e 's/ATOM_MOOREFIELD/ATOM_AIRMONT_MID/g' \
-e 's/ATOM_DENVERTON/ATOM_GOLDMONT_X/g' \
-e 's/ATOM_GEMINI_LAKE/ATOM_GOLDMONT_PLUS/g' ${i}
done
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: dave.hansen@linux.intel.com
Cc: len.brown@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[bwh: Backported to 4.9:
- Drop changes to CPU IDs that weren't already included
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 396ee4d0cd52c13b3f6421b8d324d65da5e7e409 ]
int3400 only pushes the UUID into the firmware when the mode is flipped
to "enable". The current code only exposes the mode flag if the firmware
supports the PASSIVE_1 UUID, which not all machines do. Remove the
restriction.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 129699bb8c7572106b5bbb2407c2daee4727ccad ]
Changes since V1:
* Use dev_info instead of printk
* Use dev_warn instead of BUG_ON
Previously, sysfs_create_group was called before all initialization had
fully run - specifically, before pci_set_drvdata was called. Since the
sysctl group is visible to userspace as soon as sysfs_create_group
returns, a small window of time existed during which a process could read
from an uninitialized/partially-initialized device.
This commit moves the creation of the sysctl group to after all
initialized is completed. This ensures that it's impossible for
userspace to read from a sysctl file before initialization has fully
completed.
To catch any future regressions, I've added a check to ensure
that proc_thermal_emum_mode is never PROC_THERMAL_NONE when a process
tries to read from a sysctl file. Previously, the aforementioned race
condition could result in the 'else' branch
running while PROC_THERMAL_NONE was set,
leading to a null pointer deference.
Signed-off-by: Aaron Hill <aa1ronham@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 03334ba8b425b2ad275c8f390cf83c7b081c3095 upstream.
Avoid warnings like this:
thermal_hwmon.h:29:1: warning: ‘thermal_remove_hwmon_sysfs’ defined but not used [-Wunused-function]
thermal_remove_hwmon_sysfs(struct thermal_zone_device *tz)
Fixes: 0dd88793aacd ("thermal: hwmon: move hwmon support to single file")
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9d216211fded20fff301d0317af3238d8383634c ]
First correct the edge case to return the last element if we're
outside the range, rather than at the last element, so that
interpolation is not omitted for points between the two last entries in
the table.
Then correct the formula to perform linear interpolation based the two
points surrounding the read ADC value. The indices for temp are kept as
"hi" and "lo" to pair with the adc indices, but there's no requirement
that the temperature is provided in descendent order. mult_frac() is
used to prevent issues with overflowing the int.
Cc: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 964f4843a455d2ffb199512b08be8d5f077c4cac ]
commit ff140fea847e ("Thermal: handle thermal zone device properly
during system sleep") added PM hook to call thermal zone reset during
sleep. However resetting thermal zone will also clear the passive state
and thus cancel the polling queue which leads the passive cooling device
state not being cleared properly after sleep.
thermal_pm_notify => thermal_zone_device_reset set passive to 0
thermal_zone_trip_update will skip update passive as `old_target ==
instance->target'.
monitor_thermal_zone => thermal_zone_device_set_polling will cancel
tz->poll_queue, so the cooling device state will not be changed
afterwards.
Reported-by: Kame Wang <kamewang@google.com>
Signed-off-by: Wei Wang <wvw@google.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 152395fd03d4ce1e535a75cdbf58105e50587611 ]
When thermal zone is in passive mode, disabling its mode from
sysfs is NOT taking effect at all, it is still polling the
temperature of the disabled thermal zone and handling all thermal
trips, it makes user confused. The disabling operation should
disable the thermal zone behavior completely, for both active and
passive mode, this patch clears the passive_delay when thermal
zone is disabled and restores it when it is enabled.
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c8da6cdef57b459ac0fd5d9d348f8460a575ae90 upstream.
tmu_read() in case of Exynos4210 might return error for out of bound
values. Current code ignores such value, what leads to reporting critical
temperature value. Add proper error code propagation to exynos_get_temp()
function.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
CC: stable@vger.kernel.org # v4.6+
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 88fc6f73fddf64eb507b04f7b2bd01d7291db514 upstream.
When thermal sensor is not yet enabled, reading temperature might return
random value. This might even result in stopping system booting when such
temperature is higher than the critical value. Fix this by checking if TMU
has been actually enabled before reading the temperature.
This change fixes booting of Exynos4210-based board with TMU enabled (for
example Samsung Trats board), which was broken since v4.4 kernel release.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: 9e4249b40340 ("thermal: exynos: Fix first temperature read after registering sensor")
CC: stable@vger.kernel.org # v4.6+
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cf1ba1d73a33944d8c1a75370a35434bf146b8a7 upstream.
When device boots with T > T_trip_1 and requests interrupt,
the race condition takes place. The interrupt comes before
THERMAL_DEVICE_ENABLED is set. This leads to an attempt to
reading sensor value from irq and disabling the sensor, based on
the data->mode field, which expected to be THERMAL_DEVICE_ENABLED,
but still stays as THERMAL_DEVICE_DISABLED. Afher this issue
sensor is never re-enabled, as the driver state is wrong.
Fix this problem by setting the 'data' members prior to
requesting the interrupts.
Fixes: 37713a1e8e4c ("thermal: imx: implement thermal alarm interrupt handling")
Cc: <stable@vger.kernel.org>
Signed-off-by: Mikhail Lappo <mikhail.lappo@esrlabs.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Acked-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 68fd77cf8a4b045594231f07e5fc92e1a34c0a9e upstream.
We get a Kconfig warning when selecting this without also enabling
CONFIG_PCI:
warning: (X86_INTEL_LPSS && INTEL_SOC_DTS_IOSF_CORE
&& SND_SST_IPC_ACPI && MMC_SDHCI_ACPI && PUNIT_ATOM_DEBUG)
selects IOSF_MBI which has unmet direct dependencies (PCI)
This adds a new depedency.
Fixes: 3a2419f865a6 ("Thermal: Intel SoC: DTS thermal use common APIs")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit db2b0332608c8e648ea1e44727d36ad37cdb56cb upstream.
The DT specifies a threshold of 65000, we setup the register with a value in
the temperature resolution for the controller, 64656.
When we reach 64656, the interrupt fires, the interrupt is disabled. Then the
irq thread runs and calls thermal_zone_device_update() which will call in turn
hisi_thermal_get_temp().
The function will look if the temperature decreased, assuming it was more than
65000, but that is not the case because the current temperature is 64656
(because of the rounding when setting the threshold). This condition being
true, we re-enable the interrupt which fires immediately after exiting the irq
thread. That happens again and again until the temperature goes to more than
65000.
Potentially, there is here an interrupt storm if the temperature stabilizes at
this temperature. A very unlikely case but possible.
In any case, it does not make sense to handle dozens of alarm interrupt for
nothing.
Fix this by rounding the threshold value to the controller resolution so the
check against the threshold is consistent with the one set in the controller.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 48880b979cdc9ef5a70af020f42b8ba1e51dbd34 upstream.
The step and the base temperature are fixed values, we can simplify the
computation by converting the base temperature to milli celsius and use a
pre-computed step value. That saves us a lot of mult + div for nothing at
runtime.
Take also the opportunity to change the function names to be consistent with
the rest of the code.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2cb4de785c40d4a2132cfc13e63828f5a28c3351 upstream.
The threaded interrupt for the alarm interrupt is requested before the
temperature controller is setup. This one can fire an interrupt immediately
leading to a kernel panic as the sensor data is not initialized.
In order to prevent that, move the threaded irq after the Tsensor is setup.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c176b10b025acee4dc8f2ab1cd64eb73b5ccef53 upstream.
The interrupt for the temperature threshold is not enabled at the end of the
probe function, enable it after the setup is complete.
On the other side, the irq_enabled is not correctly set as we are checking if
the interrupt is masked where 'yes' means irq_enabled=false.
irq_get_irqchip_state(data->irq, IRQCHIP_STATE_MASKED,
&data->irq_enabled);
As we are always enabling the interrupt, it is pointless to check if
the interrupt is masked or not, just set irq_enabled to 'true'.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 919054fdfc8adf58c5512fe9872eb53ea0f5525d upstream.
clk_prepare_enable() can fail here and we must check its return value.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 07209fcf33542c1ff1e29df2dbdf8f29cdaacb10 ]
There is a particular situation when the cooling device is cpufreq and the heat
dissipation is not efficient enough where the temperature increases little by
little until reaching the critical threshold and leading to a SoC reset.
The behavior is reproducible on a hikey6220 with bad heat dissipation (eg.
stacked with other boards).
Running a simple C program doing while(1); for each CPU of the SoC makes the
temperature to reach the passive regulation trip point and ends up to the
maximum allowed temperature followed by a reset.
This issue has been also reported by running the libhugetlbfs test suite.
What is observed is a ping pong between two cpu frequencies, 1.2GHz and 900MHz
while the temperature continues to grow.
It appears the step wise governor calls get_target_state() the first time with
the throttle set to true and the trend to 'raising'. The code selects logically
the next state, so the cpu frequency decreases from 1.2GHz to 900MHz, so far so
good. The temperature decreases immediately but still stays greater than the
trip point, then get_target_state() is called again, this time with the
throttle set to true *and* the trend to 'dropping'. From there the algorithm
assumes we have to step down the state and the cpu frequency jumps back to
1.2GHz. But the temperature is still higher than the trip point, so
get_target_state() is called with throttle=1 and trend='raising' again, we jump
to 900MHz, then get_target_state() is called with throttle=1 and
trend='dropping', we jump to 1.2GHz, etc ... but the temperature does not
stabilizes and continues to increase.
[ 237.922654] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 237.922678] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[ 237.922690] thermal cooling_device0: cur_state=0
[ 237.922701] thermal cooling_device0: old_target=0, target=1
[ 238.026656] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 238.026680] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1
[ 238.026694] thermal cooling_device0: cur_state=1
[ 238.026707] thermal cooling_device0: old_target=1, target=0
[ 238.134647] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 238.134667] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[ 238.134679] thermal cooling_device0: cur_state=0
[ 238.134690] thermal cooling_device0: old_target=0, target=1
In this situation the temperature continues to increase while the trend is
oscillating between 'dropping' and 'raising'. We need to keep the current state
untouched if the throttle is set, so the temperature can decrease or a higher
state could be selected, thus preventing this oscillation.
Keeping the next_target untouched when 'throttle' is true at 'dropping' time
fixes the issue.
The following traces show the governor does not change the next state if
trend==2 (dropping) and throttle==1.
[ 2306.127987] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 2306.128009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[ 2306.128021] thermal cooling_device0: cur_state=0
[ 2306.128031] thermal cooling_device0: old_target=0, target=1
[ 2306.231991] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 2306.232016] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1
[ 2306.232030] thermal cooling_device0: cur_state=1
[ 2306.232042] thermal cooling_device0: old_target=1, target=1
[ 2306.335982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
[ 2306.336006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=1
[ 2306.336021] thermal cooling_device0: cur_state=1
[ 2306.336034] thermal cooling_device0: old_target=1, target=1
[ 2306.439984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 2306.440008] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0
[ 2306.440022] thermal cooling_device0: cur_state=1
[ 2306.440034] thermal cooling_device0: old_target=1, target=0
[ ... ]
After a while, if the temperature continues to increase, the next state becomes
2 which is 720MHz on the hikey. That results in the temperature stabilizing
around the trip point.
[ 2455.831982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 2455.832006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=0
[ 2455.832019] thermal cooling_device0: cur_state=1
[ 2455.832032] thermal cooling_device0: old_target=1, target=1
[ 2455.935985] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
[ 2455.936013] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0
[ 2455.936027] thermal cooling_device0: cur_state=1
[ 2455.936040] thermal cooling_device0: old_target=1, target=1
[ 2456.043984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
[ 2456.044009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0
[ 2456.044023] thermal cooling_device0: cur_state=1
[ 2456.044036] thermal cooling_device0: old_target=1, target=1
[ 2456.148001] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 2456.148028] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[ 2456.148042] thermal cooling_device0: cur_state=1
[ 2456.148055] thermal cooling_device0: old_target=1, target=2
[ 2456.252009] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 2456.252041] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0
[ 2456.252058] thermal cooling_device0: cur_state=2
[ 2456.252075] thermal cooling_device0: old_target=2, target=1
IOW, this change is needed to keep the state for a cooling device if the
temperature trend is oscillating while the temperature increases slightly.
Without this change, the situation above leads to a catastrophic crash by a
hardware reset on hikey. This issue has been reported to happen on an OMAP
dra7xx also.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Keerthy <j-keerthy@ti.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Tested-by: Keerthy <j-keerthy@ti.com>
Reviewed-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 289d72afddf83440117c35d864bf0c6309c1d011 upstream.
After the lock is dropped, it is possible that the cpufreq_dev gets
freed before we call get_level() and that can cause kernel to crash.
Drop the lock after we are done using the structure.
Fixes: 02373d7c69b4 ("thermal: cpu_cooling: fix lockdep problems in cpu_cooling")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c592fafbdbb6b1279b76a54722d1465ca77e5bde upstream.
The thermal child device reuses the parent MFD-device device-tree node
when registering a thermal zone, but did not take a reference to the
node.
This leads to a reference imbalance, and potential use-after-free, when
the node reference is dropped by the platform-bus device destructor
(once for the child and later again for the parent).
Fix this by dropping any reference already held to a device-tree node
and getting a reference to the parent's node which will be balanced on
reprobe or on platform-device release, whichever comes first.
Note that simply clearing the of_node pointer on probe errors and on
driver unbind would not allow the use of device-managed resources as
specifically thermal_zone_of_sensor_unregister() claims that a valid
device-tree node pointer is needed during deregistration (even if it
currently does not seem to use it).
Fixes: ec4664b3fd6d ("thermal: max77620: Add thermal driver for reporting junction temp")
Cc: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f37fabb8643eaf8e3b613333a72f683770c85eca upstream.
In the critical sysfs entry the thermal hwmon was returning wrong
temperature to the user-space. It was reporting the temperature of the
first trip point instead of the temperature of critical trip point.
For example:
/sys/class/hwmon/hwmon0/temp1_crit:50000
/sys/class/thermal/thermal_zone0/trip_point_0_temp:50000
/sys/class/thermal/thermal_zone0/trip_point_0_type:active
/sys/class/thermal/thermal_zone0/trip_point_3_temp:120000
/sys/class/thermal/thermal_zone0/trip_point_3_type:critical
Since commit e68b16abd91d ("thermal: add hwmon sysfs I/F") the driver
have been registering a sysfs entry if get_crit_temp() callback was
provided. However when accessed, it was calling get_trip_temp() instead
of the get_crit_temp().
Fixes: e68b16abd91d ("thermal: add hwmon sysfs I/F")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 3105f234e0aba43e44e277c20f9b32ee8add43d4 replaced module
cpu id table with a cpu feature check, which is logically correct.
But we need the module device table to allow module auto loading.
Cc: stable@vger.kernel.org # 4.8
Fixes:3105f234 thermal/powerclamp: correct cpu support check
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Initial logic for checking CPU match resulted in OR of CPU features
rather than the intended AND.
Updated to use boot_cpu_has macro rather than x86_match_cpu.
In addition, MWAIT is the only required CPU feature for idle
injection to work. Drop other feature requirements since they are
only needed for optimal efficiency.
CC: stable@vger.kernel.org #v4.7
Signed-off-by: Eric Ernst <eric.ernst@linux.intel.com>
Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
On the platforms which has an ACPI companion device associated with
PCH thermal device, read passive trip temperature via ACPI _PSV
control method.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Pull thermal managament updates from Zhang Rui:
- Enhance thermal "userspace" governor to export the reason when a
thermal event is triggered and delivered to user space. From Srinivas
Pandruvada
- Introduce a single TSENS thermal driver for the different versions of
the TSENS IP that exist, on different qcom msm/apq SoCs'. Support for
msm8916, msm8960, msm8974 and msm8996 families is also added. From
Rajendra Nayak
- Introduce hardware-tracked trip points support to the device tree
thermal sensor framework. The framework supports an arbitrary number
of trip points. Whenever the current temperature is changed, the trip
points immediately below and above the current temperature are found,
driver callback is invoked to program the hardware to get notified
when either of the two trip points are triggered. Hardware-tracked
trip points support for rockchip thermal driver is also added at the
same time. From Sascha Hauer, Caesar Wang
- Introduce a new thermal driver, which enables TMU (Thermal Monitor
Unit) on QorIQ platform. From Jia Hongtao
- Introduce a new thermal driver for Maxim MAX77620. From Laxman
Dewangan
- Introduce a new thermal driver for Intel platforms using WhiskeyCove
PMIC. From Bin Gao
- Add mt2701 chip support to MTK thermal driver. From Dawei Chien
- Enhance Tegra thermal driver to enable soctherm node and set
"critical", "hot" trips, for Tegra124, Tegra132, Tegra210. From Wei
Ni
- Add resume support for tango thermal driver. From Marc Gonzalez
- several small fixes and improvements for rockchip, qcom, imx, rcar,
mtk thermal drivers and thermal core code. From Caesar Wang, Keerthy,
Rocky Hao, Wei Yongjun, Peter Robinson, Bui Duc Phuc, Axel Lin, Hugh
Kang
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (48 commits)
thermal: int3403: Process trip change notification
thermal: int340x: New Interface to read trip and notify
thermal: user_space gov: Add additional information in uevent
thermal: Enhance thermal_zone_device_update for events
arm64: tegra: set hot trips for Tegra210
arm64: tegra: set critical trips for Tegra210
arm64: tegra: add soctherm node for Tegra210
arm64: tegra: set hot trips for Tegra132
arm64: tegra: set critical trips for Tegra132
arm64: tegra: use tegra132-soctherm for Tegra132
arm: tegra: set hot trips for Tegra124
arm: tegra: set critical trips for Tegra124
thermal: tegra: add hw-throttle for Tegra132
thermal: tegra: add hw-throttle function
of: Add bindings of hw throttle for Tegra soctherm
thermal: mtk_thermal: Check return value of devm_thermal_zone_of_sensor_register
thermal: Add Mediatek thermal driver for mt2701.
dt-bindings: thermal: Add binding document for Mediatek thermal controller
thermal: max77620: Add thermal driver for reporting junction temp
thermal: max77620: Add DT binding doc for thermal driver
...
When ACPI sends notification for trip point change re-read trips and
notify thermal core, so that this can be passed to user space thermal
controller.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Separated the code for reading trip points from int340x_thermal_zone_add to
a standalone function int340x_thermal_read_trips. This standlone
interface to read is exported so that int340x drivers can re-read trips
on ACPI notification for trip point change.
Also the appropriate notification events are sent by int340x driver based
on the acpi event using int340x_thermal_zone_device_update().
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Add additional properties:
NAME= Thermal zone type
TEMP= Temperature sample value
TRIP= Violated trip index
EVENT= The notification event (new temperature sample, trip violation
trip changed)
This is the additional information to what kobject_uevent already
provides. So it will not impact existing user spaces.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Added one additional parameter to thermal_zone_device_update() to provide
caller with an optional capability to specify reason.
Currently this event is used by user space governor to trigger different
processing based on event code. Also it saves an additional call to read
temperature when the event is received.
The following events are cuurently defined:
- Unspecified event
- New temperature sample
- Trip point violated
- Trip point changed
- thermal device up and down
- thermal device power capability changed
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tegra132 use CCROC throttle registers to configure
pulse skiper, set these registers to enable throttle
function for Tegra132.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tegra soctherm support HW throttle, when the soctherm snesors'
temperature is above the throttle trip point, it will trigger
pulse skiper to tune clocks accroding to the throttle depth.
Add this function for Tegra124 and Tegra210.
Since Tegra132 use different registers to configure pulse skiper,
will support it in next patch.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
devm_thermal_zone_of_sensor_register can fail, so check it's return value.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
This patch adds support for mt2701 chip to mtk_thermal,
and integrate both mt8173 and mt2701 on the same driver.
MT8173 has four banks and five sensors, and MT2701 has
only one bank and three sensors.
Signed-off-by: Dawei Chien <dawei.chien@mediatek.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Maxim Semiconductor Max77620 supports alarm interrupts when
its die temperature crosses 120C and 140C. These threshold
temperatures are not configurable.
Add thermal driver to register PMIC die temperature as thermal
zone sensor and capture the die temperature warning interrupts
to notifying the client.
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>