Thermal management on the i.MX8QXP SOC is done by the System Control Unit (SCU). The Thermal Monitor Unit (TMU) has a sensor that measures the die temperature with 1 °C resolution and uses software calibration.

The NXP PF8200 PMIC also has a temperature sensor. The PMIC sensor is not very accurate: it only reports temperature ranges in 15 °C steps, starting from 70 °C.

Kernel configuration

You can manage the thermal support through the kernel configuration option:

  • Temperature sensor driver for NXP i.MX SoCs with System Controller (CONFIG_IMX_SC_THERMAL)

This option is enabled as built-in in the default ConnectCore 8X kernel configuration file.

Kernel driver

File                                Description
drivers/thermal/imx_sc_thermal.c    Thermal driver

Device tree bindings and customization

Definition of the thermal sensor of the SCU

i.MX8QXP device tree
	scu {
		[...]

		tsens: thermal-sensor {
			compatible = "fsl,imx8qxp-sc-thermal";
			tsens-num = <2>;
			#thermal-sensor-cells = <1>;
		};
	};

Definition of the thermal zones, trip points, and cooling devices

i.MX8QXP SOC + ConnectCore 8X device tree
	thermal_zones: thermal-zones {
		cpu-thermal0 {
			polling-delay-passive = <250>;
			polling-delay = <2000>;
			thermal-sensors = <&tsens 355>;
			trips {
				cpu_alert0: trip0 {
					temperature = <85000>;
					hysteresis = <2000>;
					type = "passive";
				};
				cpu_crit0: trip1 {
					temperature = <100000>;
					hysteresis = <2000>;
					type = "critical";
				};
			};
			cooling-maps {
				map0 {
					trip = <&cpu_alert0>;
					cooling-device =
						<&A35_0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
						<&A35_1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
				};
			};
		};

		pmic-thermal0 {
			polling-delay-passive = <250>;
			polling-delay = <2000>;
			thermal-sensors = <&tsens 497>;
			trips {
				pmic_alert0: trip0 {
					temperature = <85000>;
					hysteresis = <2000>;
					type = "passive";
				};
				pmic_crit0: trip1 {
					temperature = <100000>;
					hysteresis = <2000>;
					type = "critical";
				};
			};
			cooling-maps {
				map0 {
					trip = <&pmic_alert0>;
					cooling-device =
						<&A35_0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
						<&A35_1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
				};
			};
		};
	};

Usage

CPU temperature

To check the current temperature of the CPU (thermal zone 0):

~# cat /sys/class/thermal/thermal_zone0/temp
42700

The command returns the temperature in millicelsius.
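For display purposes, the raw millicelsius value can be converted to degrees Celsius. The following is a minimal sketch, not part of the BSP: the function name read_temp_c is hypothetical, and it assumes an awk implementation is installed on the target.

```shell
# read_temp_c [FILE]
# Print the millicelsius value stored in FILE as degrees Celsius with
# one decimal place. FILE defaults to the CPU thermal zone used above.
# (read_temp_c is a hypothetical helper name, not a BSP command.)
read_temp_c() {
    file=${1:-/sys/class/thermal/thermal_zone0/temp}
    # The sysfs file holds a single integer, for example 42700.
    awk '{ printf "%.1f\n", $1 / 1000 }' "$file"
}
```

With the reading shown above (42700), calling read_temp_c with no arguments would print 42.7.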

To check the current temperature of the PMIC (thermal zone 1):

~# cat /sys/class/thermal/thermal_zone1/temp
70000

This sensor is not accurate and returns ranges of temperatures (every 15 °C), starting from 70 °C.
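To confirm which zone number corresponds to which sensor, each thermal zone directory also exposes a standard type attribute. The sketch below lists every zone with its type and current reading. The function name list_zones is hypothetical, and the strings reported by type depend on the kernel and device tree in use.

```shell
# list_zones [BASE]
# Print "zone type millicelsius" for every thermal zone found under
# BASE (default: /sys/class/thermal).
# (list_zones is a hypothetical helper name, not a BSP command.)
list_zones() {
    base=${1:-/sys/class/thermal}
    for zone in "$base"/thermal_zone*; do
        # Skip the unexpanded glob pattern and zones without a reading.
        [ -r "$zone/temp" ] || continue
        printf '%s %s %s\n' "${zone##*/}" \
            "$(cat "$zone/type")" "$(cat "$zone/temp")"
    done
}
```

Each output line has three space-separated fields, so the result is easy to filter with standard tools such as grep.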

Trip points

A trip point describes a point in the temperature domain at which the system takes an action. This node describes just the point, not the action.

The Linux thermal subsystem establishes several types of trip points:

  • passive: a trip point to enable passive cooling (such as decreasing clock frequency).

  • active: a trip point to enable active cooling (such as activating fans).

  • hot: a trip point to indicate that an emergency temperature threshold has been reached.

  • critical: a trip point where hardware is at risk.

The device tree defines two trip points (the same for the CPU thermal zone and for the PMIC thermal zone):

Trip point type    Temperature
Passive            85 °C
Critical           100 °C

The maximum temperature supported by the chip depends on the thermal grade of the SOC:

  • Commercial: 95 °C

  • Industrial: 105 °C

Check the thermal grade of your SOC in the U-Boot banner during boot.

Passive trip point

When the temperature in the SOC reaches the passive trip point temperature, the SOC generates an interrupt and the driver sends a notification. Other drivers may subscribe to such notifications in order to trigger cooling actions, such as reducing their clock frequency.

On the current BSP, the GPU driver subscribes to the temperature monitor to lower the GPU frequency when the passive trip point is reached. Expect a performance impact on graphical applications when this happens.

Besides subscriptions, devices declared in the device tree as cooling devices and linked to this trip point will take passive actions. This is the case for the Cortex-A35 cores, which reduce their frequency when they reach the passive trip point.

The device tree defines a hysteresis of 2 °C for the passive trip point. This means that the system is considered to be within normal parameters, and the cooling actions can be cancelled, only after the die temperature has dropped 2 °C below the passive trip point.
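The effect of the hysteresis can be illustrated with a small state function. This is a simplified model of the behavior described above, not the kernel's governor code; the name next_state and the state labels are hypothetical.

```shell
# next_state STATE TEMP TRIP HYST
# Simplified model of a passive trip point with hysteresis.
# STATE is the current state ("normal" or "throttled"); TEMP, TRIP and
# HYST are in millicelsius. Prints the resulting state.
# (next_state is a hypothetical name used only for this illustration.)
next_state() {
    state=$1 temp=$2 trip=$3 hyst=$4
    if [ "$temp" -ge "$trip" ]; then
        # At or above the trip point: cooling actions apply.
        echo throttled
    elif [ "$state" = throttled ] && [ "$temp" -gt $((trip - hyst)) ]; then
        # Below the trip point but still inside the hysteresis band.
        echo throttled
    else
        echo normal
    fi
}
```

With the default values (trip at 85000, hysteresis of 2000), a system that is already throttled stays throttled at 84000 and returns to normal at 83000, while a system that has not yet reached the trip point remains normal at 84000.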

To read the passive trip point parameters:

~# cat /sys/class/thermal/thermal_zone0/trip_point_0_type
passive
~# cat /sys/class/thermal/thermal_zone0/trip_point_0_hyst
2000
~# cat /sys/class/thermal/thermal_zone0/trip_point_0_temp
85000
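Every trip point of a zone exposes the same three attributes, numbered from 0, so a loop can print them all at once. The following is a sketch under that assumption; the function name list_trips is hypothetical.

```shell
# list_trips [ZONE_DIR]
# Print "index type temp hyst" (temperatures in millicelsius) for each
# trip point of the zone, stopping at the first missing index.
# (list_trips is a hypothetical helper name, not a BSP command.)
list_trips() {
    zone=${1:-/sys/class/thermal/thermal_zone0}
    i=0
    while [ -r "$zone/trip_point_${i}_type" ]; do
        printf '%s %s %s %s\n' "$i" \
            "$(cat "$zone/trip_point_${i}_type")" \
            "$(cat "$zone/trip_point_${i}_temp")" \
            "$(cat "$zone/trip_point_${i}_hyst")"
        i=$((i + 1))
    done
}
```

Passing the directory of another zone as the argument prints the trip points of that zone instead.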

To set a different temperature for the passive trip point, write the new temperature (in millicelsius) to the trip point temperature descriptor:

~# echo 65000 > /sys/class/thermal/thermal_zone0/trip_point_0_temp
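When changing trip points from a script, it can help to validate the value and read it back afterwards. The helper below is a minimal sketch; set_trip_temp is a hypothetical name, and the writability check assumes that a kernel which does not allow changing trip points exposes the attribute as read-only.

```shell
# set_trip_temp ZONE_DIR INDEX MILLICELSIUS
# Write a new temperature to a trip point and verify that the value
# was accepted. Returns non-zero on any failure.
# (set_trip_temp is a hypothetical helper name, not a BSP command.)
set_trip_temp() {
    file="$1/trip_point_${2}_temp"
    new=$3
    # Accept only a non-negative integer number of millidegrees.
    case $new in
        ''|*[!0-9]*) echo "invalid temperature: $new" >&2; return 1 ;;
    esac
    if [ ! -w "$file" ]; then
        echo "$file is not writable" >&2
        return 1
    fi
    echo "$new" > "$file" || return 1
    # Read the value back to confirm that it was stored.
    [ "$(cat "$file")" = "$new" ]
}
```

For example, set_trip_temp /sys/class/thermal/thermal_zone0 0 65000 performs the same change as the command above and returns a non-zero status if the write is rejected.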

Critical trip point

When the SOC temperature reaches the critical trip point temperature, the SOC generates an interrupt and the driver shuts down the system to prevent damage to the silicon.

To read the critical trip point parameters:

~# cat /sys/class/thermal/thermal_zone0/trip_point_1_type
critical
~# cat /sys/class/thermal/thermal_zone0/trip_point_1_hyst
2000
~# cat /sys/class/thermal/thermal_zone0/trip_point_1_temp
100000

To set a different temperature for the critical trip point, write the new temperature (in millicelsius) to the trip point temperature descriptor:

~# echo 90000 > /sys/class/thermal/thermal_zone0/trip_point_1_temp