SOC THERMAL AM62 DRAFT

From Variscite Wiki
Revision as of 08:30, 8 May 2023 by Alifer
Warning: This page is designed to be used with a 'release' URL parameter.

This page is using the default release am62-yocto-dunfell-5.10.168_08.06.00.42-v1.0.
To view this page for a specific Variscite SoM and software release, please follow these steps:

  1. Visit variwiki.com
  2. Select your SoM
  3. Select the software release
VAR-SOM-AM62 SoC Thermal Temperature

Introduction

The CPU temperature plays a crucial role in the optimal functioning and reliability of a computer system. As the CPU is the primary processing unit of a computer, it generates a significant amount of heat while executing tasks.
The temperature of the CPU must be within a safe range to ensure that the system operates correctly, without any performance degradation or hardware damage. Therefore, monitoring and managing the CPU temperature is essential to maintaining a stable and efficient computer system.

Temperature management

Thermal management is an essential aspect of ensuring optimal performance of the processor. Although the system has software mechanisms to manage temperature, a good thermal design can improve the dissipation of the heat generated by the processor. Here are some extra components that can be used for temperature management:

  • Fan: fans are used to improve airflow and dissipate heat. When using a fan, it's essential to ensure that it's the appropriate size and has sufficient airflow to remove heat effectively.
  • Thermal Interface: thermal interfaces are used to fill gaps between the processor and heat sink to improve thermal conductivity. There are many types, such as thermal paste, thermal pads, and thermal tapes.
  • Heat Sink: a heat sink is an essential component in a thermal design. It typically consists of a thermally conductive plate with a large surface area, placed in contact with the device that generates heat (for instance, the processor). Heat is transferred from the device to the heat sink, which dissipates it into the surrounding air.
NOTE:

Variscite provides the heat sink on the accessories page:

VAR-SOM-AM62 Heatsink

Voltage and Thermal Manager (VTM)

The VTM module on the AM62x supports voltage and thermal management of the device by providing control of on-chip temperature sensors. The device supports a single VTM module, VTM0, which is located in the WKUP domain. VTM0 has two associated temperature monitors, Temp_Sensor_Main_0 and Temp_Sensor_Main_1, each located near a hotspot on the device die.
The VTM system provides the control, status, and interrupt generation for the various independent temperature sensors located at different hotspots on the SoC. This allows the kernel to take actions based on thermal events configured via the kernel’s device tree.

Linux kernel

From user space, the current temperature reported by each SoC temperature sensor can be read as follows (values are in millidegrees Celsius):

root@am62x-var-som:~# cat /sys/devices/virtual/thermal/thermal_zone*/temp
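Because sysfs reports these values in millidegrees Celsius, a small helper can make the output easier to read. The following is a minimal sketch in POSIX shell; the function names mc_to_c and print_zone_temps are hypothetical, and the optional directory argument is there only so the helper can be pointed at a different tree.

```shell
# Convert an integer in millidegrees Celsius to a decimal string in degrees.
mc_to_c() (
    mc=$1
    sign=""
    if [ "$mc" -lt 0 ]; then
        sign="-"
        mc=$(( 0 - mc ))
    fi
    printf '%s%d.%03d\n' "$sign" $(( mc / 1000 )) $(( mc % 1000 ))
)

# Print one line per thermal zone found under the given base directory
# (defaults to the sysfs path used on this page).
print_zone_temps() (
    base=${1:-/sys/devices/virtual/thermal}
    for zone in "$base"/thermal_zone*; do
        [ -r "$zone/temp" ] || continue
        printf '%s: %s °C\n' "${zone##*/}" "$(mc_to_c "$(cat "$zone/temp")")"
    done
)
```

Calling print_zone_temps with no argument on the target reads the real sysfs tree and prints one line per thermal zone.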


Additionally, there are two thermal trip points that can be configured to control the behavior of the CPU based on temperature changes:

The first trip point, also known as the passive trip point, is set/read in:

root@am62x-var-som:~# cat /sys/devices/virtual/thermal/thermal_zone0/trip_point_0_temp

The passive trip point is the temperature at which the kernel starts to reduce the performance of the SoC to prevent it from overheating. This is done by lowering the CPU frequency and voltage, which reduces the heat generated by the processor. Once the temperature has dropped back below the passive trip point, allowing for the trip's hysteresis (2 °C in the default device tree shown later on this page), the CPU frequency returns to its normal value.
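To observe this throttling in practice, the zone temperature can be logged next to the current CPU frequency. The sketch below assumes the standard cpufreq sysfs file scaling_cur_freq (reported in kHz) is available for cpu0 on your image; the function name sample_temp_and_freq and its optional directory arguments are hypothetical.

```shell
# Print one line with the zone temperature (millidegrees Celsius) and the
# current CPU frequency (kHz). Both directory arguments are optional and
# default to the usual sysfs locations.
sample_temp_and_freq() (
    zone=${1:-/sys/devices/virtual/thermal/thermal_zone0}
    cpufreq=${2:-/sys/devices/system/cpu/cpu0/cpufreq}
    temp=$(cat "$zone/temp") || exit 1
    freq=$(cat "$cpufreq/scaling_cur_freq") || exit 1
    printf 'temp_mC=%s freq_kHz=%s\n' "$temp" "$freq"
)
```

On the target, running it in a loop such as `while sleep 1; do sample_temp_and_freq; done` while the board is under load should show the frequency dropping once the passive trip point is reached.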

The second trip point, the critical trip point, is read/set in:

root@am62x-var-som:~# cat /sys/devices/virtual/thermal/thermal_zone0/trip_point_1_temp

The critical trip point is the temperature threshold at which the kernel will initiate an emergency shutdown to prevent damage to the processor due to overheating. This is a last-resort mechanism to protect the system from damaging itself.
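Since the index of a trip point depends on how the thermal zone is described to the kernel, it can be worth confirming which index is passive and which is critical rather than assuming it. Each trip point also exposes a trip_point_N_type file in sysfs. The following sketch lists every trip point of a zone with its type and temperature; the function name list_trip_points is hypothetical.

```shell
# List all trip points of a thermal zone with their type and temperature
# (millidegrees Celsius). The zone directory argument is optional.
list_trip_points() (
    zone=${1:-/sys/devices/virtual/thermal/thermal_zone0}
    for f in "$zone"/trip_point_*_type; do
        [ -r "$f" ] || continue
        # Extract the index N from ".../trip_point_N_type".
        idx=${f##*/trip_point_}
        idx=${idx%_type}
        temp=$(cat "$zone/trip_point_${idx}_temp")
        printf 'trip_point_%s: type=%s temp_mC=%s\n' "$idx" "$(cat "$f")" "$temp"
    done
)
```

Running list_trip_points with no argument inspects thermal_zone0; pass another zone directory to inspect the second sensor.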

Defining the default trip point values via the Linux kernel device tree

VAR-SOM-AM62 device tree

CAUTION:

Be very careful when changing the thermal trip points of your SoC manually. In particular, never apply the trip points of a higher-grade CPU to a lower-grade CPU, as this can damage the device. Make sure you understand the thermal characteristics of your device before modifying the trip points, since incorrect values can cause damage or instability.

The am62x-var-som temperature trip points can be changed by editing the following nodes in its device tree:

thermal_zones: thermal-zones {
	main0_thermal: main0-thermal {
		polling-delay-passive = <250>; /* milliseconds */
		polling-delay = <500>; /* milliseconds */
		thermal-sensors = <&wkup_vtm0 0>;

		trips {
			main0_crit: main0-crit {
				temperature = <105000>; /* milliCelsius */
				hysteresis = <2000>; /* milliCelsius */
				type = "critical";
			};

			main0_alert: main0-alert {
				temperature = <85000>; /* milliCelsius */
				hysteresis = <2000>; /* milliCelsius */
				type = "passive";
			};
		};
	};

	main1_thermal: main1-thermal {
		polling-delay-passive = <250>; /* milliseconds */
		polling-delay = <500>; /* milliseconds */
		thermal-sensors = <&wkup_vtm0 1>;

		trips {
			main1_crit: main1-crit {
				temperature = <105000>; /* milliCelsius */
				hysteresis = <2000>; /* milliCelsius */
				type = "critical";
			};

			main1_alert: main1-alert {
				temperature = <85000>; /* milliCelsius */
				hysteresis = <2000>; /* milliCelsius */
				type = "passive";
			};
		};
	};
};

These nodes instruct the kernel to poll each temperature sensor periodically and to shut down the SoC once its temperature exceeds 105 degrees Celsius. At 85 degrees Celsius, the kernel reduces the CPU frequency using DFS through the cpufreq-dt driver.
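After rebuilding and booting with a modified device tree, the values the running kernel actually loaded can be checked under /proc/device-tree, where each property is stored as a raw big-endian 32-bit cell. The sketch below decodes one such cell; the function name read_dt_cell is hypothetical, the default path is derived from the node names shown above, and it assumes the od on your image supports the -An, -t and -N options.

```shell
# Decode a single big-endian 32-bit device tree cell into a decimal number.
read_dt_cell() (
    prop=${1:-/proc/device-tree/thermal-zones/main0-thermal/trips/main0-crit/temperature}
    # od prints the first four bytes as unsigned decimals, e.g. "0 1 154 40".
    set -- $(od -An -tu1 -N4 "$prop")
    [ $# -eq 4 ] || exit 1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)
```

With the default values shown above, reading the main0-crit temperature property should give 105000, and the main0-alert one 85000.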

powertop package

In addition, the powertop software package can be used to identify which processes consume the most power. It is a Linux tool for diagnosing issues with power consumption and power management.

To install powertop, add the following line to the conf/local.conf file in your Yocto build:

IMAGE_INSTALL_append = " powertop"
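Once the image includes powertop, it can be run non-interactively to produce a report for later inspection. The sketch below wraps powertop's --html and --time options; the function name powertop_report, the default output path, and the default duration are hypothetical, and powertop generally needs to be run as root.

```shell
# Generate an HTML power report. Arguments: output file, duration in seconds.
powertop_report() (
    out=${1:-/tmp/powertop.html}
    secs=${2:-20}
    if ! command -v powertop >/dev/null 2>&1; then
        echo "powertop is not installed in this image" >&2
        exit 1
    fi
    powertop --html="$out" --time="$secs"
)
```

For example, `powertop_report /tmp/report.html 30` would measure for 30 seconds and write the report to /tmp/report.html, which can then be copied off the target and opened in a browser.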