Telemetry on PC: How to Measure Temperature and Hardware Consumption

The hardware that marketing usually promotes is memory and processors of all kinds, but a PC is complex enough that a single fault can bring down the whole system. This is where what we can call telemetry, or monitoring, systems come in: they track the temperature and power consumption of your PC's components.
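
As a concrete illustration of reading this kind of telemetry from software, here is a minimal sketch that assumes a Linux system exposing its sensors under `/sys/class/thermal`, where each `thermal_zone*` directory holds a `type` file and a `temp` file in thousandths of a degree Celsius. The function name `read_thermal_zones` is made up for this example, and other operating systems expose the same data through different interfaces.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {"zone:type": degrees Celsius} for every readable thermal zone."""
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            kind = (zone / "type").read_text().strip()
            millidegrees = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            # Zone is missing a file, unreadable, or reporting a non-number.
            continue
        # The kernel reports millidegrees Celsius; convert to degrees.
        readings[f"{zone.name}:{kind}"] = millidegrees / 1000.0
    return readings


if __name__ == "__main__":
    for label, celsius in read_thermal_zones().items():
        print(f"{label}: {celsius:.1f} C")
```

On a machine without that directory the function simply returns an empty dictionary, so the sketch fails quietly rather than crashing.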

One of the key concerns in semiconductor design today is everything related to power consumption and the heat given off by the components, since excessive power consumption generates excessive heat that can shorten a component's life or even render it permanently useless.

Another reason concerns power consumption itself. Many designs rely on techniques such as splitting the chip into separate power domains, so that when a block is not in use its power supply is switched off and it stops working. Others lower the clock speed when the workload is light and raise it when the workload is heavy.
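
The two techniques just described, switching off an unused block and scaling the clock with the workload, can be sketched in a few lines. This is only a toy model: the clock steps, the `PowerDomain` class and the `apply_policy` function are hypothetical names and values chosen for illustration, not those of any real processor.

```python
from dataclasses import dataclass

# Hypothetical clock steps in MHz, ordered from slowest to fastest.
CLOCK_STEPS_MHZ = (800, 1600, 2400, 3200)


@dataclass
class PowerDomain:
    name: str
    powered: bool = True
    clock_mhz: int = CLOCK_STEPS_MHZ[0]


def apply_policy(domain, utilization):
    """Update one power domain from its utilization (0.0 to 1.0)."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    if utilization == 0.0:
        # Power gating: an unused block is switched off entirely.
        domain.powered = False
        domain.clock_mhz = 0
        return domain
    domain.powered = True
    # Frequency scaling: pick the slowest step that still covers the demand.
    needed_mhz = utilization * CLOCK_STEPS_MHZ[-1]
    domain.clock_mhz = next(s for s in CLOCK_STEPS_MHZ if s >= needed_mhz)
    return domain
```

For instance, `apply_policy(PowerDomain("gpu"), 0.4)` leaves the domain powered at 1600 MHz, while a utilization of `0.0` switches it off.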

But for processors to adapt in this way they need real-time information on the temperature and voltage of the different components, so that they can adjust their clock speeds and switch parts of the hardware on and off, whether at the SoC level or across several components on one board.

What are telemetry systems and where are they located?

In reality, telemetry systems are nothing more than small chips that act as digital thermometers and/or voltmeters. They take continuous measurements of the hardware they are connected to and pass that information to a set of microcontrollers, which use the telemetry to manage clock speeds and voltage and can even turn off parts of the processor.
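
The path from sensor reading to microcontroller decision can be pictured as a simple control loop. The sketch below is a simplified illustration, not the algorithm of any real management unit: the thresholds, the step size and the names `next_action` and `run_controller` are all assumptions made for the example.

```python
def next_action(temp_c, clock_mhz, *, target_c=85.0, critical_c=100.0,
                hysteresis_c=5.0, step_mhz=100, min_mhz=800, max_mhz=4000):
    """Return (action, new_clock_mhz) for a single temperature sample."""
    if temp_c >= critical_c:
        # Dangerous reading: stop the hardware altogether.
        return "shutdown", 0
    if temp_c > target_c:
        # Too hot: step the clock down, but never below the floor.
        return "throttle", max(min_mhz, clock_mhz - step_mhz)
    if temp_c < target_c - hysteresis_c:
        # Comfortably cool: step the clock up, but never above the ceiling.
        return "boost", min(max_mhz, clock_mhz + step_mhz)
    # Inside the hysteresis band: leave the clock alone to avoid oscillation.
    return "hold", clock_mhz


def run_controller(samples_c, clock_mhz=3000):
    """Feed a sequence of temperature samples through the policy."""
    history = []
    for temp_c in samples_c:
        action, clock_mhz = next_action(temp_c, clock_mhz)
        history.append((temp_c, action, clock_mhz))
        if action == "shutdown":
            break  # nothing further is processed after an emergency stop
    return history
```

Calling `run_controller([70, 90, 83, 101, 60])` boosts, throttles, holds and then shuts down on the fourth sample, so the final reading is never processed. The hysteresis band is a design choice that keeps the clock from flapping when the temperature hovers around the target.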

Their location varies: they can be found inside the same chip or as external components, depending on the specifications and purpose of each type of processor. In fact, most SoCs today contain several hardware monitoring systems that send their telemetry data to the different microcontrollers.

These are extremely important in SoCs, where the proximity of the components produces what we call thermal choking: because the parts are so tightly integrated, they cannot reach the same clock speeds they would reach as separate chips. That is why it is essential for the voltage and temperature monitoring systems to sit inside the SoC.

What is a microcontroller?

A microcontroller is itself a computer on a chip, with a much higher level of integration than a SoC, because both the processing units and the RAM are integrated on the same chip. It communicates with the outside world only through a set of I/O pins, which are also used to load the program that it then executes in a continuous loop.

Microcontrollers began to be used in PCs starting with the 1983 IBM PC XT, where an Intel 8048 in the keyboard worked alongside the machine's 8088 CPU. Over time they became more complex and took over various background tasks, such as managing the power and temperature of processors.

The reason microcontrollers are used rather than microprocessors is that they do not share RAM with the CPU. This avoids contention for memory access and also keeps malicious code from reaching that memory. However, firmware updates are loaded from certain addresses in system RAM before being copied to each microcontroller's own RAM during startup.

An example of a microcontroller for telemetry: the AMD SMU

In many diagrams of AMD SoCs, CPUs and GPUs you will have seen a block labelled SMU and shrugged your shoulders, not knowing what it is or what it does. If we read the official AMD documentation on the SMU, we find the following statement:

The System Management Unit (SMU) is a subcomponent of the northbridge that is responsible for various power management tasks during power-up and while the PC is running, and it includes a microcontroller to assist with those tasks.

Bear in mind that ever since AMD's first x86-64 processors, what we call the northbridge, the hardware in charge of connecting the CPU to system RAM, has been located inside the processor, so the SMU unit or units are located within the processor itself.

AMD uses the SMU not only in its CPUs but also in its GPUs. It is a Lattice LM32 microcontroller, which AMD uses under license, and it is responsible for managing everything related to power consumption at all times. Over time AMD has evolved the design, and there are now several SMUs serving the different cores.

For example, in the Ryzen 5000 for laptops, AMD has added a system management unit to manage the power consumption of each of the processor's Zen 3 cores. Each core has its own power domain, so its clock speed and voltage can change either in step with the other cores or independently of them.
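
To make the difference between synchronous and independent operation concrete, here is a small sketch that maps per-core load to a clock speed. The clock range, the linear mapping and the function name `plan_core_clocks` are assumptions for illustration only and do not describe how AMD's hardware actually chooses its frequencies.

```python
def plan_core_clocks(core_loads, *, synchronous, min_mhz=1200, max_mhz=4400):
    """Map each core's load (0.0 to 1.0) to a clock speed in MHz."""
    if any(not 0.0 <= load <= 1.0 for load in core_loads):
        raise ValueError("each load must be between 0.0 and 1.0")
    if not core_loads:
        return []

    def clock_for(load):
        # Linear interpolation between the lowest and highest clock.
        return round(min_mhz + load * (max_mhz - min_mhz))

    if synchronous:
        # One shared setting: every core follows the busiest core.
        return [clock_for(max(core_loads))] * len(core_loads)
    # Independent domains: each core gets a clock matching its own load.
    return [clock_for(load) for load in core_loads]
```

With loads of `[0.0, 0.5, 1.0]`, the independent mode yields 1200, 2800 and 4400 MHz, whereas the synchronous mode runs all three cores at 4400 MHz. The idle and half-loaded cores are the ones that benefit from having their own domains.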

Intel's counterpart is the so-called Management Engine (ME), which plays a comparable role. Both the AMD SMU and the Intel ME have the particularity of running at a privilege level above the processor itself, which lets them halt the CPU and the rest of the components outright if a situation arises that is dangerous for the PC.