How Does the Set of Registers and Instructions of a CPU Work?

Have you ever wondered why a program compiled for x86 CPUs in PCs does not run on an ARM CPU, for example? What makes a program compatible with one processor family and not with others? What are registers, and what are they for? In this article we explain what the registers and instructions of a CPU are.

We have often mentioned that a processor has a specific instruction set, or that a set of instructions has been added to a new processor. Let’s see what this means.

Instructions of a CPU

What is an instruction?

An instruction is nothing more than an action that we ask a processor to carry out. Instructions can be arithmetic operations on different types of data (integer or floating point, scalar or vector), logical operations, data movement operations, bit manipulation operations (where bits change position, as in shifts and rotations), jump operations, and so on.

These instructions are further divided into subtypes depending on where the data is located. For example, some instructions operate on data already held in registers, while others must specify the memory address where the data is located (direct mode) or the memory address that holds the address of the data (indirect mode).
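As a minimal sketch of the difference between these operand modes, the Python toy below models registers as a dictionary and memory as a list. The names `registers`, `memory` and `fetch_operand` are hypothetical and do not correspond to any real ISA.

```python
# Toy model: register names and memory layout are illustrative assumptions.
registers = {"r0": 7, "r1": 3}
memory = [0] * 16
memory[3] = 9     # cell 3 holds the address of another cell
memory[5] = 11    # cell 5 holds data directly
memory[9] = 42    # cell 9 holds the data that cell 3 points to


def fetch_operand(mode, operand):
    """Return the value an instruction would operate on."""
    if mode == "register":   # the data is already in a register
        return registers[operand]
    if mode == "direct":     # the operand is the address of the data
        return memory[operand]
    if mode == "indirect":   # the operand is the address of the address
        return memory[memory[operand]]
    raise ValueError(f"unknown addressing mode: {mode}")
```

With this layout, `fetch_operand("register", "r0")` returns 7, `fetch_operand("direct", 5)` returns 11, and `fetch_operand("indirect", 3)` follows cell 3 to cell 9 and returns 42.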

How the CPU reads instructions in binary code

Regardless of which CPU our system uses, every CPU reads binary code in the particular way defined by its family. It takes a certain number of bits from the binary code it is executing and interprets their meaning according to how they are arranged. Broadly speaking, an instruction is encoded as follows: the first bits correspond to the instruction code (opcode) and how it is to be executed, and the remaining bits are either the data itself or the location of the data on which the instruction operates.

The set of registers and instructions of a CPU is called its ISA (Instruction Set Architecture). All processors that share an ISA use the same instruction encoding and can therefore run the same binary code.
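The idea of splitting an instruction into an instruction code and an operand can be sketched with a few bit operations. The 16-bit layout below is an invented example, not the format of any real processor: 4 bits of opcode, 2 bits describing how the operand is to be interpreted, and 10 bits of operand.

```python
# Hypothetical 16-bit instruction word (not a real ISA):
#   bits 15-12: opcode, bits 11-10: addressing mode, bits 9-0: operand
def decode(word):
    """Split an instruction word into (opcode, mode, operand)."""
    opcode = (word >> 12) & 0xF
    mode = (word >> 10) & 0x3
    operand = word & 0x3FF
    return opcode, mode, operand


def encode(opcode, mode, operand):
    """Pack the three fields back into a single instruction word."""
    return ((opcode & 0xF) << 12) | ((mode & 0x3) << 10) | (operand & 0x3FF)
```

For instance, `encode(2, 1, 5)` yields `0x2405`, and `decode(0x2405)` recovers `(2, 1, 5)`. A processor from a different family would slice the same bits differently, which is why the same binary means nothing to it.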

Relationship between the instruction set and assembly language

[Image: table of x86 instructions and their instruction codes]

Every processor family has its own assembly language, whose instructions have a 1:1 correlation with the registers and instructions of that family. The table above shows the relationship between several x86 assembly language instructions and their instruction codes, which are expressed in hexadecimal.

Keep in mind that new instructions are continually being added to ISAs, so very recent programs that explicitly use them only work on processors that support them. In general, instruction sets change little over time, but now and then instructions are introduced for specific markets, and these either end up as part of the standard or are later discarded.

There are also cases in which new instructions are more efficient than existing ones, yet the older instructions are not removed from the set because a large amount of software on the market depends on them.
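To illustrate the correspondence between mnemonics and instruction codes, here is a small sketch limited to a handful of single-byte x86 instructions that take no operands. The helper names `assemble` and `disassemble` are hypothetical, and real x86 encoding (variable length, prefixes, operand bytes) is far more involved than a lookup table.

```python
# A few single-byte x86 opcodes that take no operands.
OPCODES = {"NOP": 0x90, "RET": 0xC3, "HLT": 0xF4, "CLC": 0xF8, "STC": 0xF9}
MNEMONICS = {code: name for name, code in OPCODES.items()}


def assemble(lines):
    """Turn a list of mnemonics into machine code bytes."""
    return bytes(OPCODES[line.strip().upper()] for line in lines)


def disassemble(code):
    """Turn machine code bytes back into mnemonics."""
    return [MNEMONICS[byte] for byte in code]
```

Here `assemble(["nop", "ret"])` produces the bytes `90 C3`, and `disassemble(b"\xf4\x90")` gives back `["HLT", "NOP"]`.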
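One common way for software to cope with instructions that only some processors support is to check for them at run time and fall back to an older routine otherwise. The sketch below only imitates that pattern: the feature name `vector_ext` and all three functions are hypothetical, and a real program would query the CPU itself (on x86, through the CPUID instruction) rather than receive a set of strings.

```python
# All names are hypothetical; the set of features stands in for a real CPU query.
def sum_baseline(values):
    """Routine that relies only on instructions every processor has."""
    total = 0
    for value in values:
        total += value
    return total


def sum_wide(values):
    """Stand-in for a routine built on a newer instruction set extension."""
    return sum(values)


def pick_sum(cpu_features):
    """Choose the implementation the current processor can execute."""
    return sum_wide if "vector_ext" in cpu_features else sum_baseline
```

Both routines return the same result; only the instructions they would use differ, which is also why the older ones must stay in the ISA for existing software.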

RISC vs CISC vs Post-RISC

RISC processors have few, simple instructions, so complex operations must be built from several of them, but in return each instruction executes quickly because it is so lightweight. CISC processors, on the other hand, have much more complex instruction sets that require more complex hardware, but in exchange they can do the same work with fewer instructions.

This distinction, although controversial in its day, matters far less now. Since the Pentium Pro appeared on the PC we have been in the Post-RISC era: programs are still written against the ISA's registers and instructions, but the processor internally converts those instructions into simpler micro-operations as it decodes them. This allows CISC architectures to behave like RISC architectures and reach high clock speeds while still accepting complex instructions.
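The conversion of a complex instruction into simpler internal steps can be pictured with a toy decoder. The micro-operation names (`load`, `add`, `store`), the temporary register `tmp` and the function `crack` are assumptions made for this sketch; real decoders and their internal formats are proprietary and much more elaborate.

```python
# Hypothetical micro-operations, for illustration only.
def crack(instruction):
    """Split an instruction into simpler steps when it touches memory."""
    op, dst, src = instruction
    if op == "add" and dst.startswith("["):
        addr = dst.strip("[]")
        return [
            ("load", "tmp", addr),    # read the memory operand
            ("add", "tmp", src),      # do the arithmetic on registers only
            ("store", addr, "tmp"),   # write the result back to memory
        ]
    return [instruction]              # already simple: pass through unchanged
```

An instruction such as `("add", "[0x10]", "eax")` becomes three simple steps, while `("add", "ebx", "eax")` is left as a single one.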

The registers of a CPU

Registers are the memory closest to the processor and therefore the fastest. They are very small memories that the processor's control unit can manipulate directly. They are used for all kinds of common tasks, not only for arithmetic operations.

The most common registers in a CPU regardless of its ISA are:

  • Accumulator-type registers: used for arithmetic operations. Each family has a different number of accumulator-type registers.
  • Memory access registers: contain the memory address of the data we want to access in RAM.
  • Memory data registers: contain data copied from memory (read) or data to be written to a specific memory address (write).
  • General purpose registers: registers with no fixed role that hold data so it can be retrieved as quickly as possible.
  • Program counter: indicates the next instruction to execute; jump instructions modify it when execution should continue at another part of the program rather than at the next instruction. In each complete instruction cycle its value is incremented to point to the following instruction, and it drives the processor’s address bus when that instruction is fetched.
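How the program counter and an accumulator cooperate can be sketched with a tiny interpreter. The machine below is invented for illustration: it has a single accumulator, four made-up instructions, and a program counter that is simply an index into a Python list.

```python
# Toy machine: the instruction names and single-accumulator design are assumptions.
def run(program):
    pc = 0     # program counter: index of the next instruction
    acc = 0    # accumulator register
    while pc < len(program):
        op, arg = program[pc]
        pc += 1                   # by default, advance to the following instruction
        if op == "load":
            acc = arg
        elif op == "add":
            acc += arg
        elif op == "jump_if_not_zero":
            if acc != 0:
                pc = arg          # a jump overwrites the program counter
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown instruction: {op}")
    return acc


countdown = [
    ("load", 3),
    ("add", -1),
    ("jump_if_not_zero", 1),   # loop back to the "add" while acc is not zero
    ("halt", None),
]
```

Running `run(countdown)` loops over the `add` instruction three times and returns 0, whereas a straight-line program such as `[("load", 5), ("add", 2)]` returns 7 without the program counter ever being overwritten.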

Some CPU registers, such as the program counter, which indicates the next memory address the processor will fetch from, are found in all CPUs and in other kinds of processors able to execute programs. Other registers are unique to each ISA, which makes a 1:1 correspondence between different ISAs nearly impossible.

Even if we had a 1:1 converter for the instruction codes, we would still have problems: although two processors may both have an addition instruction, the registers they use and the way they use them can differ, and some registers exist in one family but not in others. Microsoft and Qualcomm both ran into these difficulties when adapting Windows 10 to ARM so that x86 applications would run smoothly on an ARM CPU.

However, there are solutions, such as instruction translation software. This software converts the binary code into an intermediate code and then converts that into the binary code of the target processor on which we want to run the application. This process is much slower than native execution, and it is mainly recommended for running very old software written for processor families that are no longer on the market.
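The two-step translation described above can be outlined in a few lines. Both "ISAs" and the intermediate form in this sketch are invented, and it deliberately ignores operands and register differences, which are the hard part of any real translator.

```python
# Both instruction encodings and the intermediate names are invented for illustration.
SOURCE_TO_IR = {0x01: "ADD", 0x02: "SUB", 0x03: "JUMP"}
IR_TO_TARGET = {"ADD": 0xA0, "SUB": 0xA1, "JUMP": 0xB0}


def translate(source_code):
    """Translate source binary to target binary through an intermediate code."""
    intermediate = [SOURCE_TO_IR[byte] for byte in source_code]   # source -> IR
    return bytes(IR_TO_TARGET[op] for op in intermediate)          # IR -> target
```

Here `translate(bytes([0x01, 0x02, 0x03]))` yields the bytes `A0 A1 B0`. Going through an intermediate form means each source and target family needs only one mapping to and from it, at the cost of an extra step for every instruction.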