Stages in the Design of New CPUs and GPUs from Intel, AMD and NVIDIA

Everything in this life follows a methodology, especially when it comes to developing new products: a series of protocols aimed at reaching a clear result in the shortest possible time. CPU design is no exception, which is why we are going to describe the different stages that are followed when designing a CPU.

If we take a CPU from a few years ago and compare it with a new one, at first glance they will look the same, but that is not the case. The greater complexity of newer designs requires many more working hours and, with them, a larger and more qualified staff. It is true that each new manufacturing node allows more transistors to be placed, but this also involves more work, and since development schedules are unforgiving, the design of a new chip has to follow a series of orderly stages.

It all starts with choosing the manufacturing node

The first stage is not carried out by the engineers but by the managers, who negotiate multi-million dollar contracts with the different foundries and make forecasts about production volumes and costs.

The specifications and costs of the manufacturing node tell the architects how much area they will have for the chip, what their transistor budget is and how much development time they have to create the new CPU design.
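As a rough illustration of how the node fixes the transistor budget, the sketch below multiplies a die area by a logic density. The function names and every figure in it are hypothetical placeholders, not data for any real node or product, and a real budget would also account for yield, SRAM versus logic density and analog blocks.

```python
# Rough transistor budget: die area times the node's logic density.
# All names and numbers are illustrative assumptions, not real node data.

def transistor_budget_millions(die_area_mm2: float, density_mtr_per_mm2: float) -> float:
    """Approximate budget in millions of transistors (MTr)."""
    return die_area_mm2 * density_mtr_per_mm2


def fits_budget(planned_blocks_millions: dict, budget_millions: float) -> bool:
    """True if the planned blocks together stay within the budget."""
    return sum(planned_blocks_millions.values()) <= budget_millions


# Hypothetical 100 mm2 die on a node offering 50 MTr per mm2.
budget = transistor_budget_millions(die_area_mm2=100.0, density_mtr_per_mm2=50.0)

# Hypothetical transistor counts (in millions) for three groups of elements.
planned = {"alus": 1200.0, "buffers": 2500.0, "control": 800.0}

print(budget)
print(fits_budget(planned, budget))
```

If the planned blocks exceed the budget, either the design shrinks or the managers go back to negotiating a larger die or a denser node.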

Next comes the creation of an outline

Floorplanning is nothing more than drawing up a scheme in which the basic elements that make up a CPU are laid out in a space. By basic elements we do not mean putting a CPU core here, the cache there, the GPU over there, and so on.

We mean much more basic elements: combinational circuits such as multiplexers, logic gates of different types, ALUs, memory buffers, etc. This is done because hardware description languages require the interconnection between the different elements to be specified, so it is essential to take a pencil and a huge sheet of paper and interconnect the different elements in a diagram.
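The need to spell out every interconnection can be pictured with a tiny netlist. This is only a sketch in Python rather than in a real hardware description language, and the element names (`mux0`, `alu0` and so on) are invented for the example. It checks two things a designer would want before going further: that no connection points at an element that was never declared, and that no declared element is left unconnected.

```python
# A toy netlist: declared elements plus (source, destination) connections.
# Element names are hypothetical; a real design has many thousands of them.

elements = {"mux0", "alu0", "buf0", "and0"}

connections = [
    ("and0", "mux0"),
    ("mux0", "alu0"),
    ("alu0", "buf0"),
]


def used_names(connections):
    """Every element name that appears at either end of a connection."""
    return {name for pair in connections for name in pair}


def undeclared_endpoints(elements, connections):
    """Names used in connections but never declared as elements."""
    return sorted(used_names(connections) - set(elements))


def unconnected_elements(elements, connections):
    """Declared elements that take part in no connection."""
    return sorted(set(elements) - used_names(connections))


print(undeclared_endpoints(elements, connections))
print(unconnected_elements(elements, connections))
```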

The first diagram that is created is not the final one, but it is used to organize the different modules of a CPU into two types of blocks:

The first are called fixed blocks. They include the elements that will not undergo design variations and whose area will remain static, because their nature allows it.
The second, the versatile blocks, are the parts that are going to change, either because their interaction with other parts of the chip requires it or simply because it is not yet known which design will be implemented in the final version.

If the processor's floorplan leaves enough space to improve some elements, the team looks to integrate more advanced solutions for those parts, which are developed in parallel and which we will see in the next section.
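The split between fixed and versatile blocks, and the question of whether there is space left over, can be sketched as a small data structure. The `Block` class, the block names and the areas are all assumptions made up for this example. Real floorplanning tools also track shapes, positions and wiring, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    area_mm2: float
    fixed: bool  # True: area is frozen; False: "versatile", may still change


def spare_area(blocks, die_area_mm2: float) -> float:
    """Area left on the die once every block has been placed."""
    return die_area_mm2 - sum(b.area_mm2 for b in blocks)


def versatile_blocks(blocks):
    """Names of the blocks whose design can still be upgraded."""
    return [b.name for b in blocks if not b.fixed]


# Hypothetical floorplan on a 60 mm2 die.
blocks = [
    Block("memory_buffers", 12.0, fixed=True),
    Block("alu_cluster", 30.0, fixed=False),
    Block("multiplexers", 8.0, fixed=True),
]

print(spare_area(blocks, die_area_mm2=60.0))
print(versatile_blocks(blocks))
```

Any spare area reported here is what could be handed to the versatile blocks for a more advanced design.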

3rd of the design stages of a CPU: definition of the ISA

The next stage is the definition of the set of registers and instructions to be used. In CPUs, where these are inherited from generation to generation, the implementation may seem easy, but adding new instructions and registers means reorganizing something that is already complex enough by itself.

GPUs are different: programs are not shipped as binaries for the GPU's native ISA, because shader code is translated by the driver, so each new generation can change the ISA of its shader units much more freely. If instructions need to be deleted, it can be done. In a CPU it cannot, because deleting an instruction or a register breaks compatibility with existing software.
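The compatibility rule for CPUs boils down to a set relationship: the new instruction set has to contain the old one. The sketch below expresses that with Python sets. The mnemonics are generic placeholders and do not describe any real ISA.

```python
# Backward compatibility as a superset check.
# Mnemonics are illustrative placeholders, not a real instruction set.

def removed_instructions(old_isa: set, new_isa: set) -> set:
    """Instructions that existing software may use but the new design lacks."""
    return old_isa - new_isa


def is_backward_compatible(old_isa: set, new_isa: set) -> bool:
    """True if every old instruction still exists in the new ISA."""
    return old_isa <= new_isa


old_isa = {"ADD", "SUB", "MUL"}
cpu_next = {"ADD", "SUB", "MUL", "FMA"}   # only adds instructions
gpu_next = {"ADD", "MUL", "FMA"}          # drops SUB, acceptable for a shader ISA

print(is_backward_compatible(old_isa, cpu_next))
print(removed_instructions(old_isa, gpu_next))
```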

The parts in charge of processing data are usually the ones of variable size in any processor design, and they are the ones that go through this design stage. Fixed-function elements, in contrast, do not receive these changes.

4th of the design stages of a CPU: testing in FPGAs

We have not reached the final stage yet: first it is necessary to prove that the design is functional. For this purpose, FPGA boards are usually connected to each other in a huge matrix. Each FPGA is programmed, or rather configured, to behave as one element of the overall design of the CPU or GPU being designed.

The idea here is not to get full speed, but to check in slow motion that the design is functional. So we are going to have much higher latencies and much lower speeds than the final processor, but what we are interested in is that all the elements work.
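To give an idea of how a design might be spread over several FPGAs, here is a minimal sketch that packs modules onto boards with a first-fit decreasing heuristic. The heuristic, the module names and the sizes are assumptions chosen for illustration. Actual partitioning tools also weigh the wiring between boards and timing, which this example leaves out.

```python
# Pack design modules onto FPGAs of equal capacity (first-fit decreasing).
# Sizes are in arbitrary "logic units"; names and numbers are hypothetical.

def assign_to_fpgas(module_sizes: dict, fpga_capacity: int) -> list:
    """Return a list of boards, each a list of the module names placed on it."""
    boards = []  # each entry: [remaining_capacity, [module names]]
    by_size = sorted(module_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in by_size:
        if size > fpga_capacity:
            raise ValueError(f"{name} does not fit on a single FPGA")
        for board in boards:
            if board[0] >= size:
                board[0] -= size
                board[1].append(name)
                break
        else:
            # No existing board has room: add another FPGA to the matrix.
            boards.append([fpga_capacity - size, [name]])
    return [names for _, names in boards]


modules = {"alu": 60, "decoder": 50, "cache_ctrl": 40, "mux_tree": 30}
print(assign_to_fpgas(modules, fpga_capacity=100))
```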

5th of the design stages of a CPU: I/O, power and clock

The fifth stage is the creation of the final chip, but we run into an initial problem: we have to design the power distribution across the circuit on the one hand and the distribution of the clock signals on the other. Some of those clocks will be derived by dividing a master clock, while others will have their own clock source.
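The idea of clocks derived from a master clock next to clocks with their own source can be sketched as follows. The domain names, dividers and frequencies are hypothetical, and the model only covers integer division. Real chips typically also use PLLs to multiply a reference clock, which is outside this example.

```python
# Clock domains: some divide the master clock, others bring their own source.
# Domain names and frequencies are illustrative assumptions.

def derived_clock_mhz(master_mhz: float, divider: int) -> float:
    """Frequency of a clock obtained by dividing the master clock."""
    if divider < 1:
        raise ValueError("divider must be a positive integer")
    return master_mhz / divider


def build_clock_domains(master_mhz: float, dividers: dict, independent=None) -> dict:
    """Map each domain name to its frequency in MHz."""
    domains = {name: derived_clock_mhz(master_mhz, d) for name, d in dividers.items()}
    domains.update(independent or {})  # domains with their own clock source
    return domains


domains = build_clock_domains(
    master_mhz=4000.0,
    dividers={"core": 1, "cache": 2, "io": 8},
    independent={"memory_interface": 1600.0},
)
print(domains)
```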

The floor plan seen in the earlier section is taken and enclosed in a square or rectangle divided into quadrants. Each section on the outside of the quadrants is assigned a specific power or I/O function. This is where the perimeter of the chip is designed and where the placement of the different communication interfaces with peripherals and memory is marked.

It is also at this stage that the documentation on the distribution of electrical power is reviewed, along with whether the design of the different elements can reach the expected clock speeds.
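Whether a design can reach its expected clock speed depends on its slowest signal path: the clock period cannot be shorter than the time a signal needs to cross that path. The sketch below applies that relation in simplified form, ignoring setup time, clock skew and voltage effects. The function names and the delays are hypothetical.

```python
# Simplified timing check: the slowest path limits the clock frequency.
# A period of 1000 ps corresponds to 1 GHz. Delays below are made up.

def max_clock_ghz(critical_path_ps: float) -> float:
    """Highest clock (GHz) at which the slowest path still fits in one cycle."""
    if critical_path_ps <= 0:
        raise ValueError("critical path delay must be positive")
    return 1000.0 / critical_path_ps


def meets_target(critical_path_ps: float, target_ghz: float) -> bool:
    """True if the design can run at the target clock."""
    return max_clock_ghz(critical_path_ps) >= target_ghz


print(max_clock_ghz(250.0))
print(meets_target(300.0, target_ghz=4.0))
```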

Processor verification

After the design comes its verification, in which the finished design is validated by a completely different team that looks for design flaws at the physical level. The knowledge of the validation team is very different from that of the design team.

The reason is that the two teams work in very different branches: the validation team focuses on detecting and correcting errors that occur in the physical implementation of a CPU, not in its architecture, which in principle does not change. An equivalent in the world of construction would be a group of workers and contractors who build a prototype of a house or building and detect errors that affect the performance of the final design and that are not visible in the simulation.