This Is How the CPU and GPU Communicate in Gaming and Rendering

Ever since personal computers appeared, their main way of communicating with us has been the screen, so many times per second the PC has to generate an image that presents information to the user. But how do the CPU and GPU communicate so that the latter can generate the next image to be displayed?

CPU-GPU Communication: The Ring Buffer

The CPU generates a list of graphics commands that tells the GPU how to render the next frame. A new list is created for every frame, and it is also known as a display list.

The application builds this list by calling a graphics API. These APIs are programming-level abstractions of the GPU that help applications communicate with the graphics card. The API's job is to turn the list of tasks the application sends it into something the GPU understands. That requires another participant, the GPU driver: a program that translates those commands into the code that our particular GPU can execute.
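As a rough illustration of that translation step, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the `ApiCommand` type, the opcode table and the packet layout are hypothetical, and real drivers use vendor-specific formats that are far more involved.

```python
from dataclasses import dataclass


@dataclass
class ApiCommand:
    """Hypothetical API-level command, as an application might record it."""
    name: str
    args: tuple


# Hypothetical opcode table for an imaginary GPU.
OPCODES = {"set_shader": 0x01, "bind_vertex_buffer": 0x02, "draw": 0x03}


def driver_translate(commands):
    """Translate API-level commands into flat integer packets."""
    packets = []
    for cmd in commands:
        opcode = OPCODES[cmd.name]
        # Assumed packet layout: opcode, argument count, then the arguments.
        packets.append([opcode, len(cmd.args), *cmd.args])
    return packets


# One frame's worth of commands recorded by the application.
frame_commands = [
    ApiCommand("set_shader", (7,)),
    ApiCommand("bind_vertex_buffer", (42,)),
    ApiCommand("draw", (0, 36)),
]

display_list = driver_translate(frame_commands)
print(display_list)  # [[1, 1, 7], [2, 1, 42], [3, 2, 0, 36]]
```

The application only ever deals with named commands; the numeric packets are what would end up in memory for the GPU to read.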

Once the list has been translated into a form the GPU can understand, it is stored in a region of system RAM. The GPU has a special unit, a DMA (direct memory access) engine, that lets it read not only VRAM but also system RAM, and it uses that unit to fetch the display list generated by the CPU.

The GPU treats the memory area that holds the display list as a ring: when it reaches the last memory address assigned to the ring, reading the next instruction automatically resets the read position to 0, and so on. In other words, it always walks through the same memory addresses. In this model each complete lap of the ring is a frame, so when the GPU command processor reaches the end of the ring buffer it starts again from zero, and a new frame begins.
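The wrap-around behaviour can be sketched with a small ring buffer in Python. This is only a toy model under stated assumptions: the `RingBuffer` class, its field names and the command strings are invented for the example, and a real ring lives in memory shared between the CPU and the GPU rather than in a Python list.

```python
class RingBuffer:
    """Toy ring buffer: the CPU writes commands, the GPU reads them."""

    def __init__(self, size):
        self.slots = [None] * size
        self.size = size
        self.write_ptr = 0  # advanced by the CPU (producer)
        self.read_ptr = 0   # advanced by the GPU (consumer)
        self.count = 0      # number of commands not yet read

    def write(self, command):
        if self.count == self.size:
            raise BufferError("ring is full; the CPU must wait for the GPU")
        self.slots[self.write_ptr] = command
        # The modulo makes the pointer wrap back to 0 after the last slot.
        self.write_ptr = (self.write_ptr + 1) % self.size
        self.count += 1

    def read(self):
        if self.count == 0:
            return None  # nothing to do yet
        command = self.slots[self.read_ptr]
        self.read_ptr = (self.read_ptr + 1) % self.size
        self.count -= 1
        return command


ring = RingBuffer(4)

# Frame 1 fills the ring exactly once.
for cmd in ["clear", "draw_a", "draw_b", "present"]:
    ring.write(cmd)
frame_1 = [ring.read() for _ in range(4)]

# Both pointers have wrapped to 0, so frame 2 reuses the same slots.
for cmd in ["clear", "draw_c", "draw_d", "present"]:
    ring.write(cmd)
frame_2 = [ring.read() for _ in range(4)]

print(frame_1)
print(frame_2)
```

The full-buffer check is a design choice of this sketch: without it the writer would overwrite commands the reader has not consumed yet.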

The GPU Command Processor

The command processor (CP) is the conductor of the GPU. It sits at the heart of every GPU, regardless of architecture, and is in charge of controlling the GPU and making sure things get done in the right order and with the right resources.

The first thing it does is copy the list of commands from the ring buffer into a memory close to the command processor, so that it can work with the list as quickly as possible. Once that is done, it begins to generate the frame, coordinating the different units of the GPU as it works through the list of instructions.

The CP handles the list of commands generated by the CPU as a FIFO (First In, First Out) queue: the element that enters the queue first is the first to leave, and the last to enter is the last to leave.
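A minimal Python sketch of that FIFO behaviour follows, using `collections.deque` from the standard library. The `command_processor` function and the command names are hypothetical; the only point is that the execution order matches the submission order.

```python
from collections import deque


def command_processor(display_list):
    """Consume commands strictly in the order they were submitted."""
    # Copy the list into a local queue, standing in for the memory
    # close to the command processor.
    local_queue = deque(display_list)
    executed = []
    while local_queue:
        # popleft() removes the oldest entry: first in, first out.
        command = local_queue.popleft()
        executed.append(command)
    return executed


submitted = ["clear", "draw_terrain", "draw_player", "present"]
executed = command_processor(submitted)
print(executed == submitted)  # True
```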

Graphics vs Computing

With the arrival of DirectX 11, GPUs began to run small programs unrelated to graphics rendering, called compute shaders. The problem in DX11 is that, although the CPU could generate several different contexts, at the GPU level there was only one gigantic ring, so compute tasks could only run if the GPU finished rendering in time. Since DX12 it has been possible to use several contexts, and therefore several rings, simultaneously.

Two different types of rings are usually used, one for compute and one for graphics. A single graphics ring is the common case, but if we want to render in stereo, for virtual reality for example, it is useful to have two graphics command rings.

Compute rings are different: they are usually made up of several sub-rings, and their execution is completely asynchronous with respect to the rendering of the image, so the beginning and end of a compute list are independent of the beginning and end of the frame.

When compute rings work in parallel with graphics rings, there is not just one command processor but several. The graphics ring always has priority, unless the GPU is headless (it has no display output) and is therefore used for tasks other than generating graphics, such as scientific computing.
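The interplay between a graphics ring and several compute rings can be sketched as a tiny scheduler in Python. This is a deliberately simplified model and every name in it is an assumption: the `schedule` function and the idea of a fixed number of execution slots per tick are invented for the example, and real hardware arbitrates between queues in far more complex ways.

```python
from collections import deque


def schedule(graphics_ring, compute_rings, slots_per_tick=2):
    """Fill each tick's slots, serving the graphics ring first."""
    graphics = deque(graphics_ring)
    compute = [deque(ring) for ring in compute_rings]
    timeline = []
    while graphics or any(compute):
        tick = []
        # The graphics ring has priority: it claims a slot whenever
        # it has work pending.
        if graphics:
            tick.append(graphics.popleft())
        # Compute rings fill whatever slots are left, independently
        # of where the frame starts or ends.
        for ring in compute:
            if len(tick) == slots_per_tick:
                break
            if ring:
                tick.append(ring.popleft())
        timeline.append(tick)
    return timeline


timeline = schedule(
    graphics_ring=["g0", "g1"],
    compute_rings=[["c0", "c1"], ["d0"]],
)
print(timeline)  # [['g0', 'c0'], ['g1', 'c1'], ['d0']]
```

On a headless GPU the graphics ring would simply be empty, and in this model all slots would go to compute work.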