NVIDIA RTX IO: What is it and How Does it Work on Graphics Cards?

October 4, 2020, by Matt Mills

One of the technologies NVIDIA introduced alongside the latest generation of GeForce graphics cards is RTX IO, which will be available on the RTX 20 and RTX 30 series. Thanks to this technology, these cards can access an SSD connected over PCI Express as if it were the graphics card’s own memory, with little involvement from the CPU. How does it work?

RTX IO lets the GPU access the SSD regardless of the CPU architecture installed in the system, because the GPU itself is responsible for fetching the data from the SSD, something that had not been implemented before. At least not on NVIDIA GPUs: we had already seen a less advanced implementation in AMD’s HBCC, integrated into its Vega graphics cards.

RTX IO is the hardware-level implementation of DirectStorage, the Microsoft API that accompanies DirectX 12 Ultimate and gives the GPU access to a memory space beyond video RAM. With it, the GPU can request specific data from an SSD connected to a PCI Express port.

Why does the CPU stay out of the process with NVIDIA RTX IO?

To understand why the CPU stays out of the process, we first have to understand how the GPU accesses system memory. Any GPU, regardless of its architecture, can access two different memory pools:

Its local memory, which is the memory included on the graphics card (VRAM).
System memory (where the CPU stores its data).

To access the second, it uses one or more DMA units that communicate with the system’s RAM over the PCI Express bus.

Do you remember SLI and Crossfire systems, where we had two cards in the same machine? Well, the mechanism for communicating with an SSD connected to a PCI Express port would be exactly the same.

The reason is that, even though a PC has several PCIe slots, at the level of the I/O controller (the southbridge) they all converge on the same controller. This allows every device connected to the PCI Express ports to send data to the others, including the SSD (as long as it is connected to the same PCIe controller).
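The difference between the two data paths can be sketched with a toy model. This is not how a PCIe controller is actually programmed. The class names (`Device`, `PcieController`) and the copy counter are hypothetical and exist only to show that a peer-to-peer copy between two devices on the same controller does not need a staging copy in system RAM.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """A hypothetical PCIe endpoint (GPU, SSD, ...) with its own storage."""
    name: str
    data: dict = field(default_factory=dict)


class PcieController:
    """Toy model of the single controller that all PCIe ports converge on."""

    def __init__(self):
        self.devices = {}
        self.cpu_copies = 0  # how many times data was staged in system RAM

    def attach(self, device):
        self.devices[device.name] = device

    def peer_transfer(self, src, dst, key):
        """Device-to-device copy: both endpoints must sit on this controller."""
        if src not in self.devices or dst not in self.devices:
            raise ValueError("both devices must be attached to the same controller")
        self.devices[dst].data[key] = self.devices[src].data[key]

    def cpu_mediated_transfer(self, src, dst, key, system_ram):
        """Classic path: the data is staged in system RAM before reaching dst."""
        system_ram[key] = self.devices[src].data[key]  # SSD -> RAM
        self.cpu_copies += 1
        self.devices[dst].data[key] = system_ram[key]  # RAM -> GPU
```

In this sketch, `peer_transfer("ssd", "gpu", ...)` leaves `cpu_copies` at zero, while the CPU-mediated path increments it once per asset.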

The SSD as an extension of the GPU memory

When the GPU accesses the SSD, it treats it as if it were RAM, so every instruction that accesses memory simply requests the specific address where the data is located. Through a series of completely transparent mechanisms, when the requested data is on the SSD and not in memory, it is looked up on the SSD and copied directly to a region of RAM that serves as a cache for the SSD.
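The mechanism just described resembles demand paging, and a minimal sketch can make it concrete. Everything here is an assumption made for illustration: the 4 KB page size, the `SsdBackedMemory` class and its `faults` counter are hypothetical, and the real hardware and driver behaviour is not documented in this article.

```python
PAGE_SIZE = 4096  # bytes per page; an assumed value for this sketch


class SsdBackedMemory:
    """Toy address space: reads hit a RAM cache first, then fall back to the SSD."""

    def __init__(self, ssd_image: bytes):
        self.ssd = ssd_image   # stands in for the data stored on the SSD
        self.ram_cache = {}    # page number -> page bytes held in RAM
        self.faults = 0        # number of times the SSD had to be read

    def _page(self, page_no: int) -> bytes:
        if page_no not in self.ram_cache:
            # Miss: fetch the whole page from the SSD and keep it in RAM.
            start = page_no * PAGE_SIZE
            self.ram_cache[page_no] = self.ssd[start:start + PAGE_SIZE]
            self.faults += 1
        return self.ram_cache[page_no]

    def read(self, address: int, length: int) -> bytes:
        """Read `length` bytes at `address`, as if the SSD were plain memory."""
        out = bytearray()
        end = min(address + length, len(self.ssd))
        while address < end:
            page_no, offset = divmod(address, PAGE_SIZE)
            chunk = self._page(page_no)[offset:offset + (end - address)]
            out += chunk
            address += len(chunk)
        return bytes(out)
```

The caller only ever sees `read(address, length)`. Whether the bytes came from the RAM cache or had to be pulled from the SSD is hidden, which is the "transparent" part of the description.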

This gives the system a much larger virtual pool of memory and lets it go beyond the limits of VRAM to access certain data. The GPU only needs to request the data in advance so that it is copied to RAM.

For example, in an open-world game, when we move into a new area, the textures and other data that are no longer needed can be removed from RAM and new data loaded from the SSD. The Unreal Engine demo released a few months ago, for instance, used a pool of “just” 768 MB to copy data from the SSD to graphics RAM.
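A fixed-size streaming pool like the one in that example can be sketched as follows. The least-recently-used eviction policy is an assumption on our part, since the article does not say which policy the demo uses, and the `StreamingPool` class, the asset names and their sizes are all hypothetical.

```python
from collections import OrderedDict


class StreamingPool:
    """Fixed-size pool that evicts the least recently used asset when full."""

    def __init__(self, budget_mb: int, load_from_ssd):
        self.budget_mb = budget_mb
        self.load_from_ssd = load_from_ssd  # callable: asset name -> size in MB
        self.resident = OrderedDict()       # asset name -> size in MB
        self.used_mb = 0

    def request(self, name: str) -> None:
        if name in self.resident:
            self.resident.move_to_end(name)  # mark as recently used
            return
        size = self.load_from_ssd(name)
        if size > self.budget_mb:
            raise ValueError(f"{name} does not fit in the pool")
        # Drop the oldest assets until the new one fits.
        while self.used_mb + size > self.budget_mb:
            _, freed = self.resident.popitem(last=False)
            self.used_mb -= freed
        self.resident[name] = size
        self.used_mb += size


# Hypothetical asset sizes in MB, used only to exercise the pool.
ASSET_SIZES_MB = {"forest": 400, "village": 300, "cave": 200}
```

With a 768 MB budget, requesting "forest", "village", "forest" again and then "cave" evicts "village", because it is the asset that has gone unused the longest.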

Real-time data decompression with a modest CPU

One of the things that comes with RTX IO, and that has been present since the RTX 20 series, is a real-time data decompression unit.

This unit takes compressed data from the SSD as input, decompresses it on the fly and sends the decompressed data to graphics memory.
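To illustrate what "on the fly" means, here is a small CPU-side sketch that decompresses data chunk by chunk as it arrives, instead of waiting for the whole file. It uses Python's standard `zlib` module purely as a stand-in: the article does not say which compression format RTX IO handles, and the helper names `stream_decompress` and `read_in_chunks` are hypothetical.

```python
import zlib


def stream_decompress(compressed_chunks):
    """Yield decompressed bytes as each compressed chunk arrives."""
    decompressor = zlib.decompressobj()
    for chunk in compressed_chunks:
        data = decompressor.decompress(chunk)
        if data:
            yield data
    tail = decompressor.flush()  # emit anything still buffered
    if tail:
        yield tail


def read_in_chunks(blob: bytes, chunk_size: int):
    """Stand-in for the SSD delivering a file piece by piece."""
    for i in range(0, len(blob), chunk_size):
        yield blob[i:i + chunk_size]
```

Joining the pieces produced by `stream_decompress(read_in_chunks(compressed, 64))` gives back the original data, with output becoming available before the last chunk has been read.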

The unit decompresses data at a speed that the CPU could only match by dedicating a large number of cores to the same task.
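The reasoning behind "a large number of cores" is simple arithmetic, sketched below. The function assumes perfect scaling across cores, which real workloads rarely achieve, and the throughput figures in the comment are made-up illustrative values, not measurements from NVIDIA or from this article.

```python
import math


def cores_needed(target_gb_per_s: float, per_core_gb_per_s: float) -> int:
    """Cores required to match a target throughput, assuming perfect scaling."""
    if per_core_gb_per_s <= 0:
        raise ValueError("per-core throughput must be positive")
    return math.ceil(target_gb_per_s / per_core_gb_per_s)


# Illustrative, assumed numbers: a 14 GB/s stream with each core
# decompressing 0.5 GB/s would tie up cores_needed(14.0, 0.5) cores.
```

Whatever the real figures are, the point is the same: the faster the SSD, the more CPU cores a software decompressor would take away from the game itself.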

With this, NVIDIA promises better performance thanks to its RTX IO technology, which should help narrow the performance gap between very powerful CPUs and much more modest ones.