Performance Limitations of NVMe SSDs in Games

NVMe SSDs bring a number of advantages to games on their own, above all their enormous bandwidth, which delivers data fast enough to change how game levels are designed. But are NVMe SSDs really the Holy Grail of gaming performance, or do they have limitations?

Solid-state drives promise to increase performance in games, but that claim is not entirely true: a number of elements in both the hardware and the software limit what SSDs can deliver.

SSD limitations for gaming

We have compiled the elements that affect SSD performance in games. Although SSDs will replace the hard disk, they have limitations of their own and are not the panacea that many advertise.

Little data to transmit from the SSD

Upgrading any component of a PC brings some increase in performance, and installing an SSD does reduce transfer times compared to a conventional hard drive. But do games make good use of the hardware? The reality is, no.

The vast majority of applications run from RAM, as the hard drive is too slow in both latency and bandwidth to serve as working memory. The same is true of NAND flash, although the gap is much smaller, so data still has to be copied to RAM. The difference? NVMe SSDs are several dozen times faster than hard drives and can therefore transmit far more data in the same time.

It is the program code that manages sending data from the disk to RAM, so no matter how fast the link between the storage unit and RAM is, a low data volume leaves that speed unused. For example, games designed around streaming data from an optical drive will still show loading screens on an SSD, however brief.
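
To make the point about low data volumes concrete, here is a minimal sketch of load time as transfer time plus a fixed cost that the drive cannot speed up. The bandwidth figures and the two-second overhead are illustrative assumptions, not measurements, and `load_time_seconds` is a hypothetical helper.

```python
def load_time_seconds(data_mb: float, bandwidth_mb_s: float, fixed_overhead_s: float) -> float:
    """Transfer time plus a fixed cost that does not depend on the drive."""
    return data_mb / bandwidth_mb_s + fixed_overhead_s


# Illustrative numbers only (assumptions, not benchmarks).
HDD_MB_S = 150.0     # assumed hard drive sequential read speed
NVME_MB_S = 3500.0   # assumed NVMe SSD sequential read speed
OVERHEAD_S = 2.0     # assumed CPU-side work per load (setup, parsing, and so on)

for data_mb in (50.0, 5000.0):
    hdd = load_time_seconds(data_mb, HDD_MB_S, OVERHEAD_S)
    nvme = load_time_seconds(data_mb, NVME_MB_S, OVERHEAD_S)
    print(f"{data_mb:>6.0f} MB: HDD {hdd:6.2f} s, NVMe {nvme:6.2f} s, speedup {hdd / nvme:4.1f}x")
```

Under these assumptions the small load barely improves, because the fixed cost dominates, while the large load benefits from most of the extra bandwidth.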

Data decompression is not free

The big problem with SSDs, whether SATA or NVMe, is that their storage costs much more per gigabyte than that of a conventional hard drive. So everything points to the adoption of units that compress and decompress data in real time. These units must be capable of decompressing several gigabytes of data per second.

If you have ever installed one of those pirated versions of heavyweight programs found on the internet, you will have seen that they usually come extremely compressed and need a lot of CPU power to install. With that in mind, think about the computational cost of decompressing that amount of data every second.

We are talking about sacrificing several whole cores just for this task. The only way for SSDs to catch up with HDDs in capacity is to use real-time compression and decompression mechanisms that increase their effective capacity. That moment has not yet come, but we are sure that, as has already happened in consoles, future CPUs from Intel and AMD will incorporate these units.
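
The cost in cores can be estimated with a rough sketch like the one below. It measures single-threaded decompression throughput with Python's `zlib` and divides an assumed target rate by it. zlib is only a stand-in here, not the codec any particular game or console uses; the sample data, the 5000 MB/s target, and both helper functions are assumptions for illustration.

```python
import math
import time
import zlib


def measure_decompress_mb_s(payload: bytes, repeats: int = 5) -> float:
    """Return single-threaded zlib decompression throughput in MB/s of output."""
    compressed = zlib.compress(payload, 6)
    restored = b""
    start = time.perf_counter()
    for _ in range(repeats):
        restored = zlib.decompress(compressed)
    elapsed = max(time.perf_counter() - start, 1e-9)
    if restored != payload:
        raise ValueError("round trip failed")
    return len(payload) * repeats / elapsed / 1e6


def cores_needed(target_mb_s: float, per_core_mb_s: float) -> int:
    """Whole cores required to sustain the target rate, assuming perfect scaling."""
    return math.ceil(target_mb_s / per_core_mb_s)


# Hypothetical stand-in for game assets: about 4 MB of repetitive bytes.
sample = (b"texture-block-" + bytes(range(256))) * 15_000
per_core = measure_decompress_mb_s(sample)
TARGET_MB_S = 5000.0  # assumed streaming rate to sustain, not a measured figure

print(f"one core: {per_core:.0f} MB/s, cores for {TARGET_MB_S:.0f} MB/s: "
      f"{cores_needed(TARGET_MB_S, per_core)}")
```

The result depends heavily on the machine, the codec, and how compressible the data is, so treat the printed figure as an order-of-magnitude indication only.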

SSD limitations due to power consumption

The third problem has to do with energy consumption. The PCI Express interface doubles its speed with each generation, but to remain backwards compatible it keeps the same pins. What does this mean? Power consumption rises as the bandwidth increases.

The most direct way to get there is to simply double the bandwidth of the existing interface, but doing so almost quadruples the energy consumed per bit sent. With every new PCI Express standard, the challenge for engineers is not doubling the bandwidth as such, but creating a compatible interface that keeps consumption within certain limits.

How does this affect NVMe SSDs? In low-power laptops we can find interfaces whose bandwidth is capped in order to reduce the energy consumed by NVMe data transfers. So once these drives replace traditional hard drives in such computers, their NVMe SSDs will perform below those of desktop PCs, and the games that run on them will face performance limitations despite using an SSD.
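
As a rough illustration of what a capped link costs, the sketch below computes theoretical one-direction PCI Express bandwidth from the per-lane transfer rate and the 128b/130b encoding used by generations 3.0 to 5.0. Modeling the cap as fewer lanes is an assumption for the sake of the example, and `link_bandwidth_gb_s` is a hypothetical helper.

```python
# Per-lane raw rates in GT/s for PCIe 3.0 to 5.0; all three use 128b/130b encoding.
GT_PER_S = {"3.0": 8.0, "4.0": 16.0, "5.0": 32.0}
ENCODING = 128 / 130


def link_bandwidth_gb_s(gen: str, lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    return GT_PER_S[gen] * ENCODING * lanes / 8  # divide by 8: bits to bytes


for gen in GT_PER_S:
    full = link_bandwidth_gb_s(gen, 4)    # typical desktop M.2 slot width
    capped = link_bandwidth_gb_s(gen, 2)  # assumed capped laptop link
    print(f"PCIe {gen}: x4 = {full:5.2f} GB/s, x2 = {capped:5.2f} GB/s")
```

In this model a halved link gives up exactly one generation of progress: a PCIe 4.0 x2 link has the same ceiling as a PCIe 3.0 x4 link.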

SSD memory channels as limitations

The fourth limitation of SSDs in games has to do with the memory channels between the flash controller and the NAND flash chips. As with RAM, the number of channels determines how many requests from PC components can be served at the same time. So a low number of channels means that some of the copy requests to and from the SSD incur added latency while they wait their turn.
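
The waiting described above can be sketched with a toy model: a burst of requests arrives at once, each one occupies a channel for a fixed time, and the rest queue behind it. The 100-microsecond service time and the `average_wait_us` helper are assumptions for illustration; real flash controllers schedule requests in far more elaborate ways.

```python
def average_wait_us(num_requests: int, num_channels: int, service_time_us: float = 100.0) -> float:
    """Average queueing delay when all requests arrive together.

    Request i is served in round i // num_channels, so it waits that many
    full service times before a channel becomes free.
    """
    waits = [(i // num_channels) * service_time_us for i in range(num_requests)]
    return sum(waits) / num_requests


for channels in (4, 8, 16):
    print(f"{channels:2d} channels: average wait {average_wait_us(16, channels):6.1f} us")
```

With 16 simultaneous requests, the average wait in this model drops from 150 microseconds on 4 channels to 50 on 8 and to zero on 16.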

The number of channels on an SSD corresponds to the number of NAND flash chips on the board, so a drive whose flash controller has few channels will never perform as well. Since more chips mean higher costs and storage is already expensive, it is easy to make the mistake of assuming that two NVMe SSDs with the same capacity offer the same performance.

One way to optimize access is to distribute sequential data across several chips of the NVMe SSD. If, for example, we look for the string “1234”, then on a 4-channel SSD each digit may be stored on a different memory chip so that all the data can be read at the same time. The problem with this scheme is that it serves only a single client, and therefore adds latency for the rest of the components in the PC.
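
The “1234” example can be sketched as round-robin striping. The function names are hypothetical and a string stands in for the data blocks; a real controller stripes pages or blocks, not characters.

```python
def stripe(data: str, num_channels: int) -> list[str]:
    """Distribute consecutive symbols round-robin across the channels."""
    return [data[ch::num_channels] for ch in range(num_channels)]


def read_back(chips: list[str]) -> str:
    """Interleave the per-chip contents to rebuild the original sequence."""
    out = []
    for step in range(max(len(chip) for chip in chips)):
        for chip in chips:
            if step < len(chip):
                out.append(chip[step])
    return "".join(out)


chips = stripe("1234", 4)
print(chips)             # one digit per chip
print(read_back(chips))  # the original string again
```

Each pass of the outer loop in `read_back` corresponds to one parallel read step, so a 4-symbol string on 4 channels needs a single step, during which every channel is busy serving the same client.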

The GPU is also a bottleneck

With the arrival of DirectStorage, the GPU becomes a client of the SSD, and therefore of its flash controller, which increases the number of requests the controller receives. If the SSD's memory channels are already a problem with several requests from the CPU alone, imagine adding the GPU on top. This is the fifth and last of the SSD limitations in games.

Many SSDs are not designed to serve, at the same level of performance, systems in which both the CPU and the GPU make requests; they are designed for the CPU alone. Even then, many applications still run without problems on four cores, some even on two. As the average number of cores increases and the GPU is added, many flash controllers will lose performance, not for lack of bandwidth, but because they cannot handle that many requests to send and receive data quickly enough.

So it will be necessary to create flash controllers that are not faster in terms of bandwidth, but able to support more memory channels and handle a large number of simultaneous requests.