It is well known that SSDs often suffer a sharp drop in performance when they have to work with compressed files. In this article we are going to explain why this happens and how compressed files work to cause so many performance problems for our SSDs.
As you know, the allocation unit size that we choose when formatting a storage drive (SSD or not) influences how, and how fast, the device manages files. In the same way, how the compression has been performed determines how much effort it costs the SSD to manage those files.
With that said, let’s start at the beginning, because to understand what causes these performance issues on SSDs with compressed files, we must first understand how file compression works.
How file compression works
File compression is what is known as “lossless compression”. Unlike the “lossy compression” used, for example, for video or music, reducing the size of the files does not mean losing quality. Lossy compression literally reduces the quality of the audio or video by discarding parts of it, and if we did that to an ordinary file it would stop working.
Therefore, file compression is always lossless, and its purpose is to make files take up less space, often by grouping several of them into a single archive.
To explain how this compression is performed, which to some may seem like magic, we are going to give a very basic example. Imagine that the files are made of colored blocks, and that compressing them means merging blocks of the same color into a single block as many times as possible. The ideal is to end up with only one block of each color, but often that is not possible. The image below illustrates the idea.
Obviously these “disappeared” blocks are recorded, so that when we want to decompress the file we know exactly where they were and what they contained, and can restore them to their original form.
Here is another example, also with color blocks, but in the ideal case in which only one block of each color remains. We have 10 blocks: 2 blue, 3 red and 5 yellow. After compressing them we would have only three blocks, one of each color, each with an index indicating what was originally there.
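To make the idea of “one block per color plus an index” more concrete, here is a minimal Python sketch. The order of the ten blocks and the function names `compress_blocks` and `restore_blocks` are our own assumptions for illustration; real compressors are far more elaborate.

```python
# Hypothetical arrangement of the 10 blocks: 2 blue, 3 red and 5 yellow.
blocks = ["blue", "red", "yellow", "yellow", "red",
          "blue", "yellow", "red", "yellow", "yellow"]


def compress_blocks(blocks):
    """Keep one entry per color, with an index of the positions it occupied."""
    index = {}
    for position, color in enumerate(blocks):
        index.setdefault(color, []).append(position)
    return index


def restore_blocks(index):
    """Put every color back in its original positions."""
    total = sum(len(positions) for positions in index.values())
    restored = [None] * total
    for color, positions in index.items():
        for position in positions:
            restored[position] = color
    return restored


index = compress_blocks(blocks)
print(len(index))                       # three entries, one per color
print(restore_blocks(index) == blocks)  # the original layout is recovered
```

Note that the index itself takes up space, so this toy scheme only illustrates the principle of recording where each block was. It does not show an actual saving.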
Yet another example. Imagine that we have a file whose content is as follows:
bbbbbuuuuuuuuuaaaaaa
That same compressed file would look like this:
b5u9a6
Obviously, the file would be much smaller and take up less space. This is how compression works.
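The letter example above is a form of what is usually called run-length encoding. A minimal Python sketch of it could look like the following. The function names are ours, and the sketch assumes that the text contains no digits, since digits are used to store the counts.

```python
import re
from itertools import groupby


def rle_encode(text: str) -> str:
    """Replace each run of identical characters with the character and its count."""
    return "".join(f"{char}{len(list(run))}" for char, run in groupby(text))


def rle_decode(encoded: str) -> str:
    """Expand each character/count pair back into the original run."""
    pairs = re.findall(r"(\D)(\d+)", encoded)
    return "".join(char * int(count) for char, count in pairs)


original = "b" * 5 + "u" * 9 + "a" * 6
compressed = rle_encode(original)
print(compressed)                          # b5u9a6
print(rle_decode(compressed) == original)  # True
```

This scheme only pays off when the data contains long runs. A text without repetition, such as `abc`, would become `a1b1c1`, which is larger than the original.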
Why do SSDs lose performance with compressed files?
This is where the file allocation size we discussed at the beginning comes into play again. Compressing files means reducing their size and also adding an index, a kind of journal that records what each file looked like before it was compressed.
SSDs already have problems of their own with small files, so imagine what happens when we put in front of them a container with many very small files and, in addition, to decompress them they have to consult an index constantly just to know what the files are and what they contain. This is the main reason why SSDs lose a lot of performance when they have to handle compressed files.
Take a look at the following ATTO benchmark screenshot as an example. Here we can see that with small files the SSD performs very poorly, but as the files increase in size, performance grows considerably.
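One simple way to picture why small transfers are so slow is to assume that every request pays a fixed overhead before any data moves. The Python sketch below uses that simplified model. The 20 microsecond overhead and the 3200 MB/s peak speed are assumed values chosen for illustration, not measurements, and real SSDs overlap many requests at once, so actual benchmark curves will differ.

```python
def effective_throughput_mb_s(transfer_kib: float,
                              peak_mb_s: float = 3200.0,
                              overhead_us: float = 20.0) -> float:
    """Throughput of one request at a time under a fixed per-request overhead.

    peak_mb_s and overhead_us are hypothetical values, not measured ones.
    """
    transfer_bytes = transfer_kib * 1024
    transfer_time_s = transfer_bytes / (peak_mb_s * 1_000_000)
    total_time_s = overhead_us / 1_000_000 + transfer_time_s
    return transfer_bytes / total_time_s / 1_000_000


for size_kib in (4, 64, 1024, 65536):
    speed = effective_throughput_mb_s(size_kib)
    print(f"{size_kib:>6} KiB -> {speed:7.0f} MB/s")
```

In this model the fixed overhead dominates small transfers, and the result only approaches the peak figure once the transfers become large.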
Another example comes from the AS SSD File Compression Benchmark, which shows precisely how the SSD behaves when it has to handle compressed files. We are talking about a PCIe NVMe SSD rated at 3200 MB/s for reading and 2800 MB/s for writing, with algorithms that favor better performance with compressed files, and even so, performance is severely penalized.
This is even more noticeable in SATA 3 SSDs, where the interface further limits the bandwidth the device has available to manage files.