How a GPU's Bus Affects the Amount of VRAM

The bus of a GPU

All GPUs need VRAM to work, and the most widely used types today are GDDR6 and HBM2e. In this post we explain how the memory bus affects the amount of VRAM installed on a graphics card, covering the cases of both GDDR6 and HBM2.

If you've ever wondered why low-end or mid-range graphics cards don't have as much VRAM as high-end ones, the explanation is simple: the width of the memory bus determines the number of memory chips on the board. If you want a more detailed explanation, read on.

Bus and VRAM capacity with GDDR6 memory

GDDR6 is a dual-channel memory: each chip has a 32-bit bus that is actually two 16-bit buses working in parallel, allowing two simultaneous memory accesses. This means that each GDDR6 interface on the GPU must be at least 32 bits wide (2 × 16 bits), and the total bus is organized in 32-bit increments.

If a GPU has a 64-bit bus, it will carry two GDDR6 memory chips; a 128-bit bus means 4 chips, 192 bits means 6 chips, 256 bits 8 chips, 320 bits 10 chips, 384 bits 12 chips, and so on.
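The rule above can be sketched in a few lines of Python; the function name is my own, and the only assumption is the one stated in the article, namely one 32-bit chip per 32 bits of bus:

```python
def gddr6_chip_count(bus_width_bits: int) -> int:
    """Number of GDDR6 chips needed to populate a bus of the given width."""
    if bus_width_bits % 32 != 0:
        raise ValueError("GDDR6 buses are organized in 32-bit increments")
    return bus_width_bits // 32

# The configurations listed above:
for bus in (64, 128, 192, 256, 320, 384):
    print(f"{bus}-bit bus -> {gddr6_chip_count(bus)} chips")
```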

[Image: GDDR6 buses]

Obviously, as the memory interface widens it occupies more of the GPU's perimeter and the chip becomes larger. So if we want to add more VRAM capacity we have to widen the memory bus, which means increasing the number of interfaces and thereby the periphery of the chip.

GDDR6 x8 mode

However, GDDR6 has a mode called x8, in which two chips share the two channels alternately: the first chip takes the first 8 bits of each channel and the other 8 bits go to the second chip. This technique has long been used in GDDR memory and is a way to increase VRAM capacity without increasing the complexity of the memory interface, though it does not increase bandwidth either.

This mode is used in the RTX 3090, allowing the NVIDIA card to carry 24 GB of memory using 24 chips on the board without needing a 768-bit bus for it. And yes, we have not forgotten that it uses GDDR6X; apart from the PAM-4 interface, GDDR6 and GDDR6X work exactly the same in this respect.
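A minimal sketch of the capacity arithmetic, assuming the x8 behavior described above (two chips per 32-bit interface) and 1 GB per GDDR6X chip for the RTX 3090 example; the function name is hypothetical:

```python
def gddr6_capacity_gb(bus_width_bits: int,
                      chip_capacity_gb: int,
                      x8_mode: bool = False) -> int:
    """VRAM capacity for a given bus width and per-chip capacity."""
    chips = bus_width_bits // 32       # one chip per 32-bit interface
    if x8_mode:
        chips *= 2                     # x8: two chips share each interface
    return chips * chip_capacity_gb

# RTX 3090: 384-bit bus, 1 GB chips, x8 mode -> 24 chips, 24 GB
print(gddr6_capacity_gb(384, 1, x8_mode=True))
```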

Bus and VRAM capacity with HBM memory

[Image: HBM interposer render]

These memories, because they are part of a 2.5D IC configuration, with an interposer in the middle and through-silicon vias, work differently and may seem somewhat complex.

First of all, bear in mind that each HBM interface is 1024 bits wide, but since it communicates with the GPU vertically through the interposer, it does not occupy the perimeter space that a 1024-bit GDDR6 bus would. Each interface corresponds to one HBM memory stack, and the arrangement is as follows:

[Image: HBM2 buses]

Without the interposer, HBM2 memory would not be possible, since the interposer is the part in charge of routing the signals to the different chips in the stack; HBM memory is not made up of a single chip but of several stacked ones.

Standard HBM uses 4 chips per stack. To communicate with them, each 1024-bit interface is divided into 8 channels of 128 bits each, assigning 2 channels to each chip in the stack. Currently each memory chip in an HBM stack has a capacity of 2 GB, so this gives 8 GB per stack.
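The stack arithmetic above can be sketched as follows; the defaults match the figures given in the article, and the function name is my own:

```python
def hbm_stack(interface_bits: int = 1024,
              channel_bits: int = 128,
              channels_per_chip: int = 2,
              chip_capacity_gb: int = 2):
    """Return (channels, chips, capacity in GB) for one HBM stack."""
    channels = interface_bits // channel_bits    # 1024 / 128 = 8 channels
    chips = channels // channels_per_chip        # 8 / 2 = 4 chips per stack
    capacity = chips * chip_capacity_gb          # 4 * 2 GB = 8 GB per stack
    return channels, chips, capacity

channels, chips, capacity = hbm_stack()
print(f"{channels} channels, {chips} chips, {capacity} GB per stack")
```

The same function covers the low-cost variant mentioned below: with `interface_bits=512` it yields 2 chips and 4 GB per stack.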

Of course, the HBM memory bus can also be narrowed; for example, a few years ago a low-cost type of HBM memory was proposed with a 512-bit interface and therefore only 2 chips per stack.

How the memory bus relates to the rest of the GPU's internal components

Another relationship inside a GPU is between its last-level cache and the VRAM interfaces, since the number of L2 partitions grows or shrinks with the width of the memory bus.

The last-level cache of a GPU serves not only the memory interface but also the previous cache levels, some of which are located in the Compute Units, as well as the fixed-function units such as the raster units, the tessellation units and the ROPs. The command processors are also clients of the L2 cache.

So the memory bus affects the number of last-level cache partitions, and these in turn affect the internal configuration of the GPU.
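To make the scaling concrete, here is a sketch under the assumption (suggested but not spelled out above) that the L2 is split into one partition per 32-bit memory controller; both the function and the exact rule are illustrative, not a statement about any specific GPU:

```python
def l2_partitions(bus_width_bits: int, controller_bits: int = 32) -> int:
    """Illustrative rule: one L2 partition per memory controller."""
    return bus_width_bits // controller_bits

# Under this assumption, widening the bus adds L2 partitions in step:
for bus in (128, 256, 384):
    print(f"{bus}-bit bus -> {l2_partitions(bus)} L2 partitions")
```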