Fabric Attached Memory: memory that is neither RAM nor CPU cache


Advances in computer architecture bring improvements not only to processors but also to the memory they use, and they often give rise to new types of hardware. One of these is Fabric Attached Memory, a type of memory that is part of the new paradigm of in-memory processing. What is it, and what are its characteristics?

Before we start, it must be made clear that, at the time of writing, you will not find Fabric Attached Memory in any PC on the market, not even in an HEDT workstation. The reason is simple: FAM is a type of memory tied to High Performance Computing (HPC). The goal behind its development is to exceed one exaFLOP of computing power, and at that scale the memory architecture of a system becomes very important.

What is Fabric Attached Memory?


Fabric Attached Memory (FAM) is a type of memory that can be accessed by one or more processors, which may be of the same type or of different types. How does it differ from conventional memory? It is accessed through a network interface, and since interconnect infrastructures are evolving toward the so-called Network on a Chip (NoC), memory attached to that interconnect is key to accelerating the CPUs and GPUs of the future.

When we talk about RAM, we usually think of memory external to the processor, mounted on separate chips and accessed through an interface. Under that definition, one might think that 3DIC designs with vertically stacked memory count as FAM, but FAM, as its name indicates, is memory connected directly to what we call the “Fabric”. What do we mean by that name? The Northbridge, the element that connects the different processors to each other and to the RAM.

Fabric Attached Memory, then, sits in the Northbridge and therefore ahead of the RAM, hence its name.

The Scratchpad Memory Concept


Scratchpad Memory is an alternative pool of memory that is separate from conventional memory as far as addressing is concerned, which means that every system with a Scratchpad Memory needs two ways of addressing data. It should also be said that Scratchpad Memory is usually found inside the processor rather than outside it, which has a number of advantages:

  • Programs that run from Scratchpad Memory run faster and consume less power because of the short distance to the processor.
  • Because it is so close to the processor, no cache system is needed to access it.
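
The idea of a separately addressed, software-managed memory can be illustrated with a minimal sketch. Everything here is a hypothetical toy model rather than real hardware behavior: the `ToyCore` class, its method names, and the sizes are assumptions made for illustration. The point it tries to show is that data reaches the scratchpad only when software explicitly copies it there, since no cache does it automatically.

```python
class ToyCore:
    """Toy model of a core with two separate address spaces (hypothetical)."""

    def __init__(self, ram_size: int, scratchpad_size: int) -> None:
        # Conventional memory: on real hardware, accesses go through the caches.
        self.ram = [0] * ram_size
        # Scratchpad: its own addresses starting at 0, no cache in front of it.
        self.scratchpad = [0] * scratchpad_size

    def copy_in(self, ram_addr: int, sp_addr: int, length: int) -> None:
        """Explicitly copy `length` words from RAM into the scratchpad."""
        if ram_addr < 0 or sp_addr < 0 or length < 0:
            raise IndexError("negative address or length")
        if (ram_addr + length > len(self.ram)
                or sp_addr + length > len(self.scratchpad)):
            raise IndexError("copy exceeds a memory's bounds")
        self.scratchpad[sp_addr:sp_addr + length] = \
            self.ram[ram_addr:ram_addr + length]

    def sum_scratchpad(self, sp_addr: int, length: int) -> int:
        """Work on data that already sits in the scratchpad."""
        return sum(self.scratchpad[sp_addr:sp_addr + length])


core = ToyCore(ram_size=16, scratchpad_size=4)
core.ram[8:12] = [1, 2, 3, 4]                  # data starts out in RAM at address 8
core.copy_in(ram_addr=8, sp_addr=0, length=4)  # software moves it explicitly
total = core.sum_scratchpad(0, 4)              # computed from the scratchpad copy
```

Note that RAM address 8 and scratchpad address 0 refer to the same data after the copy, which is what having two addressing schemes means in practice.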

This type of memory has been in use for decades, and today we can find it in the shader units of GPUs, so it is nothing new. How is it related to Fabric Attached Memory? FAM is a type of Scratchpad Memory, but the use of a network interface to communicate with it makes the way it is accessed completely different.

Fabric Attached Memory sits one level before RAM in the memory hierarchy, but it is accessed the way elements are accessed in a NoC, where the different elements work as an interconnected network with the NoC at the center and each element having its own router. That is, to access the FAM it is only necessary to use its network address, which every element of the system can do.
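
As a rough illustration of access by network address, the following sketch models the fabric as a simple lookup from address to attached memory. The `ToyFabric` class, the `FAM_ADDR` value, and the method names are all hypothetical; a real NoC routes packets through per-element routers, which this toy model collapses into a dictionary lookup.

```python
class ToyFabric:
    """Toy interconnect that routes requests by network address (hypothetical)."""

    def __init__(self) -> None:
        self.nodes = {}  # network address -> memory attached at that address

    def attach_memory(self, net_addr: int, size: int) -> None:
        """Attach a block of memory to the fabric at the given address."""
        self.nodes[net_addr] = [0] * size

    def write(self, net_addr: int, offset: int, value: int) -> None:
        self.nodes[net_addr][offset] = value

    def read(self, net_addr: int, offset: int) -> int:
        return self.nodes[net_addr][offset]


FAM_ADDR = 0x10  # assumed network address of the FAM node

fabric = ToyFabric()
fabric.attach_memory(FAM_ADDR, size=8)

# Any element on the fabric can reach the FAM with the same address.
fabric.write(FAM_ADDR, offset=3, value=42)      # e.g. issued by a CPU core
seen_by_gpu = fabric.read(FAM_ADDR, offset=3)   # e.g. issued by a GPU
```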

Memory is the biggest bottleneck for processing


In an ideal system, memory would respond quickly enough for instructions to be processed at the highest possible speed. Unfortunately, the evolution of memory has not kept pace with that of processors, and memory has become a burden that has forced the industry to look for solutions.

There are two reasons why memory cannot keep up. The main one is that we cannot put large amounts of memory inside a processor, so we have to place it outside on another chip. The second is answered by the following question: what happens to electrical signals when the wiring distance increases? Their energy consumption grows. This is where Fabric Attached Memory has its advantage: being close to the processing units, it can reach high bandwidth without high power consumption.
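
The link between wire length and energy can be made concrete with a first-order model. This is a simplification under stated assumptions, not a description of any specific interface: the wire is treated as a plain capacitive load with capacitance per unit length $c$, length $\ell$, and voltage swing $V$, ignoring repeaters, termination, and driver overhead.

```latex
E_{\text{wire}} \approx \tfrac{1}{2}\, C_{\text{wire}}\, V^{2},
\qquad
C_{\text{wire}} \approx c \cdot \ell
\quad\Longrightarrow\quad
E_{\text{wire}} \propto \ell \quad \text{(for fixed } V\text{)}
```

Under this model, the energy spent per signal transition grows in proportion to the distance the signal travels, which is why memory placed closer to the processing units costs less energy per bit moved.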


But Fabric Attached Memory is key not only for communication between elements within the same processor, but also between different processors. For example, if we have several SoCs that need to communicate regularly, they usually write the data to the RAM they all share so that the other processors can later read it back from that same RAM and continue the work. With Fabric Attached Memory, the processors do not need to access the RAM, since the data can be written to the FAM, which sits at a level of the hierarchy between the last-level cache of the different processors and the interface to the RAM of each of them.
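
The hand-off described above can be sketched with two toy memories that count their accesses. The `CountingMemory` class and the `hand_off` function are hypothetical names introduced only for this illustration, and the model says nothing about real latencies; it only shows that when the producer and consumer exchange data through the FAM, the RAM is never touched.

```python
class CountingMemory:
    """Toy shared memory that counts how often it is accessed (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data = {}
        self.accesses = 0

    def write(self, key: str, value) -> None:
        self.accesses += 1
        self.data[key] = value

    def read(self, key: str):
        self.accesses += 1
        return self.data[key]


def hand_off(shared: CountingMemory, payload):
    """One SoC writes a result; another SoC reads it back to continue the work."""
    shared.write("result", payload)   # producer SoC
    return shared.read("result")      # consumer SoC


ram = CountingMemory("RAM")
fam = CountingMemory("FAM")

# Exchange the data through the FAM instead of the shared RAM.
received = hand_off(fam, [1, 2, 3])
```

After the hand-off, `fam.accesses` is 2 (one write, one read) while `ram.accesses` stays at 0. Passing `ram` to `hand_off` instead would model the conventional path.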

The FAM is part of the future in the PC


The statement that titles this section calls for an explanation. In conventional PCs, the amount of FAM that could be integrated into the processor would be limited. So the best solution is a chiplet-based system where the Northbridge is separated from the rest of the system onto its own chip, as is the case in AMD's Ryzen 3000 and Ryzen 5000 CPUs.

By its position in the memory hierarchy, FAM has to have more capacity than the fastest cache but less than RAM. With the Northbridge on a separate chip it becomes possible to integrate Fabric Attached Memory into it, although this is difficult on a 2D chip. The alternative is a 3D chip composed of several layers, with the system's Northbridge on one layer and the FAM on the others. Thanks to this, a good part of the processes and even threads that the processors execute in parallel and in a coordinated manner are accelerated, avoiding the enormous bottleneck associated with traditional RAM.

A large number of applications are held back in performance not by a lack of processing speed but by a lack of memory speed. Placing a pool of memory closer to the processor alleviates many of these problems, and the move to chiplet-based processors, where a single processor is split into several pieces, together with new packaging techniques, will allow FAM to be implemented in the PC.