NVIDIA HGX Grace: servers with up to 12,096 cores and 1 TB of RAM

NVIDIA gave a first glimpse of this news at CES 2022 earlier this year, and today it has presented its new servers built around the Grace platform, the latest in HPC and AI. These servers are set to redefine every market segment they enter: NVIDIA has shown the HGX Grace data center models it plans to offer, and with more than 12,000 cores and 1 TB of RAM, they will be the reference to beat.

Several companies will offer their customers any of the four server types NVIDIA has designed: ASUS, Foxconn, GIGABYTE, QCT, Supermicro, and Wiwynn, whose customization will turn the four reference designs into more than a dozen different enterprise servers. They will arrive in a year, in 2023, but we already know the heart and maximum configuration of all of them: the Grace CPU Superchip and the Grace Hopper Superchip.


NVIDIA HGX Grace: the monstrous server with 12,096 cores

As usual, the details left unsaid are the most important, since they reveal information the company would rather not highlight even though it is right in front of us. The HGX Grace servers represent one of the biggest generational leaps this market has seen in a long time.

[Image: NVIDIA HGX Grace and HGX Grace Hopper]

They will carry the NVIDIA Grace CPU Superchip, that is, two processors coherently connected via NVLink-C2C, based on Armv9 Neoverse cores and designed for AI and HPC infrastructure. It is what NVIDIA calls a CPU-CPU module, and each board that integrates it offers no fewer than 144 cores, with up to 1 TB of LPDDR5X memory and a bandwidth of no less than 1 TB/s.

All of this while consuming 500 watts and supporting either air or liquid cooling. The most impressive part is that NVIDIA allows configurations with up to 84 nodes per rack, a whopping 12,096 cores in total. Considering that NVIDIA claims the HGX Grace delivers roughly 1.5 times the performance of the CPUs in a DGX A100 to begin with, we can already imagine the beast the green team has created.
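To put those per-rack figures in perspective, here is a minimal sketch that just multiplies out the numbers quoted above (the values are the ones stated in this article, not independent measurements, and the variable names are only illustrative):

```python
# Per-rack totals for HGX Grace, using the figures quoted in the article.
CORES_PER_SUPERCHIP = 144     # Grace CPU Superchip core count
MEMORY_PER_SUPERCHIP_TB = 1   # up to 1 TB of LPDDR5X per Superchip
POWER_PER_SUPERCHIP_W = 500   # quoted module power
NODES_PER_RACK = 84           # maximum HGX Grace density per rack

total_cores = CORES_PER_SUPERCHIP * NODES_PER_RACK
total_memory_tb = MEMORY_PER_SUPERCHIP_TB * NODES_PER_RACK
total_power_kw = POWER_PER_SUPERCHIP_W * NODES_PER_RACK / 1000

print(f"Cores per rack:   {total_cores}")          # 12096
print(f"Memory per rack:  {total_memory_tb} TB")   # 84 TB
print(f"Power per rack:   {total_power_kw} kW")    # 42.0 kW (CPU modules only)
```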

HGX Grace Hopper: CPU and GPU on one PCB

This type of server bets on NVIDIA's second option: here we have not two CPUs but a CPU and a GPU on the same substrate, again communicating over NVLink-C2C to provide a high-performance coherent memory model, interconnected at 900 GB/s and roughly 7 times faster than the PCIe 5.0 bus.
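As a rough sketch of where that "7x" figure comes from, the comparison below assumes the usual ~128 GB/s aggregate of a PCIe 5.0 x16 link (an assumption for illustration; only the 900 GB/s NVLink-C2C figure is quoted above):

```python
# NVLink-C2C vs. an assumed PCIe 5.0 x16 link, aggregate bandwidth in GB/s.
NVLINK_C2C_GBPS = 900   # coherent CPU-GPU link, quoted in the article
PCIE5_X16_GBPS = 128    # approx. PCIe 5.0 x16 aggregate (assumption)

speedup = NVLINK_C2C_GBPS / PCIE5_X16_GBPS
print(f"NVLink-C2C vs PCIe 5.0 x16: ~{speedup:.1f}x")  # ~7.0x
```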

[Image: NVIDIA Grace]

What NVIDIA has created is the ultimate multitasking server, capable of running any of the company's software stacks, whether for HPC, AI, or Omniverse, so it is truly multifaceted. The scheme for these HGX Grace Hopper servers is simple: each node integrates a 4 nm Hopper GPU with a Grace CPU, and each chip has its own memory, the former with no less than 80 GB of HBM3 and the latter with 512 GB of LPDDR5X.

Added together, this gives a total bandwidth of 3.5 TB/s, with a combined consumption of 1,000 watts per module and, again, the option of air or liquid cooling. NVIDIA states that up to 42 nodes per rack can be installed in HGX Grace Hopper.
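A quick sketch of how that aggregate could break down and what it implies per rack. The split between HBM3 and LPDDR5X below is an assumption for illustration (roughly 3 TB/s for the GPU's HBM3 plus 0.5 TB/s for the CPU's LPDDR5X); only the 3.5 TB/s total, the 1,000 W module figure, and the 42 nodes per rack are quoted above:

```python
# Grace Hopper node bandwidth and per-rack power, using the article's figures.
HBM3_TBPS = 3.0         # assumed GPU memory bandwidth share
LPDDR5X_TBPS = 0.5      # assumed CPU memory bandwidth share
NODES_PER_RACK = 42     # maximum HGX Grace Hopper density per rack
POWER_PER_NODE_W = 1000 # quoted module power

print(f"Bandwidth per node: {HBM3_TBPS + LPDDR5X_TBPS} TB/s")                    # 3.5 TB/s
print(f"Rack power (modules only): {NODES_PER_RACK * POWER_PER_NODE_W / 1000} kW")  # 42.0 kW
```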

Server Designs and Their Portfolio

There will be four reference designs depending on the target workload, and within these four types each manufacturer can configure and customize its systems according to its needs, which will leave a wide range of performance levels and prices:

[Image: NVIDIA Grace data center servers]

  • NVIDIA HGX Grace Hopper systems for AI training, inference, and HPC, featuring the Grace Hopper Superchip and NVIDIA BlueField-3.
  • NVIDIA HGX Grace systems for HPC and supercomputing, featuring a CPU-only design with the Grace CPU Superchip and BlueField-3.
  • NVIDIA OVX systems for digital twins and collaborative workloads, featuring the Grace CPU Superchip, BlueField-3, and NVIDIA GPUs.
  • NVIDIA CGX systems for graphics and cloud gaming, featuring the Grace CPU Superchip, BlueField-3, and NVIDIA A16 GPUs.

As expected, no prices have been revealed yet, since we are still a year away from the official launch, so NVIDIA will presumably disclose costs over the coming months, and they will surely not be cheap.