5 Huge Hardware Reveals From NVIDIA GTC 22

If you’re into AI, HPC or digital twins, you’re going to want to see these.

NVIDIA CEO Jensen Huang and his AI-powered avatar, Tiny Jensen, during the GTC keynote. (Source: NVIDIA.)

NVIDIA is hosting its spring GTC conference virtually this week, and today CEO Jensen Huang delivered his two-hour keynote presentation.

As usual, Huang covered a lot of ground, from hardware to software to what is clearly his favorite topic, artificial intelligence (AI). Huang marveled at AI’s rapid advancement and awe-inspiring potential, as he does during every GTC keynote.

But for us, the most interesting news was all about hardware. Much of it was AI hardware, to be fair, but there was also plenty for designers and visualization pros to get excited about.

1: Ampere’s Heir

The Hopper-based H100 GPU die. (Source: NVIDIA.)

Huang revealed the successor to the two-year-old Ampere GPU microarchitecture: Hopper. Named in homage to American computer scientist Grace Hopper, the Hopper microarchitecture will make its debut in the NVIDIA H100 datacenter GPU, which NVIDIA expects to be available in Q3 of this year.

“The NVIDIA H100 is the new engine of the world’s AI infrastructures,” Huang said.

Though we’re still waiting on full details of the Hopper microarchitecture, which NVIDIA will outline in an upcoming whitepaper, we do know a few things about the H100 GPU. Huang revealed that the H100 is built on TSMC’s 4N node and includes 80 billion transistors. It supports PCIe Gen 5 and uses HBM3 memory to achieve a memory bandwidth of 3TB/s.

The NVIDIA H100 will also feature a new transformer engine built for the transformer deep learning model. “The transformer is unquestionably the most important deep learning model ever invented,” Huang said, adding that the new transformer engine will offer speedups of 6x compared to transformer networks running on Ampere-based GPUs. Part of the speedup is due to the H100 now supporting an 8-bit floating point data format (FP8) for mixed-precision processing.
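
For context, mixed precision means running the heavy matrix math in a smaller, faster number format while keeping numerically sensitive values in higher precision. Here’s a rough sketch of that pattern using PyTorch’s existing FP16 autocast as a stand-in (the model and tensors are purely illustrative; Hopper’s transformer engine applies the same idea with FP8, choosing precision per layer automatically):

```python
# A minimal mixed-precision training step in PyTorch. FP16 stands in for
# FP8 here, since FP8 paths depend on hardware and library support.
import torch

model = torch.nn.Linear(1024, 1024).cuda()   # stand-in for a transformer layer
optimizer = torch.optim.AdamW(model.parameters())
scaler = torch.cuda.amp.GradScaler()         # scales the loss so small FP16
                                             # gradients don't underflow to zero

x = torch.randn(64, 1024, device="cuda")
target = torch.randn(64, 1024, device="cuda")

# Matrix math inside this context runs in FP16; numerically sensitive ops
# are kept in FP32 automatically.
with torch.autocast(device_type="cuda", dtype=torch.float16):
    loss = torch.nn.functional.mse_loss(model(x), target)

scaler.scale(loss).backward()   # backward pass on the scaled loss
scaler.step(optimizer)
scaler.update()
```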

The H100 includes the second generation of NVIDIA’s Multi-Instance GPU (MIG) technology, which allows the GPU to be partitioned into seven isolated instances, as well as the fourth generation of NVIDIA’s NVLink GPU interconnect, which provides 900GB/s of bandwidth (50 percent more than the 600GB/s of gen three). The newly announced NVLink Switch will extend NVLink beyond a single server to connect up to 256 H100 GPUs.

The H100 will also include confidential computing capabilities, meaning the chip protects data in use against unauthorized access, not just data at rest or in transit. Finally, the GPU introduces an instruction set called DPX that accelerates dynamic programming algorithms by up to 7x compared to Ampere-based GPUs.
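
Dynamic programming breaks a large problem into overlapping subproblems and reuses their solutions; NVIDIA points to algorithms like Floyd-Warshall (route optimization) and Smith-Waterman (genomics) as the kind of workload DPX targets. Here’s a minimal Floyd-Warshall in plain Python, purely to illustrate the recurrence such instructions accelerate (the graph is made up):

```python
# Floyd-Warshall all-pairs shortest paths: a classic dynamic programming
# algorithm of the class DPX is built to speed up.
import math

INF = math.inf

def floyd_warshall(dist):
    """dist[i][j] is the direct edge weight from i to j (INF if none)."""
    n = len(dist)
    for k in range(n):              # consider routing through vertex k
        for i in range(n):
            for j in range(n):
                # The DP recurrence: keep the cheaper of the current path
                # and the detour through k -- the min/add inner loop that
                # DPX-style fused instructions target.
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

graph = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
for row in floyd_warshall(graph):
    print(row)
```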

Hopper confidential computing. (Source: NVIDIA.)

NVIDIA will offer the H100 in the fourth generation of its DGX enterprise AI systems. The new DGX H100 combines eight H100 GPUs connected by fourth-gen NVLink. Thirty-two DGX H100 systems can be networked with an external NVLink Switch to form what NVIDIA calls a DGX POD (32 × 8 = 256 GPUs, right at the NVLink Switch’s limit), and PODs can be further grouped into DGX SuperPODs.

2: CPU Superchips

Almost a year ago, in April 2021, NVIDIA announced it had gotten into the CPU game with Grace, an Arm-based processor made for AI workloads. NVIDIA hinted at the time that Grace would be combined with a GPU in an integrated module, and Huang announced that module today: the Grace Hopper Superchip.

That’s not all: Huang announced another integrated module, the Grace CPU Superchip, which combines two Grace CPUs via the new NVLink-C2C chip-to-chip interconnect. NVLink-C2C extends NVLink down to the chip level, and NVIDIA will open it up to customers and partners for custom silicon integration.

The Grace CPU Superchip is NVIDIA’s first discrete CPU, and Huang claims it provides “the highest performance, memory bandwidth and NVIDIA software platforms in one chip [that] will shine as the CPU of the world’s AI infrastructure.”

The NVIDIA Grace CPU Superchip. (Source: NVIDIA.)

The Grace CPU Superchip includes 144 Arm cores (72 per Grace die) and LPDDR5x memory with error correction code (ECC). It has a memory bandwidth of 1TB/s, according to NVIDIA, and the entire two-chip module consumes 500W of power.

NVIDIA expects both the Grace CPU Superchip (two Grace CPUs) and the Grace Hopper Superchip (one Grace CPU and one Hopper GPU) to be available in the first half of 2023. Other configurations are likely forthcoming, as Huang pointed out the many pairings NVLink-C2C enables.

3: NVIDIA’s New Supercomputer

Huang was just showing off when he revealed Eos, NVIDIA’s very own supercomputer. Expected to be online “in a few months,” according to Huang, NVIDIA Eos is a DGX SuperPOD comprising 576 DGX H100 systems connected by 360 NVLink Switches and 500 Quantum-2 switches from NVIDIA’s InfiniBand high performance computing (HPC) networking platform.

NVIDIA Eos. (Source: NVIDIA.)

Compared to a DGX SuperPOD of the same size built with previous-generation DGX A100 systems, Eos achieves 6x the FP8 performance, 3x the FP16, and 3x the FP64: 18 EFLOPS, 9 EFLOPS, and 275 PFLOPS, respectively (exa- and peta-scale floating point operations per second).
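
Those headline figures line up with the GPU count. Here’s a quick back-of-envelope check in Python, assuming the quoted numbers are aggregate peak throughput summed across every H100 in the pod:

```python
# Back-of-envelope check of the Eos figures, assuming aggregate peak
# throughput across all GPUs in the SuperPOD.
systems = 576
gpus_per_system = 8
total_gpus = systems * gpus_per_system        # 4,608 H100 GPUs

fp8_total  = 18e18    # 18 EFLOPS
fp16_total = 9e18     # 9 EFLOPS
fp64_total = 275e15   # 275 PFLOPS

print(f"Total GPUs: {total_gpus}")
print(f"FP8 per GPU:  ~{fp8_total  / total_gpus / 1e12:,.0f} TFLOPS")  # ~3,906
print(f"FP16 per GPU: ~{fp16_total / total_gpus / 1e12:,.0f} TFLOPS")  # ~1,953
print(f"FP64 per GPU: ~{fp64_total / total_gpus / 1e12:,.0f} TFLOPS")  # ~60
```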

“We expect Eos to be the fastest AI computer in the world,” Huang said.

While Eos is reserved for NVIDIA’s own cadre of AI researchers, the company sees it as a blueprint for what other organizations can achieve with NVIDIA hardware.

4: Hardware for Omniverse

No NVIDIA GTC would be complete without a generous dose of NVIDIA Omniverse, the company’s new software darling. But this GTC, Huang announced some Omniverse hardware news: NVIDIA OVX, a server system dedicated to large-scale digital twins in the Omniverse platform.

“Industrial digital twins need a new type of purpose-built computer,” Huang said. As DGX is to AI, OVX is to digital twins, he added.

A single OVX server contains eight NVIDIA A40 GPUs, 1TB of system memory, and 16TB of storage. Eight OVX servers grouped together form an OVX computing system, which scales up to 32 OVX servers (256 A40 GPUs) in an OVX SuperPOD. Multiple OVX SuperPODs can be combined to simulate the largest of digital twins.

An NVIDIA OVX SuperPOD. (Source: NVIDIA.)

According to NVIDIA, OVX will be available later this year through OEM partners Inspur, Lenovo, and Supermicro.

5: New Ampere Graphics Cards

Though Huang didn’t get to it in his keynote, there’s a new Ampere-based RTX graphics card in the desktop lineup: the RTX A5500, which slots between the top-of-the-line RTX A6000 and first mate RTX A5000.

The NVIDIA RTX A5500 graphics card. (Source: NVIDIA.)

The NVIDIA RTX A5500 will have 24GB of GDDR6 ECC memory and four DisplayPort 1.4 outputs to power up to two 8K displays. The PCIe Gen 4 card is compatible with NVLink, Quadro Sync II, and NVIDIA Virtual Workstation (vWS). The card is available as of today from NVIDIA channel partners including PNY, and it will soon be available in systems from NVIDIA’s OEM partners.

Alongside the new desktop card, NVIDIA announced six new mobile GPUs: the mobile RTX A5500, A4500, A3000 12GB, A2000 8GB, A1000, and A500. NVIDIA says these cards are up to two times faster than their Turing-based predecessors (Turing was the microarchitecture before Ampere). They will be available in OEM systems starting in spring 2022.

Written by

Michael Alba

Michael is a senior editor at engineering.com. He covers computer hardware, design software, electronics, and more. Michael holds a degree in Engineering Physics from the University of Alberta.