Nvidia’s follow-up to the Hopper GPU architecture is bigger, faster and more efficient, but CEO Jensen Huang showed that the real magic comes at scale.
Speaking from the SAP Center in sunny San Jose, California, Nvidia CEO Jensen Huang today delivered the keynote for the company’s first in-person GTC conference in five years.
“Welcome to GTC,” Huang opened. “I hope you realize this is not a concert. You have arrived at a developer conference.”
It wouldn’t be a hard mistake for an outsider to make. The packed stadium, normally home to the San Jose Sharks, was an energetic contrast to the virtual keynotes of recent GTCs, which Huang delivered from his kitchen. While the atmosphere was fresh, the rapid pace of new Nvidia products and partnerships was familiar. So was the big, two-letter theme that Nvidia has embraced wholeheartedly: AI.
Huang says we’re living in the AI era, and it’s hard to disagree with him. Everything he spoke about in his two-hour keynote came back to those two letters and how they’ll transform just about everything in our lives, from manufacturing to healthcare to transportation to retail and beyond. “One hundred trillion dollars of the world’s industries is represented in this room today,” he pointed out to an SAP Center full to the rafters.
The big news of the day was Nvidia Blackwell, the company’s newest GPU architecture. “We created a processor for the AI era,” Huang proclaimed.
Intro to the Nvidia Blackwell GPU architecture
Nvidia Blackwell is the successor to Nvidia’s two-year-old Hopper GPU architecture. Hopper was, and Blackwell is, Nvidia’s data center GPU architecture, a complement to the more graphics-focused Ada Lovelace GPU architecture that powers the company’s desktop and mobile graphics cards.
“Generative AI is the defining technology of our time. Blackwell is the engine to power this new industrial revolution,” Huang said in an Nvidia press release accompanying the announcement.
Named after mathematician David Harold Blackwell, Nvidia’s Blackwell architecture features what the company calls six “revolutionary technologies” that set it up to be the engine of the AI era:
- Blackwell packs 208 billion transistors and is manufactured on a custom TSMC 4NP process. Each GPU joins two dies, each as large as the manufacturing process allows, over a 10 TB/s chip-to-chip link so that they act as a single GPU.
- Blackwell uses a second-generation transformer engine that introduces new 4- and 6-bit floating point formats that can speed up AI inferencing.
- Blackwell uses Nvidia’s new fifth-generation NVLink, which provides 1.8 TB/s of bidirectional throughput per GPU and can interconnect up to 576 GPUs.
- Blackwell includes a new reliability, availability and serviceability (RAS) engine to ensure system uptime. “It’s almost as if we shipped with every single chip its own advanced tester,” Huang said.
- Blackwell has new native interface encryption protocols that Nvidia says will “protect AI models and customer data without compromising performance.”
- Blackwell includes a dedicated decompression engine for faster database queries in data analytics and data science.
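To get a feel for why the new 4- and 6-bit formats matter for inferencing, here is a rough back-of-the-envelope sketch in Python of how much memory a model’s weights occupy at different precisions. The 70-billion-parameter model size is a hypothetical example, not a figure from Nvidia, and the calculation ignores scaling factors, activations and other runtime overhead.

```python
def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate memory needed for model weights alone, in gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


# Hypothetical model size chosen for illustration only.
params = 70_000_000_000

for bits in (16, 8, 6, 4):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(params, bits):.1f} GB")
```

Under these assumptions, moving from 16-bit to 4-bit weights cuts the weight footprint to a quarter, which means less data to move for every generated token.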
Blackwell and GB200 NVL72, the “exascale AI supercomputer”
Kicking off the line of Blackwell-based GPUs is the Nvidia B200 Tensor Core GPU. Eight B200 GPUs are combined in Nvidia’s new HGX B200 server board designed to support generative AI platforms.
Huang also announced the Nvidia GB200 Grace Blackwell Superchip, which connects two Nvidia B200 GPUs to the Nvidia Grace CPU over a 900 GB/s NVLink chip-to-chip interconnect. GB200-powered systems can be linked with the newly announced Nvidia Quantum-X800 InfiniBand and Spectrum-X800 Ethernet platforms, which Nvidia says can deliver networking speeds up to 800 Gb/s.
But why have one Superchip when you could combine them into a Supersystem? Huang also announced the Nvidia GB200 NVL72, a system of 36 GB200 Superchips (72 Blackwell GPUs and 36 Grace CPUs) that he described as an exascale supercomputer for AI training and inferencing. Nvidia says the NVL72 can provide up to a 30 times performance increase compared to the same number of Nvidia H100 Tensor Core GPUs for large language model (LLM) inference workloads while reducing cost and energy consumption by up to 25 times.
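The “72” in the NVL72 name falls out of simple arithmetic, sketched below in Python. The aggregate NVLink figure is an illustrative estimate that assumes the 1.8 TB/s fifth-generation NVLink throughput applies to each GPU; it is not a number quoted in the keynote.

```python
# Figures taken from the announcement.
SUPERCHIPS = 36
GPUS_PER_SUPERCHIP = 2   # each GB200 pairs two B200 GPUs...
CPUS_PER_SUPERCHIP = 1   # ...with one Grace CPU

# Assumption: 1.8 TB/s of bidirectional NVLink throughput per GPU.
NVLINK_TBPS_PER_GPU = 1.8

gpus = SUPERCHIPS * GPUS_PER_SUPERCHIP
cpus = SUPERCHIPS * CPUS_PER_SUPERCHIP
aggregate_nvlink_tbps = gpus * NVLINK_TBPS_PER_GPU

print(f"GPUs: {gpus}, CPUs: {cpus}")
print(f"Estimated aggregate NVLink throughput: {aggregate_nvlink_tbps:.1f} TB/s")
```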
Huang was keen to point out the high interest in Blackwell from a range of partners including cloud service providers (CSPs), systems integrators and software developers, boasting that “Blackwell will be the most successful product launch in our history.” Among the biggest names were Amazon Web Services (AWS), Google Cloud, Microsoft Azure, Oracle Cloud Infrastructure, Cisco, Dell, Hewlett Packard Enterprise, Lenovo and Supermicro.
As for engineering software developers, Nvidia says that Cadence, Synopsys and Ansys (which is in the process of being acquired by Synopsys) will use Blackwell processors to accelerate their simulation software, which fits in with the growing popularity of GPU-based simulation.
We’re live at Nvidia GTC this week—stay tuned for more updates and leave a comment with any questions.