Nvidia CEO Jensen Huang said the chipmaker expects to sell $1 trillion worth of Blackwell and Rubin chips by 2027 as demand shifts from training AI models to running them in real-world applications across data centers, software platforms and autonomous vehicles.
At Nvidia’s annual GTC conference on Monday, he declared the dawn of the “Age of Inference,” unveiling new hardware and software aimed at dramatically accelerating how AI models respond to user queries.
Speaking to more than 30,000 attendees at the SAP Center in San Jose, Huang introduced the Nvidia Groq 3 LPX rack, a system designed specifically for inference workloads. Each rack combines 72 of Nvidia’s next-generation Vera Rubin systems with 256 language processing units (LPUs) from startup Groq, whose technology Nvidia licensed in a $20 billion deal in December.
The system can generate up to 700 million tokens per second, which Nvidia says is roughly 350 times the throughput of its previous-generation Hopper GPUs. The new architecture also adds substantially more high-bandwidth memory to relieve the memory bottlenecks common in inference workloads.
Nvidia also unveiled four new partners for its autonomous driving business: China’s BYD and Geely and Japan’s Nissan and Isuzu. They plan to use Nvidia’s Drive Hyperion platform to develop Level 4 autonomous vehicles, which can operate without human intervention within predefined areas.