Nvidia Confirms Vera Rubin AI Chips in Full Production, Eyes Big Gains
Nvidia CEO Jensen Huang announced at the Consumer Electronics Show (CES) in Las Vegas that the company’s next generation of AI chips is now in full production, marking a major step in Nvidia’s effort to stay ahead in the fast-growing artificial intelligence market.
Speaking on Monday, Huang said the new chips can deliver up to five times more AI computing power than Nvidia’s previous generation, especially for applications such as chatbots and large AI models used by millions of people every day. The chips, which will officially launch later this year, are already being tested by several AI companies in Nvidia’s labs, even as competition increases from both traditional rivals and large customers building their own chips.
At the heart of the launch is Nvidia’s new Vera Rubin platform, which combines six different Nvidia chips into a powerful system designed for large data centres. The flagship server based on this platform includes 72 graphics processing units and 36 newly designed central processors, all working together to handle massive AI workloads. Huang demonstrated how these systems can be linked into large clusters, or “pods,” with more than 1,000 Vera Rubin chips, dramatically improving the speed and efficiency of generating AI “tokens,” which are the basic building blocks of AI responses.
According to Nvidia, these pods can boost token generation efficiency by as much as ten times. Huang explained that this performance jump comes from a new, proprietary data format used by the chips, which allows major gains even though the number of transistors has increased by only about 1.6 times compared to earlier designs. While Nvidia continues to dominate the market for training AI models, it faces stronger competition in running trained models at scale, a workload known as inference, particularly from companies like Advanced Micro Devices and from customers such as Alphabet’s Google, which is developing its own AI hardware.
Much of Huang’s CES keynote focused on showing how the Vera Rubin platform is designed to handle real-world AI use, including long conversations and complex queries. Nvidia introduced a new feature called context memory storage, which helps chatbots respond faster and more accurately during extended interactions. The company also unveiled a new generation of networking switches that use co-packaged optics, a technology that helps connect thousands of machines efficiently and competes directly with solutions from companies like Broadcom and Cisco.
Nvidia said cloud provider CoreWeave will be among the first to deploy Vera Rubin systems, and expects major technology firms including Microsoft, Amazon, Oracle and Alphabet to follow. Beyond chips, Huang also highlighted new Nvidia software for self-driving cars that can help vehicles decide which path to take while keeping detailed records engineers can later review. He confirmed that Nvidia’s Alpamayo software, previously shown as research, will be released more widely along with the data used to train it, stressing the importance of transparency and trust.
Huang also addressed Nvidia’s recent acquisition of talent and technology from startup Groq, saying it would not affect Nvidia’s core business but could lead to new products. At the same time, Nvidia is closely watching demand for its older H200 chips, which remain popular in China, as the company waits for regulatory approvals to continue shipments.
Source: The Economic Times
