What Vera Rubin is designed for
Vera Rubin is a rack-scale AI platform built on a custom design that brings together 72 Rubin GPUs and 36 Vera CPUs. The full system incorporates an estimated 1.3 million components sourced from around the world.
The rack weighs nearly two tonnes and includes approximately 1,300 microchips, a notable increase from the 864 chips found in Grace Blackwell. Its modular construction allows individual superchips to be removed from any of the rack’s 18 compute trays for servicing, an approach that simplifies maintenance compared with Blackwell’s soldered board components.
Unlike processors found in consumer electronics, Vera Rubin is aimed squarely at data centres and enterprise AI infrastructure, where clusters of processors operate as a unified computing system. It also marks Nvidia’s first AI platform to rely entirely on liquid cooling, replacing conventional air-based thermal management with direct liquid cooling to handle high-density workloads.
How it advances beyond Grace Blackwell
Grace Blackwell, which entered production in 2024, significantly expanded the computing capacity achievable within a single rack and became widely adopted across leading cloud providers. Vera Rubin builds on this concept with further gains in efficiency and scale.
Nvidia says the new platform delivers a tenfold improvement in performance per watt. Although the overall system power draw is expected to be roughly double that of Grace Blackwell, the computational output generated for each unit of energy is substantially higher - a key metric as data centres confront power limitations.
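To see what these two claims imply together, a quick sketch of the arithmetic helps: a tenfold gain in performance per watt combined with roughly double the power draw works out to around twenty times the absolute compute per rack. The 10x and ~2x multipliers are the figures reported above; the baseline values in the sketch are hypothetical placeholders, not real specifications.

```python
# Illustrative arithmetic only. The 10x perf-per-watt and ~2x power-draw
# multipliers are Nvidia's reported claims; the baseline numbers below
# are hypothetical placeholders, not actual rack specifications.

baseline_power_kw = 100.0        # hypothetical Grace Blackwell rack power
baseline_perf_per_watt = 1.0     # normalised compute per watt (baseline = 1)

perf_per_watt_gain = 10.0        # claimed tenfold efficiency improvement
power_multiplier = 2.0           # roughly double the system power draw

# Absolute compute scales with both power and efficiency.
baseline_compute = baseline_power_kw * baseline_perf_per_watt
new_compute = (baseline_power_kw * power_multiplier) * \
              (baseline_perf_per_watt * perf_per_watt_gain)

print(new_compute / baseline_compute)  # ~20x absolute compute per rack
```

The same multiplication also shows the trade-off discussed later: each rack draws twice the electricity in absolute terms, even though every watt does ten times the work.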
Why efficiency gains matter
Modern AI applications - especially large language models and multimodal systems - demand enormous computational resources. As deployment expands, electricity usage and cooling requirements increasingly constrain infrastructure growth.
Higher performance per watt can help operators:
- Reduce the cost associated with individual AI workloads
- Run additional compute within existing power envelopes
- Improve overall utilisation of data-centre infrastructure
Potential trade-offs
Greater efficiency does not automatically translate into lower total energy consumption. Despite its efficiency improvements, Vera Rubin’s higher absolute power requirements mean total electricity usage could continue to rise as more racks are deployed.
The shift to full liquid cooling may improve thermal performance but introduces added infrastructure complexity and highlights the growing scale of AI data-centre development. Additionally, history suggests that improved computing efficiency often drives greater adoption - enabling more AI models, more inference tasks and broader use cases, which can ultimately increase aggregate demand.
Why Vera Rubin could be significant
AI workloads, particularly advanced reasoning systems and large-scale generative models, are placing unprecedented demands on data-centre resources. Efficiency improvements such as those targeted by Vera Rubin could lower operational costs, support larger and more capable models, and improve scalability for enterprise AI deployments.
As organisations weigh the energy costs of training and running sophisticated AI systems, a platform capable of delivering substantially more compute per watt could play a central role in shaping the next phase of AI infrastructure evolution.