Supermicro Vera Rubin NVL72 Rack Tackles AI Cooling With Radical Liquid Strategy

By Craig Nash
Tech writer at All Things Geek. Covers artificial intelligence, semiconductors, and computing hardware.

Supermicro’s Vera Rubin NVL72 liquid cooling system represents a fundamental shift in how data centers will cool the next wave of AI infrastructure. The company is demonstrating an upcoming rack-scale system that unifies 72 Rubin GPUs, 36 Vera CPUs, NVIDIA ConnectX-9 SuperNICs, and NVIDIA BlueField-4 DPUs with NVIDIA NVLink 6, all cooled by a proprietary liquid approach that Supermicro claims offers 1,000 times higher electrical impedance than standard cooling fluids.

Key Takeaways

  • The Vera Rubin NVL72 is a rack-scale system delivering 3.6 exaflops of NVFP4 performance and 1.4 PB/s of HBM4 bandwidth.
  • Supermicro’s new coolant claims 1,000x higher electrical impedance compared to conventional cooling fluids.
  • The system targets up to 10x throughput per watt versus NVIDIA Blackwell, potentially lowering token costs significantly.
  • A smaller 2U HGX Rubin NVL8 variant offers 8-GPU configurations with flexible CPU choices for different workloads.
  • Supermicro is expanding manufacturing and liquid-cooling capabilities to deliver first-to-market systems.

Why Vera Rubin NVL72 Cooling Matters Now

Thermal density in AI racks has become the primary bottleneck for scaling. The Vera Rubin NVL72 liquid cooling system is Supermicro’s answer to a problem that water-cooled systems have struggled with: electrical safety and reliability at extreme power densities. By engineering a coolant with dramatically higher electrical impedance, Supermicro claims to reduce short-circuit risk while maintaining thermal efficiency. This is not merely an incremental improvement—it addresses a genuine constraint that has limited how tightly vendors can pack GPUs and CPUs into a single rack.

The timing matters. NVIDIA’s next-generation Vera and Rubin platforms are arriving at a moment when data centers are desperate for efficiency gains. The NVL72 is positioned to deliver 3.6 exaflops of NVFP4 performance and 1.4 PB/s of HBM4 bandwidth, with 75 TB of fast memory in a single rack. That density demands cooling that works—not in theory, but in production deployments scaling to hundreds or thousands of racks.
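As a rough sanity check on those rack-level figures, the sketch below divides them evenly across the 72 GPUs. The even split is an assumption made purely for illustration: the 75 TB of fast memory in particular may include CPU-attached memory, which the announcement does not break down. The helper name `per_gpu_share` is hypothetical.

```python
# Rack-level figures as quoted in the article for the Vera Rubin NVL72.
NVL72_GPUS = 72
NVL72_NVFP4_EXAFLOPS = 3.6
NVL72_HBM4_PB_PER_S = 1.4
NVL72_FAST_MEMORY_TB = 75


def per_gpu_share(rack_total: float, gpus: int = NVL72_GPUS) -> float:
    """Evenly divide a rack-level total across GPUs (illustrative assumption)."""
    return rack_total / gpus


# Convert to smaller units first so the per-GPU numbers are readable.
petaflops_per_gpu = per_gpu_share(NVL72_NVFP4_EXAFLOPS * 1000)    # EF -> PF
hbm_tb_per_s_per_gpu = per_gpu_share(NVL72_HBM4_PB_PER_S * 1000)  # PB/s -> TB/s
fast_memory_tb_per_gpu = per_gpu_share(NVL72_FAST_MEMORY_TB)

print(f"NVFP4 per GPU:          {petaflops_per_gpu:.1f} PF")
print(f"HBM4 bandwidth per GPU: {hbm_tb_per_s_per_gpu:.1f} TB/s")
print(f"Fast memory per GPU:    {fast_memory_tb_per_gpu:.2f} TB")
```

On the quoted numbers this works out to roughly 50 petaflops of NVFP4 compute per GPU, which gives a sense of how much heat-producing silicon each cold plate has to serve.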

Vera Rubin NVL72 Raw Performance and Architecture

The flagship NVL72 is built on Supermicro’s third-generation NVIDIA MGX rack architecture, enhanced with in-row Coolant Distribution Units (CDUs). It scales out using NVIDIA Quantum-X800 InfiniBand and NVIDIA Spectrum-X Ethernet, enabling multi-rack clusters for massive AI training and inference workloads. The system targets up to 10x throughput per watt and one-tenth the token cost versus NVIDIA Blackwell, a generational leap that makes the cooling investment essential.

The Vera CPU itself features 88 custom Arm cores delivering 176 threads, with 1.2 TB/s LPDDR5X memory bandwidth and 1.8 TB/s NVLink-C2C bandwidth to GPUs. Supermicro is also promoting dedicated Vera CPU systems as part of the same portfolio, offering configurations optimized for compute-heavy workloads that do not require the full NVL72 scale.

HGX Rubin NVL8: The Smaller Alternative

Not every deployment needs a full 72-GPU rack. Supermicro’s 2U liquid-cooled HGX Rubin NVL8 delivers 8 GPUs in a compact form factor, targeting AI and HPC workloads that demand flexibility over maximum scale. The system provides 400 petaflops NVFP4, 176 TB/s HBM4 bandwidth, 28.8 TB/s NVLink bandwidth, and 1600 Gb/s NVIDIA ConnectX-9 networking. Critically, the HGX Rubin NVL8 supports both next-generation Intel Xeon and AMD EPYC processors, letting customers choose their CPU architecture without sacrificing GPU performance.
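One way to see how the NVL8 relates to the full rack is to compare per-GPU figures derived from the numbers quoted in this article. The following is a minimal sketch using simple division; the `SystemSpec` class and its fields are hypothetical names, and any gap between the two systems may reflect rounding or different measurement bases in the vendor figures rather than different hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemSpec:
    """Headline figures for one system, as quoted in the article."""
    name: str
    gpus: int
    nvfp4_petaflops: float
    hbm4_tb_per_s: float

    def per_gpu(self) -> tuple[float, float]:
        """Return (NVFP4 PF per GPU, HBM4 TB/s per GPU) by even division."""
        return (self.nvfp4_petaflops / self.gpus, self.hbm4_tb_per_s / self.gpus)


# 3.6 exaflops = 3600 petaflops; 1.4 PB/s = 1400 TB/s.
NVL72 = SystemSpec("Vera Rubin NVL72", gpus=72, nvfp4_petaflops=3600.0, hbm4_tb_per_s=1400.0)
NVL8 = SystemSpec("HGX Rubin NVL8", gpus=8, nvfp4_petaflops=400.0, hbm4_tb_per_s=176.0)

for spec in (NVL72, NVL8):
    pf, bw = spec.per_gpu()
    print(f"{spec.name}: {pf:.1f} PF NVFP4 per GPU, {bw:.1f} TB/s HBM4 per GPU")
```

On these inputs the per-GPU compute matches at 50 petaflops for both systems, while the derived HBM4 bandwidth per GPU is somewhat higher for the NVL8 (22 TB/s versus about 19.4 TB/s), a difference the quoted figures alone do not explain.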

This flexibility positions the NVL8 as a bridge solution for organizations testing Vera Rubin workloads before committing to full-rack deployments. It also serves teams that need dense GPU compute but run mixed CPU-GPU workloads favoring one processor family over another.

Manufacturing and Market Timing

Supermicro is expanding manufacturing and liquid-cooling capabilities specifically to support first-to-market delivery of these systems. The company frames its rack-scale portfolio around Data Center Building Blocks Solutions (DCBBS) and advanced Direct Liquid Cooling (DLC) technology, designed to accelerate customer time-to-market while meeting the thermal, power, and networking demands of next-generation AI infrastructure.

The systems were unveiled around GTC San Jose 2026, positioning Supermicro to capture early-adopter demand as NVIDIA Vera Rubin ramps into production. This timing advantage matters in a market where being first to ship can mean months of revenue before competitors catch up.

How Vera Rubin NVL72 Compares to Blackwell

NVIDIA Blackwell remains the current baseline for comparing new AI rack systems. The Vera Rubin NVL72 is claimed to deliver up to 10x throughput per watt and one-tenth the token cost versus Blackwell. These are not marginal gains—they represent the kind of efficiency jump that justifies ripping out existing infrastructure and replacing it with new hardware. However, Blackwell systems are already shipping and battle-tested in production. Vera Rubin is still upcoming, meaning early adopters will be running on bleeding-edge hardware with less operational history.
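To make the "up to 10x throughput per watt" claim concrete, the sketch below converts a throughput-per-watt figure into energy per token. The baseline value is a made-up placeholder, not a measured Blackwell number, and the function name is hypothetical. Energy is also only one component of token cost (hardware, facilities, and staffing are others), so this arithmetic does not by itself establish the one-tenth token cost claim.

```python
def joules_per_token(tokens_per_joule: float) -> float:
    """Energy per token is the reciprocal of throughput per watt.

    Tokens per second per watt is the same quantity as tokens per joule,
    so joules per token = 1 / (tokens per joule).
    """
    if tokens_per_joule <= 0:
        raise ValueError("throughput per watt must be positive")
    return 1.0 / tokens_per_joule


# Hypothetical baseline chosen only for illustration.
BASELINE_TOKENS_PER_JOULE = 2.0
# "Up to 10x" from the article, i.e. a best-case multiplier.
CLAIMED_SPEEDUP = 10.0

baseline_energy = joules_per_token(BASELINE_TOKENS_PER_JOULE)
improved_energy = joules_per_token(BASELINE_TOKENS_PER_JOULE * CLAIMED_SPEEDUP)
energy_ratio = improved_energy / baseline_energy

print(f"Baseline energy per token: {baseline_energy:.3f} J")
print(f"Improved energy per token: {improved_energy:.3f} J")
print(f"Energy per token ratio:    {energy_ratio:.2f}")
```

Whatever baseline is chosen, the ratio depends only on the multiplier: a 10x gain in throughput per watt cuts energy per token to one-tenth, and a smaller real-world gain would cut it proportionally less.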

The cooling innovation is where Supermicro differentiates. A standard Blackwell rack still uses conventional liquid cooling that requires careful electrical isolation and adds operational complexity. The Vera Rubin NVL72’s proprietary coolant, with its claimed 1,000x higher electrical impedance, theoretically simplifies that isolation challenge and allows denser packing.

The Coolant Innovation: Fact vs. Claim

Supermicro’s 1,000x electrical impedance claim is striking, but it is a company claim rather than an independently verified specification. The announcement does not include third-party testing or validation of the coolant’s properties. What matters in practice is whether the coolant performs reliably in production racks handling real AI workloads. Supermicro’s decision to design the entire NVL72 around this coolant suggests internal confidence, but the market will be the ultimate judge.

Liquid cooling itself is not new in data centers. What is new is engineering a coolant formulation that supposedly eliminates or dramatically reduces electrical risk while maintaining thermal performance. If the claim holds up under stress testing, it could reshape cooling standards across the industry.

Is the Vera Rubin NVL72 Worth the Complexity?

Liquid-cooled AI racks are more complex than air-cooled alternatives. They require CDU infrastructure, leak detection, coolant management, and technician training. For organizations running a handful of GPUs, air cooling is simpler. But for data centers deploying hundreds or thousands of Vera Rubin GPUs, the density and efficiency gains from the NVL72 become economically mandatory. The question is not whether liquid cooling is worth it—it is whether Supermicro’s specific approach and coolant innovation deliver on the promise of simplified operations and higher reliability.

When Will Vera Rubin NVL72 Ship?

The systems are described as upcoming, with Supermicro positioning itself for first-to-market delivery as Vera Rubin ramps. No specific shipping dates have been announced, but the GTC San Jose 2026 timeframe suggests availability sometime in 2026. Early customers will likely see units in the second half of the year, with broader availability ramping into 2027.

What About the Vera CPU Systems?

Supermicro is also promoting dedicated Vera CPU systems as standalone products, not just as components within the NVL72. These systems target workloads that benefit from the Vera CPU’s 88 Arm cores and specialized memory bandwidth without requiring 72 GPUs. This gives customers a modular approach to Vera architecture, choosing GPU-heavy or CPU-heavy configurations based on their specific workload mix.

The real story here is not just a new rack—it is a complete platform shift. Supermicro’s Vera Rubin NVL72, with its proprietary liquid cooling and aggressive performance claims, signals that the next generation of AI infrastructure will be liquid-cooled by default, not by exception. Whether the 1,000x electrical impedance claim stands up to independent scrutiny will determine whether this becomes an industry standard or a cautionary tale about over-engineering. For now, it is the most ambitious liquid-cooling strategy announced for NVIDIA’s next-generation platforms.

Edited by the All Things Geek team.

Source: Tom's Hardware
