The Kioxia Super High IOPS SSD represents a fundamental shift in how AI systems access memory. Announced at NVIDIA GTC 2026, this new storage device uses XL-FLASH memory technology to function as a direct extension to GPU High Bandwidth Memory (HBM), allowing trillion-parameter models to access larger datasets at speeds conventional SSDs cannot match.
Key Takeaways
- Kioxia Super High IOPS SSD uses XL-FLASH Storage Class Memory for 512-byte granular access and sub-microsecond latency
- Supports NVIDIA Storage-Next Architecture and Context Memory Storage (CMX) for GPU-initiated AI workloads
- 25.6TB capacity provides GPU-accessible memory expansion beyond traditional HBM constraints
- Evaluation samples available to select customers by end of 2026; production samples ship Q3 2026
- Delivers over 100 million IOPS with lower power consumption per IO than conventional TLC SSDs
Why GPU Memory Bottlenecks Demand New Storage Architecture
Training and inference on trillion-parameter models hit a hard ceiling: GPU HBM capacity is fixed and expensive. The Kioxia Super High IOPS SSD addresses this by letting GPUs access flash directly, as if it were an extension of HBM. This is not a workaround. It is a new storage class entirely. XL-FLASH, Kioxia’s SLC-based technology, delivers 512-byte access granularity and dramatically lower latency than traditional TLC SSDs, which typically serve data in larger blocks and introduce unnecessary transfer overhead.
The performance gap matters. Conventional SSDs force the GPU to wait for multi-kilobyte blocks to load. The Kioxia Super High IOPS SSD fetches 512-byte chunks directly, cutting wasted data transfers and reducing power draw per operation. For inference workloads running millions of queries per second, this efficiency compounds into measurable cost savings and throughput gains.
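The read-amplification argument above can be made concrete with a quick sketch. The 512 B granularity comes from the article; the 4 KiB block size for a conventional SSD is an illustrative assumption, not a published spec:

```python
# Read amplification: bytes moved per byte requested when a drive can only
# return whole blocks. Block sizes here are illustrative assumptions.
import math

def read_amplification(request_bytes: int, block_bytes: int) -> float:
    """Ratio of data transferred to data actually requested."""
    blocks = math.ceil(request_bytes / block_bytes)
    return (blocks * block_bytes) / request_bytes

# A 512 B lookup served from a 4 KiB-block drive moves 8x the requested data;
# a 512 B-granular drive moves exactly what was asked for.
print(read_amplification(512, 4096))  # 8.0
print(read_amplification(512, 512))   # 1.0
```

Every multiple above 1.0 is wasted bus traffic and wasted energy, which is where the power-per-IO advantage comes from.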
Kioxia Super High IOPS SSD vs. Conventional Storage
Kioxia’s own CM9 Series PCIe 5.0 E3.S SSD offers 25.6TB TLC capacity with 3 DWPD endurance and supports NVIDIA Context Memory Storage (CMX) for large-scale AI inference. The new Super High IOPS SSD outpaces it on latency and IOPS, trading some endurance and capacity for speed. Neither replaces the other—they target different workload profiles. The CM9 suits cost-optimized inference at scale; the Super High IOPS SSD addresses latency-critical training and real-time inference where GPU idle time is unacceptable.
Outside Kioxia’s lineup, competitors designing for NVIDIA Storage-Next face the same architectural constraint: conventional flash cannot match the access patterns GPUs demand. The Super High IOPS SSD’s emulator has demonstrated over 100 million IOPS, a figure no traditional NVMe drive approaches. This is not a theoretical projection; it is measured performance, though on an emulator rather than production silicon.
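For a sense of scale, the headline IOPS figure converts to sustained bandwidth with simple arithmetic. Both inputs come from the article; the conversion assumes every operation moves exactly one 512 B unit:

```python
# Back-of-envelope: what 100 million IOPS implies at 512 B per operation.
iops = 100_000_000      # demonstrated on Kioxia's emulator, per the article
access_bytes = 512      # XL-FLASH access granularity

bandwidth_gbs = iops * access_bytes / 1e9  # decimal GB/s
print(f"{bandwidth_gbs:.1f} GB/s")         # 51.2 GB/s of pure small-block traffic
```

Hitting that rate with purely random 512 B reads is the hard part; raw sequential bandwidth at this level is routine for PCIe 5.0 drives, but small-block random access at it is not.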
Timeline and Availability Reality Check
Evaluation samples reach select customers by end of 2026; full production samples begin shipping in Q3 2026. This is not immediate. Enterprises deploying trillion-parameter models today cannot rely on this drive yet. But the timeline signals Kioxia’s commitment: the company is not announcing vaporware. Samples are being built, tested, and distributed to NVIDIA partners and hyperscalers now.
Makoto Hamada, Senior Director of the SSD Division at KIOXIA, stated: “KIOXIA fully supports the NVIDIA Storage-Next initiative and will deliver purpose-built SSDs to effectively address the need for GPU-accessible memory.” That language reflects a strategic bet. Kioxia is betting that GPU-initiated storage becomes the standard architecture for AI infrastructure, not a niche feature.
What the Kioxia Super High IOPS SSD Means for AI Infrastructure
This drive does not replace HBM. It extends it. For teams training 70B-parameter models or larger, or running inference on models that exceed GPU memory, the Kioxia Super High IOPS SSD removes a critical bottleneck. You get more GPU-accessible memory without redesigning your entire system.
The power efficiency gain is equally important. Data centers running 24/7 inference pay for every watt. Lower power per IO means lower operating costs over the life of the hardware. A 25.6TB drive consuming less power per operation than conventional SSDs compounds into six-figure savings annually at hyperscaler scale.
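How per-IO savings compound can be illustrated with placeholder numbers. The micro-joule figures, IOPS rate, fleet size, and electricity price below are all hypothetical, since the article quantifies none of them:

```python
# Hypothetical illustration of per-IO power savings at fleet scale.
# All input figures are placeholders, not Kioxia specifications.
def annual_energy_cost_usd(uj_per_io: float, iops: float, usd_per_kwh: float) -> float:
    """Electricity cost of sustaining `iops` operations/sec for one year."""
    watts = uj_per_io * 1e-6 * iops     # microjoules/op * ops/sec = J/s
    kwh_per_year = watts * 8760 / 1000  # 8760 hours in a year
    return kwh_per_year * usd_per_kwh

# Assume 2 uJ/IO (conventional) vs 1 uJ/IO (new drive), sustained 10M IOPS,
# $0.10/kWh: single-digit dollars saved per drive per year...
saving_per_drive = (annual_energy_cost_usd(2.0, 10_000_000, 0.10)
                    - annual_energy_cost_usd(1.0, 10_000_000, 0.10))
# ...which only reaches six figures across a hyperscaler-sized fleet.
print(round(saving_per_drive, 2), round(saving_per_drive * 20_000, 2))
```

The takeaway is the shape of the math, not the numbers: the saving is negligible per drive and becomes material only at tens of thousands of drives running sustained IO.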
Should You Expect Kioxia Super High IOPS SSD Pricing?
Kioxia has not announced pricing. Enterprise storage rarely does before samples ship. Expect premium pricing—this is specialized silicon for a specialized workload. But the addressable market is massive: every major cloud provider and AI lab is hunting for ways to scale inference cost-effectively. If the Kioxia Super High IOPS SSD delivers the latency it promises, pricing becomes secondary to availability.
When Will the Kioxia Super High IOPS SSD Actually Ship?
Evaluation samples by end of 2026, production samples Q3 2026. That means early adopters—hyperscalers and NVIDIA partners—get hardware in hand within months. Broader availability depends on manufacturing ramp and demand. Plan for limited supply through 2027.
How Does the Kioxia Super High IOPS SSD Compare to Just Adding More HBM?
HBM is expensive and thermally constrained. A 25.6TB HBM module does not exist and would consume more power than an entire data center rack. The Kioxia Super High IOPS SSD is the economically practical alternative: it trades microseconds of latency for terabytes of capacity and 90% cost savings. For most workloads, that trade is not just acceptable—it is mandatory.
The Kioxia Super High IOPS SSD is not a revolutionary product in isolation. It is a necessary piece of infrastructure for the next generation of AI systems. If you are scaling inference or training models larger than GPU memory, this drive will matter to your roadmap. Evaluation samples arriving by end of 2026 means the waiting period is almost over.
This article was written with AI assistance and editorially reviewed.
Source: Tom's Hardware


