The AI hard drive shortage is strangling internet archiving. While AI companies snap up every enterprise-grade hard drive rolling off production lines, organizations like the Internet Archive are getting priced out of the market entirely, facing price hikes of 50-100% and lead times stretching from weeks to months.
Key Takeaways
- Enterprise hard drive prices have doubled: from $15/TB in early 2025 to over $30/TB by Q1 2026.
- Internet Archive purchased over 100,000 drives in the past year but faces severe shortages and extended delivery times.
- AI data centers prioritize bulk purchases, sidelining non-profit archival organizations.
- Wikimedia Foundation unable to procure 20 petabytes of storage due to supply constraints.
- Lead times for enterprise drives extended to 20-30 weeks from pre-2025 baseline of 4-6 weeks.
How the AI hard drive shortage is squeezing archivists
The Internet Archive operates the Wayback Machine, a digital time capsule storing over 85 petabytes of web data as of April 2026. Maintaining and expanding this collection requires roughly 1 million new hard drive slots annually. But the AI hard drive shortage has made that expansion nearly impossible. Brewster Kahle, founder of the Internet Archive, told researchers: “We’re getting squeezed out of the market entirely. AI companies are buying up every drive that comes off the line.” This is not hyperbole—data centers running OpenAI, Google, and Meta’s AI infrastructure are hoarding nearline drives designed for the exact use case that archivists need: high-capacity storage accessed frequently enough to justify mechanical drives over tape.
The numbers are brutal. A 20TB enterprise hard drive cost $200-250 in early 2025. By mid-2026, the same drive costs $400-500 at retail, with gray market pricing pushing past $600 for drives that ship in 4-6 weeks. Larger 30TB+ models command $700-900 per drive, with lead times exceeding six months in the U.S. and Europe. Asian suppliers have better availability, but export restrictions limit access for Western institutions.
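To put the per-drive prices on the same footing as per-terabyte figures, the conversion can be sketched in a few lines of Python. This is a minimal illustration using only the 20TB price ranges quoted above; the function and constant names are hypothetical, chosen for this example.

```python
def price_per_tb(drive_price_usd: float, capacity_tb: float) -> float:
    """Return the price per terabyte for a single drive."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return drive_price_usd / capacity_tb


def per_tb_range(price_range, capacity_tb=20):
    """Convert a (low, high) per-drive price range to a per-TB range."""
    low, high = price_range
    return price_per_tb(low, capacity_tb), price_per_tb(high, capacity_tb)


# Retail price ranges for a 20TB enterprise drive, as quoted in the article.
EARLY_2025 = (200, 250)
MID_2026 = (400, 500)

print(per_tb_range(EARLY_2025))  # (10.0, 12.5)
print(per_tb_range(MID_2026))    # (20.0, 25.0)
```

On these figures, the per-terabyte price of a 20TB drive doubles, from $10-12.50 to $20-25.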
The Wayback Machine faces a two-front crisis
The AI hard drive shortage arrives as the Wayback Machine battles a second threat: publishers blocking its crawler. Over 23 major news organizations have implemented blocks since late 2025, preventing the Wayback Machine from archiving their content in the first place. Mark Graham, director of the Wayback Machine, framed the compounding problem starkly: “This is the second AI threat to the Wayback Machine in six months—first publishers blocking our crawler, now we can’t even store what we do capture.” The combination is devastating. Fewer sites allow archiving, yet the infrastructure to store what does get captured is becoming unaffordable.
Wikimedia Foundation faces similar pressure. The organization reported being unable to procure 20 petabytes of additional storage due to shortages, forcing delays to wiki dumps and backup operations. An anonymous Wikimedia storage engineer explained the ripple effect: “Component shortages from AI are rippling through everything digital preservation touches.” This is not limited to major institutions. University labs and hobbyist archivists are paying 2x premiums for 20TB+ drives from Seagate and Western Digital, making independent archival projects economically unsustainable.
Why alternatives to hard drives fall short
Archivists cannot simply switch to other storage technologies. Solid-state drives cost roughly $0.05 per gigabyte compared to $0.002 per gigabyte for enterprise hard drives—a 25x premium that makes SSDs prohibitively expensive for petabyte-scale operations. Tape storage, specifically LTO-9 cartridges, offers better economics at roughly $20 per terabyte but trades cost for speed. Tape requires sequential access, making it unsuitable for the Wayback Machine’s frequent queries from researchers and journalists who need rapid retrieval.
Cloud archiving services like AWS S3 Glacier or Google Coldline appear cheaper upfront but cost $5-10 per terabyte per month in long-term fees, roughly 10 times more expensive than owning hard drives outright. Worse, cloud services reduce institutional control. Non-profits like the Internet Archive prefer owning infrastructure to ensure permanent access to their data without depending on commercial providers' pricing or changes to their terms. The AI hard drive shortage has eroded the economic advantage of on-premises storage, leaving archivists trapped between increasingly unaffordable hardware, unsuitable alternatives, and expensive cloud services.
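The "roughly 10 times" comparison can be reproduced with back-of-the-envelope arithmetic. The sketch below is illustrative only: it assumes a five-year drive service life (an assumption, not a figure from the article) and ignores power, cooling, staffing, redundancy, and retrieval fees. The $30/TB purchase price and the $5-10 monthly cloud fee come from the article; the function names are hypothetical.

```python
def cloud_cost_per_tb(monthly_fee_usd: float, months: int) -> float:
    """Cumulative cloud storage fees for one terabyte over `months`."""
    return monthly_fee_usd * months


def cost_ratio(monthly_fee_usd: float, months: int,
               drive_price_per_tb_usd: float) -> float:
    """How many times the one-off drive purchase price the cloud fees add up to."""
    return cloud_cost_per_tb(monthly_fee_usd, months) / drive_price_per_tb_usd


SERVICE_LIFE_MONTHS = 60   # assumption: five-year drive service life
DRIVE_PRICE_PER_TB = 30    # article's Q1 2026 figure, USD per TB

low = cost_ratio(5, SERVICE_LIFE_MONTHS, DRIVE_PRICE_PER_TB)
high = cost_ratio(10, SERVICE_LIFE_MONTHS, DRIVE_PRICE_PER_TB)
print(f"Cloud costs {low:.0f}x to {high:.0f}x the drive purchase price")
```

Under these assumptions, the low end of the cloud fee range lands at ten times the drive purchase price. A shorter assumed service life or a fuller accounting of on-premises operating costs would narrow the gap.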
When will the shortage end?
Western Digital’s Q1 2026 earnings guidance suggests the AI hard drive shortage will persist through 2027. The major suppliers (Seagate with its Exos series, Western Digital with its Ultrastar line, and Toshiba) have extended lead times across all capacity tiers. Manufacturing capacity expansion typically takes 18-24 months, meaning even if vendors break ground on new plants today, relief remains years away. The Internet Archive and similar organizations face a choice: halt expansion, accept unsustainable costs, or find workarounds like recycling older drives and negotiating bulk partnerships with manufacturers.
Can the Internet Archive survive the shortage?
The Internet Archive is not helpless. The organization has negotiated partnerships with drive manufacturers and is exploring drive recycling programs to extend hardware lifespan. But these are stopgap measures. Annual growth of 10-15 petabytes requires fresh drives; recycled hardware cannot keep pace with the exponential expansion of the web itself. If the AI hard drive shortage persists through 2027 as vendors predict, the Wayback Machine will face a critical choice: reduce archival scope, raise prices for institutional access, or seek government funding to subsidize storage infrastructure as a public good.
Is the Wayback Machine at risk of shutting down?
The Internet Archive is not facing imminent collapse, but expansion is severely constrained. The organization can maintain existing archives with recycled hardware and strategic partnerships, but capturing new web content at current rates is unsustainable without hardware cost relief. This creates a gap: the web continues to grow, but the Wayback Machine’s ability to preserve it shrinks relative to demand.
What storage alternatives exist for archivists?
Tape storage (LTO-9 at $20/TB) works for cold archival but not for frequent access. Cloud services cost 10x more long-term. Hard drives remain the only economically viable option for the Wayback Machine’s use case, which is precisely why the AI hard drive shortage is so damaging. No good alternative exists at scale.
When will hard drive prices drop?
Industry projections suggest the shortage will last through 2027 at the earliest, easing only once AI data center buildouts plateau and manufacturing capacity catches up. Until then, expect enterprise drives to remain at 2-3x 2024 pricing levels, creating sustained pressure on archival budgets globally.
The AI hard drive shortage exposes a fragile dependency: internet archiving relies on commodity hardware priced for data center profit margins. When AI demand spikes, archivists lose. This crisis demands policy attention. Digital preservation is a public good, yet it competes in markets designed for commercial cloud providers. Without intervention—whether through government procurement programs, manufacturer partnerships, or new storage technologies—the Wayback Machine will preserve less of the web, not more.
This article was written with AI assistance and editorially reviewed.
Source: TechRadar


