AI Runs on Data: Why Storage Is the Real Infrastructure Bottleneck

The storage shortage is no longer a niche infrastructure issue. It has reached mainstream markets, and the signs are now impossible to ignore. Hard-drive supply has tightened, prices have risen, and major manufacturers have indicated that nearline capacity for 2026 is already effectively committed.

What is still underappreciated is the consequence. The data storage shortage doesn’t just inconvenience data centers; it can slow the development of AI itself. AI runs on compute, but it scales on stored data: training sets, checkpoints, logs, archives, embeddings, synthetic data, and ever-larger volumes of image and video content. Without abundant, low-cost storage, AI becomes much harder to scale economically.

It is tempting to say AI created this problem. That is only partially true. AI applications are generating enormous volumes of data, and that surge is clearly worsening the data storage shortage. But the underlying imbalance was visible years ago: data creation and storage demand were growing faster than the storage industry was adding affordable capacity. AI accelerated that demand, and experts predicted that as AI evolved, supply would not be able to keep up.

So why not simply build more factories and make more hard drives?

This is a multi-dimensional problem in which several issues come into play, ranging from technological to economic to ecological scaling limits. Making more of the same is viable only in the short term; in the long run it is unsustainable in every respect.

Hard disk drives (HDDs) remain the data center workhorses for ‘warm’ and ‘cold’ data that does not need instant access. (In tech parlance, frequently accessed data is considered ‘hot,’ occasionally accessed data ‘warm,’ and rarely accessed data ‘cold.’ This distinction helps organizations balance performance and storage costs.)

As NAND flash memory continues to scale, solid-state drive (SSD) costs are coming down, but SSDs still carry a steep cost premium. Western Digital’s 2025 data-center storage white paper, citing research from IDC, shows that enterprise SSDs cost roughly 5x to 10x more per terabyte than HDDs. For AI infrastructure, that cost gap remains significant, especially for bulk capacity storage. This is where the story turns into a case of vertical market failure.
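The cost gap cited above can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name, the normalized HDD cost of 1.0 unit per terabyte, and the 1,000 TB capacity are hypothetical assumptions, not figures from the white paper. Only the 5x and 10x ratios come from the text.

```python
def ssd_premium(capacity_tb: float, hdd_cost_per_tb: float, ratio: float) -> float:
    """Extra spend from placing capacity_tb on SSD instead of HDD.

    ratio is the SSD-to-HDD cost ratio per terabyte (5 to 10 per the article).
    """
    return capacity_tb * hdd_cost_per_tb * (ratio - 1.0)


# Hypothetical inputs: costs are normalized so HDD = 1.0 unit per TB.
CAPACITY_TB = 1_000
HDD_COST_PER_TB = 1.0

for ratio in (5.0, 10.0):
    extra = ssd_premium(CAPACITY_TB, HDD_COST_PER_TB, ratio)
    print(f"{ratio:.0f}x ratio: {extra:,.0f} extra cost units for {CAPACITY_TB:,} TB")
```

Because the premium scales linearly with capacity, the gap matters most for exactly the bulk warm and cold data that HDDs hold today.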

The HDD market has consolidated to the point where a small number of suppliers serve a small number of very large buyers, mainly the largest cloud service providers, or hyperscalers. That concentration gives a handful of customers outsized influence over pricing, volume commitments, and product roadmap timing. The result is a market dynamic that can discourage long-term investment, even when end demand is strong, especially if the innovation and investment risks are not appropriately shared.

Vertical market failure

West Oxford Advisors recently described this dynamic as “vertical market failure” in magnetic storage. Its argument is striking: Despite a period of record financial performance, the sector has favored capital returns over breakthrough investment.

Seagate approved a $5 billion share repurchase program in 2025, while Western Digital added $4 billion to its repurchase authorization in February 2026. At the same time, West Oxford argues that storage demand is on track to grow by more than 25% annually through 2030.
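To see what growth of that order implies, the following sketch compounds a 25% annual rate. It assumes 2025 as the base year (so "through 2030" spans five years) and uses exactly 25%, the lower bound of the "more than 25%" figure; both choices are assumptions made for illustration.

```python
def demand_multiple(annual_rate: float, years: int) -> float:
    """Return demand as a multiple of the base year after compounding."""
    return (1.0 + annual_rate) ** years


BASE_YEAR = 2025      # assumption: growth is counted from 2025
ANNUAL_RATE = 0.25    # lower bound of the "more than 25%" figure

for year in range(BASE_YEAR, 2031):
    multiple = demand_multiple(ANNUAL_RATE, year - BASE_YEAR)
    print(f"{year}: {multiple:.2f}x base-year demand")
```

Under these assumptions, demand roughly triples within five years, which gives a sense of how much capacity would have to be added just to keep pace.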

Despite slowing HDD areal density capability (ADC) scaling, technological progress is possible. Heat-assisted magnetic recording (HAMR) and other roadmap advances show that higher areal density is achievable.

Seagate reports that its HAMR platform is already qualified and in production with two leading hyperscale cloud providers, with a roadmap targeting hard drives with capacities of up to 100TB. Western Digital has outlined its own path to more than 100TB capacities by 2029. But industrializing these advances is expensive, slow, and risky. That is exactly the kind of investment concentrated markets tend to underfund.

The result is a widening supply gap. And perversely, the shortage itself can reduce the urgency to solve it. Tight supply supports higher prices and better margins, which can make disciplined output more attractive than aggressive expansion. From the perspective of any single company, that may be rational. From the perspective of the AI economy, it is dangerous as it is likely to hamper growth.

To put it simply: No storage, no AI.

That is not a slogan. It is the hidden constraint behind the current boom. If storage becomes too scarce or too expensive, AI business cases start to weaken. The industry can keep adding GPUs, but without a scalable, affordable way to retain data and make it available within acceptable access times, hundreds of billions in AI investment risk running into very physical limits.

That is why the market is signaling the need for something new: an additional tier in the storage stack. Not a premium flash replacement, but a medium that delivers performance close enough to HDDs for massive volumes of warm and cold AI data, at a materially lower cost at scale, and with a density roadmap designed to evolve over decades rather than product cycles.

Until such a solution exists, the next bottleneck for AI may not be compute; it may be the simple lack of affordable solutions to retain the world’s data.

Author

    Steffen Hellmold is president of Cerabyte, Inc. He has more than 25 years of industry experience in product, technology, business and corporate development as well as strategy roles. Previously, he served as senior vice president, Business Development, Data Storage at Twist Bioscience and held executive management positions at Western Digital, Everspin, SandForce, Seagate Technology, Lexar Media/Micron, Samsung Semiconductor, SMART Modular and Fujitsu. Deeply engaged in industry trade associations and standards organizations, he co-founded the DNA Data Storage Alliance in 2020 and the USB Flash Drive Alliance, where he served as president from 2003 to 2007.
