Press "Enter" to skip to content

The AI Industry May Be Building Too Many Giant Data Centers

The field of artificial intelligence has advanced at an extraordinary pace over the past several years. Larger models, more powerful GPUs, and vast datasets have driven breakthroughs across research and industry. Yet as AI adoption expands across governments and enterprises, a different constraint is beginning to emerge.

The challenge is no longer simply building more capable models but ensuring that the infrastructure needed to run and improve them can scale without overwhelming the energy systems that support them.

Data centers accounted for more than 4% of total U.S. electricity consumption in 2024, according to Pew Research, roughly equivalent to Pakistan’s annual electricity demand. U.S. data center electricity use is expected to grow by 133% by 2030. Globally, electricity use by data centers is projected to rise from roughly 415 terawatt-hours today to nearly 945 terawatt-hours by 2030, the IEA reported.

Across the U.S. and parts of Europe, the expansion of large AI data centers has already begun to trigger pushback from regulators and local communities. These facilities consume enormous amounts of electricity and water, often in regions where power grids are already under strain.

Much of this pressure stems from a simple assumption that has shaped how AI infrastructure is built. As demand for AI grows, the industry’s default response has been to concentrate more compute inside hyperscale data centers.

What is often missing from this debate is a more basic question: whether the industry actually needs to concentrate as much AI compute in hyperscale data centers as it currently does.

Why the hyperscale assumption is breaking down

The infrastructure conversation around AI largely rests on that assumption: if demand for AI grows, the answer is to build more hyperscale data centers. That logic holds for certain parts of the AI lifecycle, such as training large foundation models, which do require tightly coordinated GPU clusters operating at enormous scale.

However, training represents only one stage in the lifecycle of an AI system. Once a model has been trained, a substantial portion of the computational work shifts to what is commonly called post-training. This includes inference, fine-tuning, experimentation, and evaluation, all of which allow models to improve and adapt to real-world tasks.

These workloads remain computationally demanding, but they differ structurally from the tightly synchronized operations required for large training runs.
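To make that structural difference concrete, the toy sketch below contrasts the two patterns: a training step that must wait for every worker to synchronize before proceeding, versus inference requests that are independent and can be handled by whichever worker is free. The worker counts and timings are illustrative assumptions, not measurements of any real system.

```python
import random
import statistics

# Toy illustration (not a real distributed system): synchronized training steps
# are gated by the slowest worker, because gradients must be exchanged before
# the next step, while inference requests are independent of one another.

NUM_WORKERS = 8    # illustrative assumption
NUM_STEPS = 100    # number of training steps / inference requests simulated

def simulated_step_time() -> float:
    """Per-worker compute time with random variation (straggler effect)."""
    return random.uniform(0.9, 1.5)

# Synchronized training: each step waits for the slowest of the tightly
# coupled workers, so a single straggler slows the whole cluster.
training_time = sum(
    max(simulated_step_time() for _ in range(NUM_WORKERS))
    for _ in range(NUM_STEPS)
)

# Independent inference: each request runs on one worker, so work spreads
# evenly across whatever capacity is available, wherever it is located.
request_times = [simulated_step_time() for _ in range(NUM_STEPS)]
inference_time = sum(request_times) / NUM_WORKERS

print(f"Synchronized training (toy model): {training_time:.1f} time units")
print(f"Independent inference (toy model): {inference_time:.1f} time units")
print(f"Mean per-request time: {statistics.mean(request_times):.2f} time units")
```

The point of the sketch is only that loosely coupled work tolerates heterogeneous, geographically dispersed hardware in a way that tightly synchronized training does not.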

Treating them as though they must run inside hyperscale data centers leads to an infrastructure model that expands faster than necessary. Premium data center capacity becomes occupied by workloads that could operate elsewhere, increasing pressure to build additional facilities even when much of the demand does not actually require them.

The result is an infrastructure system optimized for one phase of AI development but applied indiscriminately across the entire lifecycle.

Matching compute to energy availability

A more efficient approach begins with a simple observation: AI workloads are not uniform. Training large models will continue to require centralized GPU clusters and specialized infrastructure. But many post-training workloads can operate across more distributed environments.

Inference, testing, and certain reinforcement learning tasks can run on consumer or mid-tier GPUs located outside hyperscale facilities.

When deployed thoughtfully, this compute can be positioned closer to renewable generation, excess electricity capacity, or regions where energy would otherwise go underutilized. Rather than concentrating massive power demand in a small number of metropolitan data center hubs, compute activity becomes geographically more flexible and better aligned with the realities of energy supply.
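One way to picture this is as an energy-aware placement policy: keep tightly coupled training in the centralized cluster, and route flexible post-training work toward regions with spare or renewable capacity. The sketch below is a minimal, hypothetical version of such a heuristic; the region names, fields, and scoring rule are illustrative assumptions, not data from any real grid or provider.

```python
from dataclasses import dataclass

# Hypothetical energy-aware placement heuristic. All names and numbers below
# are illustrative assumptions for the sake of the example.

@dataclass
class Region:
    name: str
    surplus_mw: float        # spare grid capacity currently available
    renewable_share: float   # fraction of current generation from renewables

REGIONS = [
    Region("central-hub", surplus_mw=20.0, renewable_share=0.35),
    Region("hydro-north", surplus_mw=140.0, renewable_share=0.90),
    Region("solar-west", surplus_mw=85.0, renewable_share=0.75),
]

def place_workload(kind: str, regions: list) -> str:
    """Choose a region for a workload.

    Training stays in the centralized hub (tightly coupled GPU clusters);
    flexible post-training work (inference, fine-tuning, evaluation) goes to
    whichever region scores best on spare capacity and renewable share.
    """
    if kind == "training":
        return "central-hub"
    best = max(regions, key=lambda r: r.surplus_mw * r.renewable_share)
    return best.name

if __name__ == "__main__":
    for kind in ["training", "inference", "fine-tuning", "evaluation"]:
        print(f"{kind:12s} -> {place_workload(kind, REGIONS)}")
```

A real scheduler would also weigh latency, data locality, and hardware availability, but even this crude scoring shows how flexible workloads can follow energy rather than the other way around.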

This approach does not eliminate the need for centralized infrastructure. Hyperscale facilities will remain essential for frontier training and certain forms of high-performance computing. But distributing suitable workloads across a broader set of environments allows existing energy resources to be used more efficiently before additional centralized capacity is built.

The environmental and infrastructure trade-off

Infrastructure decisions made today will shape the environmental footprint of AI for decades. Every new hyperscale data center represents a long-term commitment to electricity demand, cooling systems, and land use.

When those facilities are built to support workloads that could have been distributed elsewhere, the system becomes locked into higher energy consumption and higher emissions over time.

The environmental consequences are already becoming visible. Research suggests that pollution linked to large-scale data center activity could drive as much as $20 billion in annual respiratory-related health costs in the U.S. by 2028, according to Harvard Business Review.

As these impacts become more visible, lawmakers in several states have begun exploring restrictions on new data center construction and reassessing the strain large AI facilities place on local infrastructure.

This tension between the rapid pace of data center construction and growing government pushback reflects a broader shift in how AI infrastructure is perceived. What began as a technical scaling problem is increasingly being treated as an energy and public infrastructure issue.

Rethinking how AI infrastructure scales

None of this suggests that AI development should slow down. The economic and scientific benefits of these systems are already clear across health care, research, and industry.

The challenge is ensuring that the infrastructure supporting AI evolves in a way that reflects how these systems actually operate in production.

Training large models will continue to require centralized GPU clusters and specialized facilities. But much of the ongoing work involved in improving and operating those systems can run across a broader set of compute environments that better align with available energy resources.

Reducing the environmental impact of AI will not come from endlessly expanding hyperscale data center capacity alone. It will come from making smarter decisions about where different AI workloads run and how existing infrastructure can be used more efficiently.

As AI demand continues to grow, the question is no longer simply how much compute we can build. It is whether we build infrastructure that reflects the realities of both modern AI systems and the energy systems that support them.

Author

Eric Yang is the co-founder of Gradient, which is building a decentralized, fully distributed AI infrastructure stack. Before founding Gradient, he was a venture associate at Sequoia Capital China.
