AI agent token demand may explode far beyond current projections, according to a Goldman Sachs analysis that paints a starkly different picture of AI economics than the industry currently assumes. The investment bank warned that AI agents could drive token consumption up 24 times from 2026 levels by 2030, fundamentally reshaping how enterprises budget for artificial intelligence and how cloud providers price their services.
Key Takeaways
- Goldman Sachs forecasts AI agent token demand could surge 24 times between 2026 and 2030
- Consumer-facing AI agents alone could increase token consumption by 12 times, adding roughly 60 quadrillion tokens monthly by 2030
- Global token consumption may reach approximately 120 quadrillion tokens per month by 2030
- Compute costs are declining 60 to 70 percent annually, while token pricing is falling more slowly, widening provider margins
- Enterprise AI agents could account for over 70 percent of total global token usage by 2040 at peak adoption
Why AI Agent Token Demand Matters Now
The surge in AI agent token demand represents a critical inflection point for enterprise spending. Companies like Uber and Microsoft are already feeling the impact of token-based billing as usage costs climb faster than expected. What makes this shift urgent is that it contradicts the prevailing narrative about AI economics. For years, the industry has focused on declining compute costs, meaning the expense of running chips and accelerators. But token pricing, the amount providers charge for processing language and reasoning through AI models, is not falling at the same rate.
This divergence creates a widening gap between what hyperscale cloud providers and large model providers actually pay for compute versus what they charge for tokens. As that gap widens, their margins expand, making AI infrastructure investment more economically defensible than it appeared just months ago. For enterprises, however, the math works differently. Higher token consumption means higher bills, even as the underlying compute becomes cheaper.
Consumer Agents vs. Enterprise Agents: Which Drives Bigger Token Growth?
Goldman Sachs distinguished between two categories of AI agents, each with a different effect on token volumes. Consumer-facing AI agents, such as chatbots that assist individual users, could increase global token consumption by 12 times, adding approximately 60 quadrillion tokens per month by 2030. That is a massive number, yet it tells only half the story. Enterprise AI agents, deployed within organizations to automate workflows and decision-making, could become the far larger driver of token demand. Goldman Sachs estimated that enterprise agents could account for over 70 percent of total global token usage at peak adoption by 2040, making them the dominant force in shaping AI economics.
The distinction matters because enterprise agents operate differently from consumer tools. They run continuously, process larger volumes of structured data, and make repeated decisions across complex workflows. A single enterprise deployment can generate token usage that dwarfs thousands of consumer interactions. This means companies betting on AI to transform operations will face token bills that scale with their ambition, not just their user base.
What This Means for Pricing and Margins
The Goldman Sachs analysis reveals a fundamental shift in how AI economics may play out. Compute costs—the price of chips and accelerators from suppliers like NVIDIA, AMD, Google, and others—are declining 60 to 70 percent annually. Token pricing, however, is not falling as quickly. This creates an unusual situation: as raw compute becomes cheaper, the markup providers can apply to token pricing grows wider, improving their margins even as they pass on some savings to customers.
Pure-text chatbots may not benefit from this dynamic. Because they are highly commoditized and face intense competition, token prices for simple text generation could fall faster than underlying compute costs, squeezing margins for providers focused on that segment. But for companies operating large AI agent deployments—or providing the infrastructure those agents run on—the economics become increasingly attractive. The token demand surge effectively decouples pricing from the cost of raw compute, allowing providers to sustain profitable operations even as hardware becomes cheaper.
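To make the margin dynamic concrete, here is a minimal sketch in Python. Only the 60 to 70 percent annual compute-cost decline comes from the Goldman Sachs figures quoted above; the starting price, the starting cost, and both token price decline rates are hypothetical values chosen purely for illustration, and `project_margin` is an illustrative helper, not anything from the report.

```python
def project_margin(years, compute_decline, price_decline,
                   start_price=1.0, start_cost=0.5):
    """Project token price, compute cost, and gross margin year by year.

    start_price and start_cost are hypothetical normalized values
    (cost assumed to be half the price in year 0), not reported figures.
    """
    rows = []
    price, cost = start_price, start_cost
    for year in range(years + 1):
        margin = (price - cost) / price  # gross margin as a share of price
        rows.append((year, price, cost, margin))
        price *= 1 - price_decline       # token price falls each year
        cost *= 1 - compute_decline      # compute cost falls each year
    return rows


COMPUTE_DECLINE = 0.65  # midpoint of the 60 to 70 percent range cited above

# Assumption: agent-oriented token prices fall 40 percent a year (slower than compute).
agent_rows = project_margin(4, COMPUTE_DECLINE, price_decline=0.40)
# Assumption: commoditized text-only token prices fall 75 percent a year (faster than compute).
chatbot_rows = project_margin(4, COMPUTE_DECLINE, price_decline=0.75)

for label, rows in (("agent tokens", agent_rows), ("text-only tokens", chatbot_rows)):
    print(label)
    for year, price, cost, margin in rows:
        print(f"  year {year}: price={price:.4f} cost={cost:.4f} margin={margin:.1%}")
```

Under these assumed rates, the margin on agent tokens widens every year because cost shrinks faster than price, while the text-only scenario shows the opposite: price falls faster than cost and the margin erodes.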
The Enterprise Cost Burden
For enterprises like Uber and Microsoft, the implications are more sobering. Token-based billing means AI costs scale with usage, not just deployment. As AI agents become more capable and autonomous, they make more decisions, process more data, and consume more tokens. What started as a cost-saving initiative—automating customer service, optimizing logistics, improving recommendations—can quickly become a significant line item on the P&L. Enterprises must now choose between limiting agent autonomy to control costs or accepting substantially higher AI spending as a trade-off for operational gains.
This tension is already visible. Companies are beginning to scrutinize their AI spending more carefully, weighing the productivity gains from agents against the rising token costs. Goldman Sachs’ forecast suggests that tension will only intensify as agent adoption accelerates and token consumption climbs.
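A rough way to see why bills can rise while unit prices fall: spending is volume multiplied by price. The sketch below assumes, purely for illustration, that a company's token usage grows at the average yearly rate implied by the 24-times forecast and that its token price falls 40 percent a year. That price decline is a hypothetical figure, and applying a global average to a single company is a simplification.

```python
def bill_multiplier(usage_growth: float, price_decline: float) -> float:
    """Year-over-year change in the token bill: volume growth times price change."""
    return usage_growth * (1 - price_decline)


def break_even_price_decline(usage_growth: float) -> float:
    """Yearly price decline needed to keep the bill flat at a given usage growth."""
    return 1 - 1 / usage_growth


# Average yearly growth implied by 24x over the four years from 2026 to 2030.
usage_growth = 24 ** (1 / 4)

# Hypothetical assumption: token prices fall 40 percent per year.
assumed_price_decline = 0.40

print(f"Usage growth per year: {usage_growth:.2f}x")
print(f"Bill change per year:  {bill_multiplier(usage_growth, assumed_price_decline):.2f}x")
print(f"Price decline needed for a flat bill: {break_even_price_decline(usage_growth):.1%}")
```

On these assumptions the bill still grows by about a third each year, and token prices would have to fall by roughly 55 percent annually just to hold spending level.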
What Happens to Smaller Providers?
The Goldman Sachs analysis implies clear winners and losers. Large model providers and hyperscale cloud platforms benefit from the widening margin between token prices and compute costs. They can afford to invest in better models, larger training runs, and more sophisticated inference infrastructure. Smaller AI vendors and boutique model providers face pressure. If they cannot reach the scale needed to negotiate better compute pricing, or if their main selling point is token-efficient models, they will struggle to maintain margins as token prices stabilize or decline.
The shift also favors providers with integrated ecosystems. Microsoft, with its Azure cloud platform and OpenAI partnership, can bundle AI agent capabilities with broader cloud services, absorbing token costs into larger contracts. Standalone AI service providers without that integration have fewer levers to pull.
Is the Goldman Sachs forecast realistic?
Goldman Sachs’ 24-times token demand increase by 2030 assumes rapid AI agent adoption across consumer and enterprise segments. The forecast depends on agents becoming genuinely autonomous—capable of making decisions and taking actions without constant human oversight. If agent adoption stalls or enterprises limit agent autonomy to control costs, actual token consumption could fall short of the forecast. Conversely, if agents become more capable and integrated into critical business processes faster than expected, token demand could exceed these estimates.
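One way to sanity-check the headline numbers is to work out what they imply per year. The sketch below uses only the two figures quoted in this article (a 24-times increase from 2026 to 2030 and roughly 120 quadrillion tokens per month in 2030). The assumption of a constant yearly growth rate is a simplification for illustration, not something the forecast states.

```python
GROWTH_MULTIPLE = 24            # 2026 -> 2030 multiple quoted in the article
TOKENS_2030_QUADRILLION = 120   # monthly tokens in 2030 quoted in the article
START_YEAR, END_YEAR = 2026, 2030


def implied_annual_growth(multiple: float, years: int) -> float:
    """Constant yearly growth factor that compounds to `multiple` over `years`."""
    return multiple ** (1 / years)


def implied_baseline(final_value: float, multiple: float) -> float:
    """Starting value implied by a final value and a growth multiple."""
    return final_value / multiple


years = END_YEAR - START_YEAR
annual_factor = implied_annual_growth(GROWTH_MULTIPLE, years)
baseline_2026 = implied_baseline(TOKENS_2030_QUADRILLION, GROWTH_MULTIPLE)

print(f"Implied growth per year: {annual_factor:.2f}x")
print(f"Implied 2026 baseline: {baseline_2026:.0f} quadrillion tokens per month")

# Trajectory under the constant-growth assumption.
for offset in range(years + 1):
    tokens = baseline_2026 * annual_factor ** offset
    print(f"  {START_YEAR + offset}: {tokens:.1f} quadrillion tokens per month")
```

Read this way, the forecast requires token volumes to a little more than double every year for four straight years, from an implied base of about 5 quadrillion tokens per month in 2026.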
How will token pricing evolve as demand increases?
Token pricing may decline as demand surges, but likely not as steeply as compute costs decline. Providers have pricing power because token consumption creates lock-in—enterprises cannot easily switch models once they have built workflows around a specific provider’s agents. Competitive pressure will eventually force some price declines, but the Goldman Sachs analysis suggests providers will maintain healthy margins even as tokens become cheaper.
Should enterprises worry about AI agent costs now?
Yes. The Goldman Sachs forecast signals that enterprises should begin modeling AI spending with token demand in mind, not just upfront deployment costs. Companies planning large-scale agent deployments should negotiate token pricing into contracts, establish usage budgets, and monitor consumption closely. Waiting until agents are fully deployed to address costs could result in expensive surprises.
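As an illustration of what a usage budget with monitoring could look like, here is a minimal sketch. The `TokenBudget` class, the agent names, the monthly limit, and the price per million tokens are all hypothetical, and a real deployment would pull usage figures from its provider's billing or metering data rather than recording them by hand.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBudget:
    """Hypothetical monthly token budget shared across several agents."""
    monthly_token_limit: int
    price_per_million_tokens: float      # assumed contract rate, illustrative only
    alert_threshold: float = 0.8         # warn once 80 percent of the limit is used
    usage_by_agent: dict = field(default_factory=dict)

    def record(self, agent: str, tokens: int) -> None:
        """Add token usage for one agent."""
        self.usage_by_agent[agent] = self.usage_by_agent.get(agent, 0) + tokens

    @property
    def total_tokens(self) -> int:
        return sum(self.usage_by_agent.values())

    @property
    def spend(self) -> float:
        """Estimated spend so far at the assumed per-million-token rate."""
        return self.total_tokens / 1_000_000 * self.price_per_million_tokens

    def status(self) -> str:
        """Return 'ok', 'warning', or 'over_budget' based on usage so far."""
        used_share = self.total_tokens / self.monthly_token_limit
        if used_share >= 1.0:
            return "over_budget"
        if used_share >= self.alert_threshold:
            return "warning"
        return "ok"


# Illustrative numbers only.
budget = TokenBudget(monthly_token_limit=500_000_000, price_per_million_tokens=2.0)
budget.record("support-agent", 300_000_000)
budget.record("logistics-agent", 150_000_000)

print(budget.status(), f"estimated spend: {budget.spend:.2f}")
```

Tracking usage per agent rather than in aggregate makes it easier to see which workflow is driving the bill before the limit is reached.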
The Goldman Sachs analysis upends the conventional wisdom that falling compute costs will make AI increasingly affordable. Token demand growth may offset those savings entirely, making AI economics more complex and costly for enterprises even as the technology becomes more powerful. The next three to five years will determine whether AI agents deliver the promised productivity gains or simply become another costly infrastructure line item that enterprises struggle to justify.
Edited by the All Things Geek team.
Source: Tom's Hardware


