As AI scales, so do CPUs

March 2, 2026

With always-on, agent-based systems, hyperscalers are scaling CPUs to maximize performance per watt, rack efficiency and return on capital.

For much of the last decade, the data center conversation has revolved around accelerators. GPUs, TPUs and the like have dominated headlines, investor decks and infrastructure roadmaps as AI training workloads exploded in scale. But as AI moves from model experimentation into scaled-up products, user-facing applications – and increasingly into always-on, agent-based inference – a more profound shift is underway inside hyperscale data centers.

And amid this shift, the CPU’s role is becoming more crucial than ever, not as a legacy holdover, but as the orchestration and data-processing engine that makes modern AI systems viable at scale.

This shift helps explain a striking point from Arm’s recent quarterly earnings: Arm’s data center business is expected to match or surpass its smartphone business within the next few years. For investors, that statement signals more than growth. It reflects a structural change in how hyperscalers design, deploy and monetize AI infrastructure, and that’s why CPU scalability, efficiency and ease of system integration matter more than ever.

The path to always-on intelligence

Early AI infrastructure was built around sustained, high-intensity workloads: large-scale model training and high-throughput inference. In those environments, accelerators understandably took center stage.

That model no longer reflects reality.

As modern AI applications expand across enterprise platforms and user-facing products, they are increasingly agent-based. These are persistent systems that plan, reason, retrieve information, coordinate actions, and interact continuously with users and services, all while learning through these interactions.

Agentic AI systems don’t just run models; they orchestrate workflows and process data in real time across databases, web services and application layers. Agents don’t sleep. They schedule, retrieve context, manage memory and coordinate actions continuously.

Practically speaking, this means:

  • Continuous scheduling and coordination
  • Persistent memory access (KV cache, vector databases, context retrieval)
  • Pre- and post-processing around every model invocation
  • Secure, low-latency control paths between heterogeneous components

Those responsibilities fall squarely on the CPU.
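
To make that division of labor concrete, here is a minimal, self-contained sketch of an agent’s control loop in Python. Every function in it is a hypothetical stub, not any specific framework’s API; the point is simply that only one step per iteration touches an accelerator, while scheduling, retrieval, pre- and post-processing, and memory updates all run on CPU cores.

    # A minimal sketch of an agent's CPU-side control loop.
    # All functions are hypothetical stubs, not a real framework's API.
    from collections import deque

    def retrieve_context(task: str) -> str:
        # Stand-in for KV-cache / vector-database lookups (CPU-bound)
        return f"context-for:{task}"

    def run_on_accelerator(prompt: str) -> str:
        # Stand-in for the single accelerator-bound step: model inference
        return f"model-output({prompt})"

    def agent_step(queue: deque, memory: list) -> None:
        task = queue.popleft()                    # scheduling (CPU)
        context = retrieve_context(task)          # context retrieval (CPU)
        prompt = f"{context} | {task}"            # pre-processing (CPU)
        result = run_on_accelerator(prompt)       # model invocation (accelerator)
        action = result.upper()                   # post-processing (CPU)
        memory.append((task, action))             # persistent memory update (CPU)

    queue, memory = deque(["plan trip", "summarize report"]), []
    while queue:                                  # a persistent agent loops continuously
        agent_step(queue, memory)
    print(memory)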

Why this changes CPU demand characteristics

Agentic AI doesn’t just increase CPU importance; it changes CPU demand characteristics.

Instead of brief orchestration bursts around accelerator-heavy workloads, AI systems now spend a greater share of time in CPU-bound activities. These workloads require large numbers of power-efficient cores operating continuously, often within fixed power and cost envelopes.

This is not theoretical: hyperscalers are scaling CPUs aggressively, and the resulting increases in CPU density are structural, not incremental. They reflect a recognition that CPU-led orchestration and data processing are now critical limiting factors in AI data center scalability.

As AI workloads become continuous rather than episodic, core count and efficiency become defining metrics.

The economics of opportunity 

For investors, the implications are fundamentally economic, not technical. Accelerator availability and model scope (e.g., larger, more capable foundation models, increasing parameter counts, multimodality, etc.) are no longer the only limiting factors in AI data centers. Power, cooling and capital efficiency have joined the list as hyperscalers are now operating within fixed energy envelopes and physical rack space constraints, and returns depend on how efficiently infrastructure is utilized. In this environment, maximizing output per rack – not peak performance in isolation – has become the defining metric for sustainable AI growth.
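
A purely illustrative back-of-envelope calculation shows why; the figures below are assumptions for the sake of arithmetic, not vendor data. Within a fixed rack power budget, performance per watt translates directly into cores per rack:

    # Illustrative arithmetic only; all figures are assumed, not measured.
    rack_power_w = 15_000            # assumed power envelope for a CPU rack
    watts_per_core_efficient = 3.0   # assumed power-efficient core
    watts_per_core_baseline = 5.0    # assumed less efficient core

    cores_efficient = rack_power_w / watts_per_core_efficient
    cores_baseline = rack_power_w / watts_per_core_baseline

    print(f"Efficient design: {cores_efficient:,.0f} cores per rack")  # 5,000
    print(f"Baseline design:  {cores_baseline:,.0f} cores per rack")   # 3,000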

And accelerators alone don’t solve for these constraints. In fact, without sufficient CPU capacity to orchestrate workloads efficiently, expensive AI accelerators can sit idle or underutilized.
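
A simple two-stage pipeline model makes the point, again with assumed, illustrative numbers: when the CPU-side work around each request takes longer than the accelerator stage, the accelerator’s utilization is capped no matter how fast it is.

    # Hypothetical two-stage pipeline: CPU work around every model invocation.
    cpu_stage_ms = 12.0    # assumed scheduling + pre/post-processing per request
    accel_stage_ms = 8.0   # assumed accelerator compute per request

    # In steady state, throughput is set by the slower stage. If the CPU stage
    # is the bottleneck, the accelerator is busy for only part of each cycle.
    max_accel_utilization = accel_stage_ms / max(cpu_stage_ms, accel_stage_ms)
    print(f"Accelerator utilization ceiling: {max_accel_utilization:.0%}")  # 67%

Under those assumptions, adding CPU capacity shrinks the bottleneck stage and lifts that ceiling, which is the utilization argument in practice.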

Scalable Arm-based CPUs address this problem by enabling hyperscalers to deliver:

  • Always-on inference within fixed power budgets
  • Better accelerator utilization
  • Higher AI output per rack
  • System-level integration rather than bolt-on architectures

That is why CPU scaling and AI economics are now directly linked.

Why this momentum is structural, not cyclical

Independent analysis reinforces that this shift is not a short-term correction but a multi-year architectural realignment. As research from Futurum Group notes, the future of AI infrastructure is moving away from “how much raw compute can we deploy” toward “how intelligently can we orchestrate compute across diverse requirements.”

This evolution favors scalable, power-efficient CPU architectures that can serve as the control layer across heterogeneous systems.

For Arm, this aligns directly with long-standing strengths: scalable architecture, power efficiency and an ecosystem that enables hyperscalers to build custom silicon without fragmenting software.

Arm does not monetize individual AI models or specific accelerator wins; it monetizes the expansion of compute itself, across every new core deployed to support AI workloads.

That distinction matters in a world where core counts are rising structurally.

Setting the stage for tomorrow 

For investors, the takeaway is simple but profound: AI growth is no longer gated solely by accelerators. It is gated by how efficiently systems can be orchestrated, continuously, at scale. That is driving unprecedented demand for high-core-count, power-efficient CPUs, and it is why Arm’s data center business is accelerating toward becoming the company’s largest growth engine in the coming years.

The CPU is as indispensable as it was when it emerged as a singular technology 50 years ago; this time, it sits at the center of the AI data center and the future of innovation.