- Red Hat exec Steve Watt said interest in using CPUs for inference is rising
- Bodies like the x86 Ecosystem Advisory Group are working on what architectural changes are needed to make this happen
- Sovereign AI demand in areas with little available land and power is helping drive this push
RED HAT SUMMIT, ATLANTA – GPUs get all the love, but CPUs are poised to play an increasingly important role in AI’s future, Red Hat exec Steve Watt told Fierce at the Red Hat Summit in Atlanta yesterday. Sovereign AI is actually a major driver of that push, he added.
Watt is VP and distinguished engineer in Red Hat’s Office of the CTO. He told Fierce that one of the most interesting things his team is working on is running vLLM on CPUs for AI inference. While CPUs aren’t trying to be the next GPUs, they are compute generalists that could be used for lighter workloads like inference. They could also be used to run the operating system that future AI agents run in, he said.
CPU utilization rates are notoriously low. But Watt said there’s a theory that being able to run many different autonomous agents could change that utilization story. The part the industry – specifically the Intel- and AMD-led x86 Ecosystem Advisory Group – is working on now is sorting out “what would need to change in the operating system and the instruction set architecture to make that more efficient, for you to basically be able to run more agents efficiently on that.”
What is driving interest in CPUs for AI?
A few factors are driving renewed interest in CPUs for AI.
Watt said one of the most common misconceptions floating around is that there’s no shortage of GPUs. That may be true if you’re a hyperscaler or someone working with a large corporate account on a public cloud. But for smaller players, and open-source contributors with individual accounts, the reality is very different. In those cases, it can be very hard to get access to the most recent GPUs, he said. He pointed to regional variations as well.
But there’s one other, unappreciated factor driving CPU interest: power.
Watt noted that in the U.S. there’s plenty of land and power, meaning there’s plenty of room to build new data centers and power plants. The same cannot be said of Europe, which is one of the largest addressable markets for sovereign AI.
“The latest classes of accelerators have at least a 10x difference in the amount of power they consume, and they need to be liquid cooled,” Watt said. Not only is it hard to retrofit existing data centers to meet these demands, but the queue for new grid interconnections in Europe can stretch from seven to 10 years.
“So that implies that they’ll have to do more with what they’ve got and the data center infrastructure that they’ve got, which is why I think vLLM CPU is so interesting,” Watt said.
National sovereignty vs. regional sovereignty?
Speaking of sovereign AI in Europe, Watt said geopolitical pressures and the risk of critical data center infrastructure becoming a target for adversaries have created an incentive to eschew national sovereignty for regional sovereignty.
If that happens, the most likely place to plunk new data centers would be in the Nordics. But some of those countries themselves border countries like Russia.
It’s not yet clear how this will all play out, but it is certainly a trend worth watching.