- Infrastructure processing units (IPUs) are alternatives to DPUs
- These kinds of chips will be key to freeing up CPU capacity as inferencing picks up
- IPUs could soon become relevant for telcos looking to deploy edge compute
You already know GPUs, CPUs and custom chips like Google’s TPUs are the fuel behind AI’s fire. Those paying close attention may also know about LPUs, like those from Groq, which are designed specifically for inferencing workloads at scale. But there’s one more key kind of chip emerging in the AI era: the IPU.
Introduced by Intel in 2021, infrastructure processing units (IPUs) are a type of SmartNIC designed to handle infrastructure and networking functions. Specifically, HyperFRAME Research VP Ron Westfall told Fierce IPUs are critical for “offloading networking, storage and security tasks from CPUs,” freeing them up for other tasks like data pre-processing and AI workflow orchestration.
Intel’s IPUs are alternatives to the data processing units (DPUs) offered by Nvidia and AMD, Westfall added.
What’s new with IPUs?
IPUs came into focus this week when Google and Intel announced an expanded chip collaboration covering not just CPUs but also custom IPUs.
“The modern data center needs to have CPUs to do much of the processing that goes on around the accelerators, and that’s increasingly important as we move to AI inference from training,” J. Gold Associates Founder and Principal Analyst Jack Gold wrote in a note to investors. Hence, the need for IPUs to give CPUs some breathing room.
According to Gold, managing interconnect, storage and power is critical to cloud efficiency. And to optimize that efficiency, hyperscalers need IPUs tailored to their specific data center environments.
That’s exactly what Google is planning to work on with Intel.
The telco angle
Why does this matter for telcos? As Gold told Fierce: “Most current telco infrastructure is not hugely compute bound as main functions are not yet really enabled with AI. But that is coming and it will be an issue in the future as core networks become much more compute intense, especially as they start running AI models, like T-Mo[bile] is doing with Live Translate.”
Westfall noted the combination of Intel’s IPUs and Google’s TPU inferencing platform “can meet emerging telco edge low latency demands related to 5G Open RAN and edge.”
He added the Google/Intel collaboration can also serve as a counter to AWS Nitro and Azure Pensando implementations “to give telcos more hybrid multicloud flexibility and vendor-neutral flexibility.”