OpenAI has partnered with Cerebras to add 750 MW of ultra-low-latency AI compute to its platform. The partnership aims to strengthen OpenAI’s inference infrastructure by integrating Cerebras’ purpose-built AI systems, which are designed for faster response times.
Cerebras has developed AI hardware that combines compute, memory and bandwidth on a single large chip, reducing bottlenecks that typically slow inference on conventional systems. The integration is positioned to support long outputs and real-time responses across a range of AI workloads.
OpenAI says the additional low-latency capacity is intended to improve how its models respond to complex tasks such as answering detailed questions, generating code, creating images and running AI agents. Faster response times are linked to higher user engagement and the ability to support more demanding, real-time workloads.
The low-latency compute capacity will be integrated into OpenAI’s inference stack in phases, with expansion across workloads planned over time. The rollout is scheduled to take place in multiple tranches extending through 2028.
“OpenAI’s compute strategy is to build a resilient portfolio that matches the right systems to the right workloads. Cerebras adds a dedicated low-latency inference solution to our platform. That means faster responses, more natural interactions, and a stronger foundation to scale real-time AI to many more people,” said Sachin Katti of OpenAI.
“We are delighted to partner with OpenAI, bringing the world’s leading AI models to the world’s fastest AI processor. Just as broadband transformed the internet, real-time inference will transform AI, enabling entirely new ways to build and interact with AI models,” said Andrew Feldman, co-founder and CEO of Cerebras.