OpenAI unveils three audio models for real-time voice tasks

May 7, 2026

OpenAI introduced three new audio models for its developer platform on Thursday, aiming to make voice-based software agents more conversational and capable of completing tasks in real time.

The launch of the application programming interface (API) moves the ChatGPT maker beyond transcription and chat toward agents that can listen, translate, and act during live conversations.

The new models, GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper, are now available for testing in the company’s developer playground.

GPT-Realtime-2 is designed to handle more complex requests, call external tools, manage interruptions, and maintain context across longer voice sessions.

GPT-Realtime-Translate supports translation from more than 70 languages into 13 output languages, targeting use cases such as customer support, education, and other real-time communication scenarios.

GPT-Realtime-Whisper provides live speech-to-text capabilities, enabling captions, meeting notes, and workflow updates to be generated as a person speaks.

Companies testing the models include online real estate marketplace Zillow, travel platform Priceline, and European telecommunications firm Deutsche Telekom.

Pricing for GPT-Realtime-2 starts at $32 per million audio input tokens, while GPT-Realtime-Translate costs $0.034 per minute and GPT-Realtime-Whisper is priced at $0.017 per minute.

Emirates 24/7
