When OpenAI held its spring launch event in May, one of the highlights was the demonstration of ChatGPT's new Voice Mode, supercharged with GPT-4o's new video and audio capabilities. The long-awaited new Voice Mode is finally here (sort of).
Also: Best AI Chatbots of 2024: ChatGPT, Copilot, and Valuable Alternatives
On Tuesday, OpenAI announced in a post on X that the startup was rolling out Voice Mode in alpha to a small group of ChatGPT Plus users, offering them a smarter voice assistant that can be interrupted and can respond to their emotions.
We are starting to roll out Advanced Voice Mode to a small group of ChatGPT Plus users. Advanced Voice Mode offers more natural, real-time conversations, lets you interrupt at any time, and detects and responds to your emotions. pic.twitter.com/64O94EhhXK
— OpenAI (@OpenAI) July 30, 2024
If you're participating in the alpha, you'll receive an email with instructions and a message in the mobile app, as shown in the video above. If you haven't received a notification yet, don't worry: OpenAI says it will continue to add users on a rolling basis, with plans for all ChatGPT Plus subscribers to gain access to the feature in the fall.
In the original demo at the launch event, shown below, the company showed off Voice Mode’s multimodal capabilities, including assisting with content on users’ screens and using the user’s phone camera as context for a response.
Unfortunately, the alpha version of Voice Mode won't have these features. OpenAI shared that "video and screen sharing capabilities will be released at a later date." The startup also said that it has improved the quality and safety of voice conversations since the technology was originally demonstrated.
OpenAI tested the voice capabilities with more than 100 external red teamers across 45 languages, according to the X thread. The startup also trained the model to speak in only four preset voices, built systems to block outputs that deviate from those designated voices, and implemented guardrails to block certain requests.
The startup also said it will take user feedback into account to further improve the model and will share a detailed report on GPT-4o’s performance, including limitations and safety assessments, in August.
Also: Google’s next-generation AI tools help you hyper-target your ad campaigns
You can become a ChatGPT Plus subscriber for $20 per month. Other membership benefits include advanced data analysis features, image generation, and priority access to GPT-4o.
A week after OpenAI introduced this feature, Google introduced a similar one called Gemini Live, which is not yet available to users. That may change soon at the Made by Google event coming in a few weeks.