A new GPT-4o advanced voice demo from OpenAI is available, and it can teach you a language.
Although OpenAI has stated that ChatGPT won’t receive its sophisticated voice functionality until later this year, it has kept us updated on what to expect. The latest demo shows off GPT-4o’s remarkable linguistic abilities by having it teach users Portuguese.
OpenAI first presented GPT-4o’s remarkable expanded voice capabilities at its spring update earlier this year. The company also unveiled a few screen-sharing and vision features, which we now know won’t be available until much later this year, or perhaps early next year.
GPT-4o’s capacity to act as a live translation tool was one of its main selling points in the initial demo, but some of the new demos are beginning to show that it can also be an amazing language instructor. This is something I’ve experienced for myself, to a lesser degree, with the current voice model.
In a recent OpenAI video, a Spanish speaker with rudimentary Portuguese and a native English speaker trying to learn the language both used ChatGPT to advance their abilities. At various points they ask it to slow down or explain a word, and it responds flawlessly each time.
GPT-4o for language learning
The wonderful thing about the new GPT-4o advanced voice mode is that speech-to-speech functionality is built right in. Unlike earlier models, which had to first convert speech into text and then convert the text response back into speech, this model simply understands what you’re saying natively.
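To make that architectural difference concrete, here is a minimal Python sketch contrasting the older cascaded pipeline with a native speech-to-speech model. Every function name below is a hypothetical placeholder for illustration, not a real OpenAI API call.

```python
# Conceptual sketch only: all functions are hypothetical placeholders.

def transcribe(audio: bytes) -> str:
    """Speech-to-text stage (placeholder). Tone, accent, and pacing
    are discarded here, because only the words survive as text."""
    return audio.decode(errors="ignore")

def generate_reply(text: str) -> str:
    """Text-only language model stage (placeholder)."""
    return f"reply to: {text}"

def synthesize(text: str) -> bytes:
    """Text-to-speech stage (placeholder)."""
    return text.encode()

def cascaded_pipeline(audio: bytes) -> bytes:
    """Older approach: three separate hops. Each hop adds latency, and
    everything a transcript cannot capture is lost at the first step."""
    return synthesize(generate_reply(transcribe(audio)))

def native_speech_model(audio: bytes) -> bytes:
    """GPT-4o-style approach, conceptually: a single model maps audio
    directly to audio, so acoustic detail survives end to end."""
    return b"spoken " + audio  # stand-in for one audio-to-audio model call
```

The key design point is that the cascaded version can only ever reason about the transcript, while a single audio-to-audio model keeps access to how something was said, not just what was said.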
Being able to comprehend speech and audio naturally opens up a number of fascinating possibilities, such as the potential to operate in numerous languages, adopt different accents, or alter the tempo, tone, and vibrancy of a voice—essentially making it the ideal instructor.
Because of its native speech capabilities, it can listen to what you say and assess your accent, word choice, and delivery of specific phrases. It can then give feedback based directly on what it heard, rather than on a transcript.
On top of all this, GPT-4o also has strong reasoning and problem-solving skills, which let it spot errors that are less obvious.
What else have we seen from GPT-4o?
The new advanced voice features have been demoed numerous times, in some cases with capabilities never intended for public release. One demo shows it producing sound effects while narrating a story; another shows it speaking in several distinct voices.
We’ve also seen it deployed as a math teacher in OpenAI’s official YouTube broadcasts. In that video, the AI offers guidance on every facet of a math problem as the user works through it on an iPad with screen sharing enabled.
Advanced voice mode, and natural speech understanding in particular, looks like one of the biggest advances in artificial intelligence since OpenAI put a chat interface on its GPT-3.5 model in November 2022.