OpenAI gives ChatGPT new powers to see, hear
OpenAI on Monday released a higher performing and even more human-like version of the artificial intelligence technology that underpins its popular generative tool ChatGPT, making it free to all users.
The update to OpenAI's flagship product landed a day before Google is expected to make its own announcements about Gemini, the search engine giant's own AI tool that competes with ChatGPT head on.
"We're very, very excited to bring GPT-4o to all of our free users out there," Chief Technology Officer Mira Murati said at the highly anticipated launch event in San Francisco.
The new model GPT-4o -- the "O" stands for omni -- will be rolled out in OpenAI's products over the next few weeks, the company said, with paid customers having unlimited access to the tool.
The company said the model could generate content or understand commands in voice, text, or images.
"The new voice (and video) mode is the best computer interface I've ever used. It feels like AI from the movies," said OpenAI CEO Sam Altman in a blog post.
Altman has previously pointed to the Scarlett Johansson character in the movie "Her" as an inspiration for where he would like AI interactions to go.
"Talking to a computer has never felt really natural for me; now it does," he added.
Murati and engineers from OpenAI demonstrated the new powers of GPT-4o at the virtual event, posing challenges to the beefed-up version of the ChatGPT chatbot.
The demo mainly featured OpenAI staff members asking questions to the voiced ChatGPT, which responded with jokes and human-like banter.
The bot served as an interpreter from English to Italian, interpreted facial expressions and walked one user through a difficult algebra problem.
The company said that GPT-4o had the same powers as the previous version when it came to text, reasoning, and coding intelligence, and set new industry standards for multilingual conversations, audio, and vision.
In one demonstration, ChatGPT successfully interpreted an employee's surroundings through a smartphone camera, speaking in a friendly, feminine voice, not unlike the AI bot in the film "Her".
"Hmmm from what I can see it looks like you're in some kind of recording or production set-up with lights, tripods… you might be gearing up to shoot a video or make an announcement?" the ChatGPT bot said.
- 'Take our time' -
Anticipation was high in recent weeks that OpenAI would release an AI-amped version of an online search tool to compete with Google search engine, but on Friday Altman said this would not be the case.
Observers were also waiting for the launch of GPT-5, but Altman said last week that his company would "take our time on releases of major new models."
The event is just the latest episode in the AI arms race that has seen OpenAI-backer Microsoft surpass Apple as the world's biggest company by market capitalization.
OpenAI and Microsoft are in a heated rivalry with Google to be generative AI's major player, but Facebook-owner Meta and upstart Anthropic are also making big moves to compete.
All the companies are scrambling to come up with ways to cover generative AI's exorbitant costs, much of which goes to chip giant Nvidia and its powerful GPU semiconductors.
Making the new model available to all users may raise questions about OpenAI's path to monetization amid doubts that everyday users are ready to pay a subscription.
Until now, only lower performing versions of OpenAI or Google's chatbots were available to customers for free.
"We are a business and will find plenty of things to charge for," Altman said on his blog.
The AI makers are also feeling pressure from publishers and creators, who are demanding payment for any content used to train the models.
OpenAI has signed content partnerships with the Associated Press, the Financial Times and Axel Springer, but is also caught in a major lawsuit with The New York Times.
AI companies have also been confronted with separate lawsuits from artists, musicians, and authors in US courtrooms.
M.Vargas--ESF