OpenAI Announces GPT-4o and More at Spring Update Event

At a special live Spring Update event today, OpenAI announced a new flagship language model, a Mac client, and more free tools for ChatGPT.

“It is so important to us to have a product that we can make truly available and broadly available to everyone,” OpenAI CTO Mira Murati said at the start of the event. “And we are always trying to find ways to reduce friction, so everyone can use ChatGPT wherever they are.”

Among the announcements, OpenAI is releasing a desktop version of ChatGPT for Plus users on macOS today that works with a Spotlight search-like UI and keyboard shortcut. (A Windows version is coming later this year.) It is also adding more features to the free version of ChatGPT, its AI chatbot: GPT-4-level intelligence, multi-modal responses, data analysis and chart creation, photo chats, file upload (for summarizing, writing, and analysis), custom GPT support through the GPT Store, and access to Memory, which lets the chatbot remember previous conversations and use them for context.

But the biggest news, of course, is the introduction of GPT-4o, OpenAI’s new flagship large language model (LLM). According to the firm, GPT-4o provides GPT-4-level intelligence but with improved text, audio, and vision capabilities and much faster performance. (The “o” in GPT-4o stands for “omni.”)

“For the past couple of years, we have been very focused on improving the intelligence of these models, and they’ve gotten pretty good,” Ms. Murati said. “But this is the first time that we are really making a huge step forward when it comes to the ease of use. This is incredibly important because we are looking at the future of interaction between ourselves and the machines. And we think that GPT-4o is really shifting the paradigm into the future of collaboration. It is natural and far easier.”

GPT-4o can accept any combination of text, audio, and image inputs, and it can generate any combination of text, audio, and image outputs. The goal is real-time communication: this initial release can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human conversational response times. From a language perspective, GPT-4o matches the performance of GPT-4 Turbo on English text and software code, but it is significantly better, and faster, with non-English text, and it is 50 percent cheaper for developers to use via the API. GPT-4o is “especially better” at vision and audio understanding compared to existing models, OpenAI says.
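For developers, the model is addressed by name through the API. As a rough illustration of the multi-modal input described above, here is a minimal sketch using the official OpenAI Python SDK to send a combined text-and-image request to GPT-4o; the chart.png file and the prompt are hypothetical placeholders:

```python
# Minimal sketch: calling GPT-4o via the official OpenAI Python SDK.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set
# in the environment; chart.png is a hypothetical placeholder image.
import base64

from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

# Encode a local image so it can be sent inline as a data URL.
with open("chart.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize what this chart shows."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```

Text-only requests work the same way with a plain string as the message content; only the model name changes for developers already using GPT-4 Turbo.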

If you’re interested in this technology, I recommend watching a replay of the event’s live stream. There were a few glitches, as one might expect of a live technology demo, but it was truly impressive, with the AI responding to questions in a variety of voices and emotions, singing, solving math problems, describing a person’s emotions, and explaining a complex code base.

GPT-4o is rolling out to ChatGPT Plus and Team users now, and it will come to Enterprise users soon, OpenAI says. It’s also rolling out to ChatGPT Free users today, albeit with usage limits: The message limit for Plus users is up to five times greater than it is for Free users, and Team and Enterprise users will have even higher limits.

You can learn more about GPT-4o on the OpenAI website.
