OpenAI Announces Operator AI Agent in Preview

OpenAI today announced Operator, its take on an AI agent that can act independently and complete tasks on your behalf using its own browser. It’s available now in preview for those with a ChatGPT Pro subscription, but will expand soon to Plus, Team, and Enterprise users before being integrated directly into ChatGPT.

“Operator is an agent that uses its own browser to go to the web to perform tasks for you,” the company explains. “It is currently a research preview, meaning it has limitations and will evolve based on user feedback. Operator is one of our first agents, which are AIs capable of doing work for you independently–you give it a task and it will execute it.”

Operator’s availability is quite limited right now: For those unfamiliar, ChatGPT Pro is the $200-per-month ChatGPT offering that’s aimed at researchers, engineers, and others with extreme AI needs. But Operator looks intriguing. Indeed, it’s curiously reminiscent of the radical new web browser that Arc maker The Browser Company previewed back in October.

According to OpenAI, Operator can perform a wide range of web-based tasks–like filling out forms, ordering groceries, creating memes, and more–on your behalf. The company hopes that by starting with a small audience, it can learn from its users and from partners–DoorDash, Instacart, OpenTable, Priceline, StubHub, Thumbtack, Uber, and others are onboard–how to best make AI agents useful for customers and businesses alike.

Operator is based on a technology called Computer-Using Agent (CUA) that combines GPT-4o’s vision capabilities with advanced reasoning developed through reinforcement learning. It’s trained to interact with the graphical interface elements found on the web and then take action without requiring custom browser integrations. That is, it “sees” the web using screenshots and then it interacts by simulating mouse clicks and keyboard typing. When it encounters problems or makes mistakes, CUA can self-correct. And if that doesn’t work, it will return to the user for help.
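To make that screenshot-then-act cycle concrete, here is a minimal Python sketch of the general pattern. It is not OpenAI's implementation: every name in it (Action, FakeBrowser, run_agent, scripted_policy) is hypothetical, the "browser" only records what it is asked to do, and a hard-coded script stands in for the model. The point is the shape of the loop: look at the screen, pick one action, perform it, look again, and hand control back to the user when stuck.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Action:
    """One step chosen by the (hypothetical) model."""
    kind: str        # "click", "type", "done", or "ask_user"
    x: int = 0
    y: int = 0
    text: str = ""


class FakeBrowser:
    """Stand-in for a real browser: it records actions instead of performing them."""

    def __init__(self) -> None:
        self.log: List[Tuple] = []

    def screenshot(self) -> bytes:
        # A real agent would capture pixels; here the "image" just encodes progress.
        return f"screen-after-{len(self.log)}-actions".encode()

    def click(self, x: int, y: int) -> None:
        self.log.append(("click", x, y))

    def type_text(self, text: str) -> None:
        self.log.append(("type", text))


def run_agent(
    task: str,
    browser: FakeBrowser,
    choose_action: Callable[[str, bytes], Action],
    max_steps: int = 10,
) -> str:
    """Observe, act, repeat. Returns "done" or "needs_user"."""
    for _ in range(max_steps):
        image = browser.screenshot()          # "see" the page
        action = choose_action(task, image)   # assumed model call
        if action.kind == "click":
            browser.click(action.x, action.y)
        elif action.kind == "type":
            browser.type_text(action.text)
        elif action.kind == "done":
            return "done"
        else:
            return "needs_user"               # model asked for help
    return "needs_user"                       # step budget exhausted


def scripted_policy(task: str, image: bytes) -> Action:
    """Toy replacement for the model: picks the next step from a fixed script."""
    step = int(image.decode().split("-")[2])
    script = [Action("click", 100, 200), Action("type", text="milk"), Action("done")]
    return script[step] if step < len(script) else Action("ask_user")


if __name__ == "__main__":
    browser = FakeBrowser()
    print(run_agent("buy milk", browser, scripted_policy))
    print(browser.log)
```

Because the loop takes a fresh screenshot after every action, the policy always sees the result of its last step, which is what would give a real model the chance to notice a mistake and correct it. The step budget is one simple way to model the "return to the user for help" fallback.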

As with ChatGPT, you interact with Operator using a text-based prompt. It supports third-party partner apps, so you might ask the Instacart app via prompt to search specific websites for a recipe and then, after you specify which ingredients you already have, order the rest for delivery. It can run multiple tasks simultaneously in separate conversations. And the collaboration with outside partners means its capabilities will only grow over time.

Over time, OpenAI plans to expose the CUA model powering Operator as an API developers can use to build their own agents. It will improve Operator to handle longer and more complex workflows. And it will expand Operator to other ChatGPT customers, as noted.

You can learn more about Operator on the OpenAI website.

Thurrott