On Thursday, OpenAI launched ChatGPT Agent, a new feature that lets the company’s AI assistant complete multi-step tasks by controlling its own web browser. The update merges capabilities from OpenAI’s earlier Operator tool and the Deep Research feature, allowing ChatGPT to navigate websites, run code, and create documents while users maintain control over the process.
The feature marks OpenAI’s latest entry into what the tech industry calls “agentic AI”: systems that can take autonomous multi-step actions on behalf of the user. OpenAI says users can ask Agent to handle requests like assembling and purchasing a clothing outfit for a particular occasion, creating PowerPoint slide decks, planning meals, or updating financial spreadsheets with new data.
The system uses a combination of web browsers, terminal access, and API connections to complete these tasks, including “ChatGPT Connectors” that integrate with apps like Gmail and GitHub.
While using Agent, users watch a window inside the ChatGPT interface that shows all of the AI’s actions taking place inside its own private sandbox. This sandbox features its own virtual operating system and web browser with access to the real Internet; it does not control your personal device. “ChatGPT carries out these tasks using its own virtual computer,” OpenAI writes, “fluidly shifting between reasoning and action to handle complex workflows from start to finish, all based on your instructions.”
Like Operator before it, the agent feature requires user permission before taking certain actions with real-world consequences, such as making purchases. Users can interrupt tasks at any point, take control of the browser, or stop operations entirely. The system also includes a “Watch Mode” for tasks like sending emails that require active user oversight.
Since Agent surpasses Operator in capability, OpenAI says the company’s earlier Operator preview site will remain functional for a few more weeks before being shut down.
Performance claims
OpenAI’s claims are one thing, but how well the company’s new AI agent will actually complete multi-step tasks will vary wildly depending on the situation. That’s because the AI model isn’t a complete form of problem-solving intelligence, but rather a complex master imitator. It has some flexibility in piecing a scenario together but also many blind spots. OpenAI trained the agent (and its constituent components) using examples of computer usage and tool usage; whatever falls outside of the examples absorbed from training data will likely still prove difficult to accomplish.

