OpenAI Launches ChatGPT Agent to Handle Multi‑Step Tasks

OpenAI unveiled its new ChatGPT Agent on July 17, 2025. The tool lets paid subscribers hand the AI multi-step tasks such as planning events, managing schedules, researching topics, and even making purchases, all of which it carries out on its own secure virtual computer. Users stay in control: the agent requests approval before taking consequential actions and can be paused or interrupted at will.

The ChatGPT Agent merges two earlier tools into a single interface: Operator, which enabled web browsing and form-filling, and Deep Research, which supported in-depth analysis. Available now for Pro, Plus, and Team users, the agent can be activated via a dropdown or by entering /agent in chat.

To highlight its capabilities, OpenAI demonstrated several use cases. One involved preparing for a wedding: the agent searched for accommodations, checked the weather, selected an outfit, pitched gift ideas, and secured reservations, all with minimal user interaction. In another scenario, the agent was asked to analyze competitors and produce a presentation; it pulled data from the web and built an editable slide deck.

Despite its abilities, reviewers say the agent remains cautious and sometimes unreliable. The Verge reported that while it can research recipes and simulate adding items to a cart, it frequently fails to complete transactions due to limited permissions. PC Gamer echoed that sentiment, quoting OpenAI CEO Sam Altman, who described the tool as useful for routine tasks but said users should avoid relying on it for high-stakes decisions.

Security measures are central to the design. The agent operates in a “watch mode” that pauses actions when users look away, imposes restrictions on financial transactions, and disables memory to limit data exposure. A specialized classifier monitors for potential misuse in sensitive areas like biology or finance.

Benchmark tests suggest the agent outperforms earlier models. OpenAI claims it scores 41.6 percent on “Humanity’s Last Exam” and 27.4 percent on the challenging FrontierMath benchmark, surpassing prior o3‑based models.

Performance on SpreadsheetBench also reportedly exceeds that of Microsoft’s Copilot in Excel.

Early access is now live: Pro subscribers receive 400 agent-driven messages monthly, while Plus and Team users receive 40. Enterprise and Education customers will gain access later, and users in the European Economic Area and Switzerland will also need to wait.

Analysts view the tool as OpenAI’s most practical agent yet. Reuters noted its alignment with moves by Microsoft, Salesforce, and Oracle to roll out similar technologies aimed at productivity and cost reduction.

As adoption grows in sectors like travel, finance, and business workflows, the ChatGPT Agent will face tests of its reliability, privacy, and compliance.

But by combining browsing, coding, and analysis capabilities into a single, user‑supervised assistant, OpenAI is offering a more hands‑on version of AI assistance.