OpenAI just released ChatGPT agent, a new tool designed to actively complete work for you rather than just answer questions. ChatGPT agent can take actions on your behalf: it can browse websites and complete multi-step tasks from start to finish. Compared with OpenAI’s Operator, it’s more reliable, and it incorporates the best aspects of Deep Research. As a programmer who has been working in web automation for the past few years, I couldn’t resist the temptation to put it to the test, and here’s my honest review.
ChatGPT Agent Use Cases

ChatGPT agent is an evolution of Operator and Deep Research. Operator is an AI tool designed to browse the web and visually interact with websites, while Deep Research focuses on analyzing information deeply to generate comprehensive reports. ChatGPT agent can do both. It can seamlessly switch between reasoning and action: it knows when to stop and think and when to click or type something to move a task forward. The main use cases of ChatGPT agent are:
I’m going on vacation soon, so I asked ChatGPT agent to find the best restaurants in the city I’m visiting and create a new saved list on Google Maps, including notes about the signature dishes to try at each place.

The video is sped up; the task actually took ChatGPT agent about 23 minutes to finish. First, ChatGPT agent searches for the best restaurants in the city, selects 10 of them, and collects information about their signature dishes. Then it pauses and asks me to take control to enter my Google login credentials. After that, it creates a list on Google Maps with notes for each restaurant and saves it.

All of this is amazing! That said, ChatGPT agent isn’t ideal for every type of task. Let’s take a look at what it does well and where it falls short.

ChatGPT Agent: Where it shines and disappoints

OpenAI’s demos emphasize the idea of users giving a task to the agent, then stepping away to focus on other tasks while the agent completes the work independently. While ChatGPT agent can often operate independently, it occasionally requires the user to be present at their computer. Whether it's to enter login credentials, solve captchas manually, or confirm a critical action, user involvement is sometimes necessary.

Other times, the agent doesn’t need you to take control, but it may take so long to perform a simple action, such as spending several minutes failing to select an item from a dropdown, that you end up stepping in to help move the automation forward.

Below is an example of this. I asked ChatGPT agent to go to a real estate site and search for homes to buy in Marbella, Spain, within a price range of €100k to €400k. It was doing a good job, but suddenly it failed to locate the maximum price in the dropdown. After 14 minutes, I took control of the site and did it myself.