Computer-use agents: Because clicking is hard

Today: OpenAI unveils its take on AI agents that promise to take all the drudgery out of using a computer, more on the massive Project Stargate circus, and this week's enterprise moves.

Operators from the early days of telephones sit in front of a bank of connections, helping place calls.
You can keep the dime. (Credit: JoneMarkuleta, CC BY-SA 4.0)

Welcome to Runtime!

(Was this email forwarded to you? Sign up here to get Runtime each week.)


Isn't that the way they say it goes?

Almost all the breakthroughs in computing over the last fifty years can be described as abstractions: increasingly simple ways to tap into computing power, as shown by advances like the graphical user interface, the web browser, and the app store. The merchants of today's computers are convinced that AI agents are the next major abstraction on that timeline, betting we'll pay for on-demand personal assistants for even the most mundane of tasks.

OpenAI introduced Operator on Thursday, a computer-use agent (CUA) that "can be asked to handle a wide variety of repetitive browser tasks such as filling out forms, ordering groceries, and even creating memes." It's a slightly different take on a similar concept that Anthropic introduced in Claude last year.

  • While Claude's agent actually takes over your computer, Operator is a separate website where users enter a prompt to execute a task, such as the evergreen generative AI example of booking a trip.
  • In response to the prompt, Operator launches a virtual browser that goes about the business of checking flights, hotels, and other travel necessities while the user watches, which is the worst-ever idea for a Twitch channel.
  • "Combining GPT-4o's vision capabilities with advanced reasoning through reinforcement learning, CUA is trained to interact with graphical user interfaces (GUIs)—the buttons, menus, and text fields people see on a screen," OpenAI said in a blog post.
  • And when something inevitably goes wrong with this "research preview," Operator prompts the user to intervene and get things back on track.

As you may have noticed, enterprise software companies are extremely eager to help customers build AI agents that their customers can interact with instead of talking to actual people, who need some form of compensation and health care and things like that. OpenAI said it is working with companies like DoorDash, Instacart, and Priceline to optimize their sites for Operator.

  • "Operator transforms AI from a passive tool to an active participant in the digital ecosystem. It will streamline tasks for users and bring the benefits of agents to companies that want innovative customer experiences and desire higher rates of conversion," OpenAI said.
  • At some point the company plans to allow third-party developers to access an API in order to build Operator's capabilities into their own sites.
  • However, according to Every (which quoted a different Jim Croce lyric), "in this research preview mode, Operator is also blocked by OpenAI from accessing certain resource-intensive sites like Figma or competitor-owned sites like YouTube for performance or legal reasons," suggesting you might need to be in OpenAI's good graces to use the service in your own apps.

Needless to say, nobody has any idea whether computer-use agents like Operator will work reliably at scale. Right now, the service is only available to OpenAI customers who pay $200 a month for its Pro tier of services.

  • If it does actually work, there could be several enterprise applications that make sense, such as logistics, financial analysis, or generating business-intelligence reports.
  • But the concept of agentic AI is just getting off the ground, far behind the early efforts to build generative-AI applications that many enterprises are still struggling to get into production.
  • Best-case scenario, computer-use agents like Operator are just the latest in a long line of abstractions that have made technology easier to use.
  • Worst-case scenario, they're the equivalent of one of those vegetable choppers you'll find on TikTok: they work, kind of, but are not something serious people use when cooking.

Check out this month's edition of the Runtime Roundtable, which asked our panelists to answer a question near the top of almost every engineering leader's mind: What are the best ways to implement AI tools like agents in the software development process? As always, thanks to our panel of experts for helping their fellow tech leaders figure out the best course of action for their teams, and special thanks to Heroku from Salesforce for sponsoring this month's edition.

If you're interested in sponsoring a future edition of the Runtime Roundtable, please contact us here.


Things will not calm down

It turns out that $500 billion is a number that will get people buzzing, regardless of how serious a number it really is. OpenAI's Project Stargate has been the talk of the enterprise infrastructure world this week, and there were several details that didn't come up during the remarks made by the Fabulist Four assembled in the Roosevelt Room on Tuesday.

Crusoe, recently profiled in Runtime, is already building the first Project Stargate data center in Texas on Oracle's behalf, according to Reuters. Last year The Information reported that Crusoe was working on a data center for Oracle, which means that, despite what OpenAI CEO Sam Altman implied during Tuesday's press conference, plans for Project Stargate were well underway before President Trump won last year's election.

And Altman nemesis Elon Musk, who is working for Trump in some capacity, lent whatever is left of his credibility to the sentiment that the $500 billion the group plans to invest in AI infrastructure is a mirage. "Another Republican close to the White House went further, saying Trump’s staff is 'furious' over Musk using his massive social media platform to pour cold water on the infrastructure deal that Trump called 'tremendous' and 'monumental' just a day prior," former Protocol owner Politico reported. We're not even a full week into this term.


Enterprise moves

Matthew Fitzpatrick is the new CEO of Invisible Technologies, joining the AI consulting company from McKinsey.

Todd Persen is the new CTO at Hydrolix, with co-founder Hasan Alayli moving to the chief scientist role.

Nancy Hensley is the new chief product officer at EDB, joining the Postgres vendor after operational roles at Stats Perform and IBM.

Ramp made several executive announcements this week, promoting Will Petrie to CFO, Geoff Charles to chief product officer, and Nik Koblov to executive vice president of engineering.

Deepak Agarwal is the new chief AI officer at LinkedIn, returning to the company in an AI leadership position after several years at Pinterest.

Mike Pyle is the new chief revenue officer at Cribl, following similar roles at GitLab and Heroku from Salesforce.


The Runtime roundup

The Trump administration shut down an inquiry into the Salt Typhoon hack of basically the entire U.S. communication network by Chinese state actors, which… sure, fine, whatever.

Twilio's stock rose more than 10% in after-hours trading after it told analysts to expect a better profit forecast for the next several years.


Thanks for reading — see you Saturday!
