Ready or not, here come the AI agents

No one really agrees on a strict definition of "agent," but recent breakthroughs in large language models have allowed companies to build enhanced versions of chatbots that can respond to natural-language queries with a plan of action.


A computer is very good at executing a task when you tell it exactly what you want it to do and exactly how you want it done. Until now, when businesses have needed to understand the more ambiguous needs of their customers, they've turned to humans to get the job done, which is exactly the problem that the AI merchants of our time were determined to solve in September.

Nearly a dozen enterprise tech companies announced plans for AI agents last month, expanding two years of investments in generative AI technology in search of a winning formula. Salesforce, Microsoft, Workday, ServiceNow, Accenture and several others all embraced the notion of the agent as the missing piece of the AI puzzle: "We're going to do not only the largest deployment of agents, but the best possible agents you could possibly have," said Salesforce CEO Marc Benioff during a September press conference, as determined as ever to crank the hyperbole to 11.

No one really agrees on a strict definition of "agent," which of course allows the puffery to flow, but recent breakthroughs in large language models have allowed companies to build enhanced versions of chatbots that can respond to natural-language queries with a plan of action.

Many of the agents introduced last month build on the "reasoning" capabilities of new LLMs like OpenAI's o1 models. The term is as annoyingly anthropomorphized as everything else in this industry. It describes how applications built around these models can take a new piece of information, evaluate it against a massive pool of old data, determine how it relates to a predetermined list of tasks they have been empowered to execute, and select the best path forward.
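To make that selection step concrete, here is a minimal Python sketch of an agent choosing from a predetermined list of permitted actions. It is an illustration only, not how o1 or any vendor's product works: the action names and keyword lists are hypothetical, and simple keyword overlap stands in for the judgment a real agent would get from an LLM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    keywords: frozenset  # hypothetical trigger words; a real agent would ask a model


# The predetermined list of tasks the agent may execute (all names are made up).
ALLOWED_ACTIONS = [
    Action("issue_refund", frozenset({"refund", "money", "charged"})),
    Action("start_return", frozenset({"return", "exchange", "fit"})),
    Action("escalate_to_human", frozenset()),  # fallback when nothing matches
]


def select_action(message: str, actions=ALLOWED_ACTIONS) -> Action:
    """Pick the allowed action whose keywords best overlap the message."""
    cleaned = message.lower().replace(",", " ").replace(".", " ")
    words = set(cleaned.split())
    best = max(actions, key=lambda a: len(a.keywords & words))
    if not best.keywords & words:
        # No action matched at all, so hand off instead of guessing.
        return actions[-1]
    return best


if __name__ == "__main__":
    print(select_action("I want to return this sweater, it doesn't fit").name)
```

The point of the sketch is the constraint: whatever the model concludes, it can only pick from the list it was given, and it falls back to a human when nothing fits.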

"Planning and reasoning is very hard," said Ashok Srivastava, chief data officer at Intuit. "It's hard for humans to do, it's much harder for machines to do."

It's also expensive, which is why vendors and AI enthusiasts in the finance department are so eager to find a tool that can inject good-enough reasoning and execution into business tasks like customer support or cash-flow management, which would theoretically allow them to process more tasks with fewer people. It will take a long time before that goal is reached, but dozens of companies are placing bets right now that agents will be the breakthrough promised by the generative AI hype cycle.

"Right now, the vast majority of the world has never heard of [Microsoft] Copilot or used a copilot, so it will take years for people to get comfortable and for the underlying technology to get better," said Nicholas Holland, vice president of product for HubSpot. "Agents are even further behind on that, but we will see a multiyear arc to where that becomes the norm as well."

Not so secret

Agents can be thought of as the second generation of the robotic process automation (RPA) movement, which promised to automate the business processes all companies need to function on a day-to-day basis. A classic example of RPA was the invoice-processing bot, and a graphic produced by Menlo Ventures illustrates the number of steps involved in a typical workflow.

But RPA systems fell down when it came to handling unstructured or poorly structured data, and developed a maintenance-heavy reputation as a result. Generative AI, on the other hand, was designed to work with unstructured data, even if it needs tools like retrieval-augmented generation (RAG) to reduce the number of errors it tends to make when processing that data.
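A small Python sketch shows why scripted automation is brittle. It assumes a hypothetical bot that expects invoices in one fixed two-line layout; the pattern and field names are invented for illustration and do not come from any real RPA product.

```python
import re

# Hypothetical fixed layout a scripted bot might expect:
#   Invoice: <digits>
#   Total: $<amount>
INVOICE_PATTERN = re.compile(
    r"^Invoice: (?P<number>\d+)\nTotal: \$(?P<total>\d+\.\d{2})$"
)


def parse_invoice(text: str):
    """Return the invoice fields, or None if the text deviates from the layout."""
    match = INVOICE_PATTERN.match(text)
    if match is None:
        return None  # the scripted rule has no way to recover
    return {"number": match["number"], "total": float(match["total"])}


if __name__ == "__main__":
    # Matches the expected layout exactly.
    print(parse_invoice("Invoice: 1042\nTotal: $99.50"))
    # Same facts in free-form prose: the rule finds nothing.
    print(parse_invoice("Hi, invoice 1042 came to $99.50, thanks!"))
```

Every new layout means another hand-written rule, which is where the maintenance burden comes from.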

"We started with scripted workflows, we have RPA workflows, we have conversational workflows," said Dorit Zilbershot, vice president of product management at ServiceNow. "It's just an evolution of the technology to now have an AI agent workflow."

AI researchers have been talking about agents for decades, but newer LLMs can deliver on the six properties laid out by Shopify's Julia Winn that make up agents: they have perception, interactivity, persistence, reactivity, proactivity, and autonomy.

"Where the agentic stuff started to get interesting is, what if we move beyond that kind of paradigm of just chat, and what if for the first time you had a mixture of automation or some sort of sequential-type things to get a job done," Holland said.

The call center is the ground zero of the generative AI revolution, and agents are likely to make their debut at the other end of your next phone or text conversation with a customer-service department. At this point nearly everyone can agree that the phone tree has been one of the more exasperating cost-saving "improvements" in customer-service technology, and companies are betting billions that agents could allow customers to walk away satisfied after a completely digital interaction.

Much of that hope rests on the ability of AI agents and their underlying models to respond to the imprecise language most people use when describing the problem they're trying to solve or the goal they're trying to reach.

"They're talking about something that can, for instance, in the case of voice agents, listen in real time with super-high accuracy and then be interrupted," said Scott Stephenson, co-founder and CEO of Deepgram. "Maybe the agent is starting to respond, and halfway through the person realizes, 'whoa, this isn't what I'm talking about, no, I actually need this other thing.' Phone trees would never be able to deal with that."

Intuit recently announced that customers using its QuickBooks accounting system will be able to use a series of agents that handle several different aspects of the most important business process for a small or medium-sized company: managing their cash flows.

"What we have done is we've created an agentic workflow that includes the processing of relevant customer artifacts — including user-submitted images, documents, emails — and then orchestrating any necessary follow-up actions to specialized AI agents like an invoice-processing agent or bill-creation agent for the customer's review," Srivastava said.
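The pattern Srivastava describes, in which incoming artifacts are classified and handed to specialized agents whose output waits for customer review, can be sketched in a few lines of Python. This is not Intuit's implementation: the class names, the keyword-based `classify` function, and the two stub agents are all assumptions made for illustration, and a real system would use a model for classification.

```python
from dataclasses import dataclass


@dataclass
class Artifact:
    kind: str  # "image", "document" or "email", the types named in the quote
    text: str  # text already extracted from the artifact (extraction not shown)


@dataclass
class Draft:
    agent: str
    summary: str
    approved: bool = False  # stays False until the customer reviews it


def invoice_agent(artifact: Artifact) -> Draft:
    return Draft("invoice-processing", f"Invoice draft from {artifact.kind}")


def bill_agent(artifact: Artifact) -> Draft:
    return Draft("bill-creation", f"Bill draft from {artifact.kind}")


def classify(artifact: Artifact) -> str:
    """Keyword stand-in for the model call a real orchestrator would make."""
    text = artifact.text.lower()
    if "invoice" in text:
        return "invoice"
    if "bill" in text or "amount due" in text:
        return "bill"
    return "unknown"


# Hypothetical registry mapping a classification to a specialized agent.
AGENTS = {"invoice": invoice_agent, "bill": bill_agent}


def orchestrate(artifacts):
    """Route each artifact to a specialized agent and queue drafts for review."""
    review_queue = []
    for artifact in artifacts:
        handler = AGENTS.get(classify(artifact))
        if handler is not None:  # unrecognized artifacts are skipped
            review_queue.append(handler(artifact))
    return review_queue
```

Nothing in the queue is marked approved by the orchestrator itself, which mirrors the "for the customer's review" step in the quote.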

Data scrubbing

However, as the enterprise software industry tries to cram agents into anything and everything, it's not clear how many companies are ready to take advantage of the technology. As with any generative AI technology, unlocking whatever special outcome is promised by the tool requires sharing a ton of data with that tool in tool-friendly ways.

When Intuit started building an internal AI platform that allows its developers to create agents for internal and customer-facing use, "we made a huge investment in modernizing our data platform," Srivastava said. That required breaking down the barriers between different types of data and creating a map of all that data that drew links between similar data types.

Salesforce demonstrated an agent last month as an example of how a retailer could build a system to respond to a customer inquiry about returning a sweater that didn't fit. The agent was able to understand the customer's question in natural language, ask follow-up questions, and advise the customer of shipping times and whether the preferred size was in stock at a nearby store. But that whole operation hinges on whether the retailer's data strategy makes it possible to access all that data quickly and cleanly.

"Data hygiene is even more important today than it was in the past," Zilbershot said. "Everything can be knowledge for those AI agents, and so they're really able to leverage unofficial knowledge that exists in organizations and uncover existing patterns without a lot of investments from people."

The herd mentality of the generative AI era means that all sorts of "agents" are being thrown at enterprises and end users right now, and it will take some time to sort out which agentic workflows actually deliver something above and beyond, and which are just glorified chatbots. And given that companies like Salesforce plan to charge per conversation handled by their agents, enterprises will want to be certain that these technologies work as advertised before turning their brands over to the agent experience.

"The first adoption barrier is just, 'change is hard,'" Holland said. But he suggested a way that companies thinking about using AI agents for customer service could test out the services on a smaller group: almost everyone given the opportunity to talk to a human will choose to talk to a human, but if they're told they'll have to wait ten minutes to talk to a human, some of them will probably give the AI agent a shot.

"There is a portion of people who will try that, and then if these assistive agents can do as good of a job as a human, then people will start to believe and will start to have better experiences and will start to move forward," he said.
