How AWS hopes to solve generative AI's last-mile problem

Today: An interview with AWS AI chief Swami Sivasubramanian, why Amazon held off on deploying Microsoft 365 after last year's security debacle, and the latest enterprise moves.





Over the top

Former AWS CEO Adam Selipsky liked to compare his company's position at the onset of the enterprise generative AI frenzy to a runner taking their first steps in a long race. A year ago the analogy was a little hard to accept at face value given that it appeared Microsoft was several steps into the race while AWS and Google Cloud were still tying their shoes, but as it turned out, everybody was actually still warming up.

As 2024 winds down, most enterprises are still struggling to turn their generative AI experiments into actual production applications that can deliver an acceptable return on investment while scaling to meet their needs. They've done roughly 99% of the work needed to accomplish that goal, but the last 1% has proven much harder than anticipated, said Swami Sivasubramanian, vice president for AI and data services at AWS.

In an interview last week at re:Invent 2024, Swami discussed the obstacles that have prevented enterprise generative AI applications from really making an impact, why Amazon decided it needed to develop world-class general-purpose AI models, and the rise of industry-specific AI models. Selected excerpts follow below.

On the GenAI production gap:

Swami: The interesting thing is everybody talks about RAG as a universal panacea, but the reality is it did not work with structured data, the reality is it did not work with multimodal data. Now all your data lakes and data warehouses are now ready for RAG, and it just suddenly changes the game for them.

And then final one is, especially in regulated industries, hallucination is a real big fear. They actually have something almost 99% there, and then they used to tell me the last one percent turns out to be the longest, because I can't afford to get this wrong.

On the launch of Amazon's Nova models:

Swami: We do think it's important in the same way … I built [Amazon] RDS as well. While we actually supported all major databases, we thought it was important that we take the learnings of customers and then reinvent databases for the cloud with Aurora and Dynamo. Our strategy here is the same, because especially as we work with more and more customers, we think it's important that we actually continue to double down on making sure we meet customers on their pain points and actually make sure these models work.

And this is an area where, again, we don't think that one model is going to rule the world, just like RDS has Aurora but we also have other [database] engines doing exceptionally well. The same is going to be true here. We think Nova is going to be extremely popular.

On the small-model movement:

Swami: Right now, if you see what a typical development project in the GenAI world is like, product managers and developers end up profiling, here are the, let's say, 200 sample tasks or use cases that happen, and then here is the prompt. And let's find out which kind of prompts need to go to the biggest model, and then which ones need to go to the medium, and then the small. Then they actually do some kind of simple rule engine to route this.

This model selection ends up taking something like two to three months before they could actually get something, just to pick the model and then the routing engine. You're right that there is a really good set of use cases where you don't need the intelligence of the big model; I noticed this trend where people loved when they built the demo on the big model, and then they didn't like the bill. Then they actually ended up saying, okay, for these use cases, I can actually go and use the smaller model.
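For readers who haven't built one, the "simple rule engine" Swami describes might look something like the sketch below. It is only an illustration: the tier names, keyword lists, and word-count cutoff are all hypothetical stand-ins for whatever a team would derive from profiling its own sample tasks, and none of it reflects an actual AWS API.

```python
import re
from dataclasses import dataclass

# Hypothetical rule: if any keyword appears in the prompt, use this tier.
@dataclass(frozen=True)
class Rule:
    keywords: tuple
    tier: str

# Illustrative rules only; a real team would derive these from profiling
# its own sample tasks. Earlier rules win over later ones.
RULES = (
    Rule(("diagnose", "prove", "negotiate"), "large"),
    Rule(("summarize", "rewrite", "translate"), "medium"),
)

# Assumed cutoff: short prompts that match no rule go to the small model.
MAX_SMALL_WORDS = 50


def route(prompt: str) -> str:
    """Return the model tier ("small", "medium", or "large") for a prompt."""
    words = re.findall(r"[a-z]+", prompt.lower())
    for rule in RULES:
        if any(keyword in words for keyword in rule.keywords):
            return rule.tier
    # Fallback: choose by prompt length when no keyword rule fires.
    return "small" if len(words) <= MAX_SMALL_WORDS else "medium"
```

Calling route("Summarize this memo") would send the prompt to the medium tier, while a short question that matches no rule falls through to the small one. The hard part, as the interview suggests, is not the code but the months of profiling needed to decide what the rules should be.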

Read the rest of the full interview on Runtime.




Customer-centric

The AWS-Microsoft rivalry is probably the most interesting conflict in enterprise tech, with apologies to Snowflake, Databricks, and Google Cloud. The two companies are locked in fierce competition not just for enterprise tech budgets, but for employees in the Puget Sound region they call home.

Bloomberg reported Thursday that Amazon sent Microsoft a list of security demands last year following the discovery of an enormous breach in Office 365, before it followed through on an agreement to become what has to be one of the largest Office 365 customers on the planet. “They’ve done yeoman’s work,” Amazon CISO CJ Moses told Bloomberg. “We’ve given them some pretty steep tasks.”

The two companies worked together on security enhancements to one of Microsoft's most important products, according to Bloomberg, and Amazon plans to roll out Office 365 starting next year. Corporate rivalries like Amazon-Microsoft actually involve more cooperation than outside observers might think, but Microsoft security czar Charlie Bell can't have been thrilled about taking marching orders from his former employer.


Enterprise moves

John Pasta is the new executive vice president of data center solutions at JLL, joining the commercial real estate company from QTS Data Centers.

Girish Rao is the new senior vice president of infrastructure at Mozilla, following several operations tech leadership roles at Warner Bros Discovery, Electronic Arts, and Equinix.

William da Cunha is the new chief revenue officer at Statsig, joining the product testing company after more than eight years at Cloudflare.

Prasenjit Dasgupta is the new chief financial officer at Hyland, following similar roles at Digital.ai and Motorola Solutions.


The Runtime roundup

ServiceTitan raised $625 million in one of the few 2024 IPOs for enterprise tech companies, valuing the home-improvement industry's software supplier at almost $9 billion.

Broadcom shares rose 14% in after-hours trading after it reported a huge increase in profit, despite missing Wall Street estimates for revenue.

Yahoo jettisoned nearly 25% of its legendary Paranoids security team this year, according to TechCrunch, which is a sad development for a group that was once considered one of the best in the world.


Thanks for reading — see you Saturday!
