Anyscale is built around Ray, an open-source project that was designed to help AI workloads scale. But in recent years, commercial pressures have forced several companies with similar open-source origin stories to put restrictions on their projects to ward off competition.
Open-source software was an essential component of enterprise tech infrastructure over the last decade, but it has proven less resilient as a business model for companies built around open-source projects in recent years. As a generation of infrastructure startups backs away from open-source software licensing, Anyscale is moving in the opposite direction.
Founded in late 2019 by a team of University of California, Berkeley computer scientists, Anyscale is built around Ray, an open-source project that was released under the permissive Apache 2.0 license. Ray was designed to help AI workloads scale to reach users around the globe, which is an even more challenging problem than scaling traditional applications, according to Anyscale co-founder Robert Nishihara, who first developed Ray alongside Anyscale co-founder Philipp Moritz inside Berkeley's RISELab.
Now joined by longtime Aruba and HPE executive Keerti Melkote, who took over as CEO last month, Anyscale is determined to make sure Ray remains a vibrant open-source project while providing commercial services to companies that need help implementing it in their infrastructure. That could be a challenge: The company has raised $259 million to date, and commercial pressures have forced several enterprise tech companies with similar open-source origin stories to put restrictions on their projects to ward off competition.
But in a recent interview, Melkote and Nishihara, who stepped down as CEO when Melkote arrived to focus on product strategy, outlined how they think Anyscale will be able to compete against any number of companies that might want to offer commercial services based around Ray. The first part of the interview, focused on the pending arrival of production-ready enterprise AI applications, ran in last week's Runtime newsletter.
This interview has been edited and condensed for clarity.
Runtime: As this market takes off and is moving so fast, I would imagine that a lot of sophisticated users of AI technologies, the ones who are scaling these models, are pretty capable of implementing Ray on their own. I would imagine there's a second layer of customer who's just sort of getting into this that values Anyscale's managed service. But when you're building a managed service around an open-source project, there are a lot of other people who could also do that. So how are you thinking about that as you move into this growth stage in terms of your relationship with Ray and licensing strategy with Ray, and building a product around that?
Melkote: There's a group of enterprises that are not as deep from an infrastructure team perspective. They have an application team, they have a business, they have to build an AI pipeline on top, but they don't necessarily have the infrastructure teams to build their own Ray clusters, maintain them, manage them and so on. That's really where the managed service comes in. The ability for us to go in there and be the infrastructure team for our customers by delivering Ray as a managed service, that's sort of the high-level opportunity.
But it also goes beyond that in terms of the proprietary value we offer. Cost is a big consideration for customers. Developer productivity is another very important area that they care about. As they onboard more and more AI and ML developers, how do we make sure they have a clean way to experiment, develop and take the model into production? So we are finding more value around the core project itself that we can offer to our customers, and that's the intent behind how we want to monetize it.
Nishihara: The value of our platform, of our product, can't just be managing the open-source project. That's not enough. And you're rightly calling out that's not enough. If the main value you're providing on top of open source is managing it, then you may have to do things like protect yourself with various licensing [strategies] and so forth.
But even that is not that strong of a defense, I think, and fundamentally we have to add tremendous value on top of the open-source project. There are a couple main dimensions where we can go very, very deep.
One of those is fundamentally around performance. And when I say performance, think of that as a big bucket, including cost and scalability and reliability and all of these kinds of things. That is something that people will pay for.
For example, we have customers spinning up, say, data processing workloads that run on many hundreds or thousands of CPU cores, thousands of [Nvidia] H100s that are running for four weeks at a time. If that crashes in the middle of the night five days in, it's just not a great place to be. It's very expensive, it slows you down. And so making this rock-solid infrastructure, that's something people will pay for.
Another dimension is velocity; really, time to market. How quickly can you ship AI products and iterate and build? This has to do with a lot of the tooling kits that we can provide around observability: If you run into a bug, how can you figure out what went wrong and how to fix it?
Are most of your customers running these workloads in the cloud?
Nishihara: Yeah. I would say one of our strengths is actually flexibility in deployment. We have customers that bring their own different, multiple cloud providers [and customers] that even bring their own on-prem GPUs that they purchased. So we have a mixture but the majority is on the cloud.
Would you compete then, with the clouds? Should they want to offer their own managed version of Ray, is that something that you've talked about with them?
Nishihara: That is happening. You can run Ray on [Google Cloud's] Vertex [AI], on GKE, on [Amazon] EKS, on AWS Glue. You can run Ray on Databricks, you can run Ray on Domino Data Lab. There are a lot of managed Ray services out there, and there are going to be more.
Melkote: I think it goes back to the differentiation point. I think they can also put a managed service on the open source and that's fine as an entry point. But if you're a customer that is serious about scaling, then all the other parts that we talked about become extremely important: developer productivity; velocity, from experimentation to market; performance, as Robert mentioned; and integrations. A lot of enterprise-specific integrations become extremely important, both on the data side and observability. There's a whole bunch of tooling that's needed.
We're beginning to see that, and we are the experts. Think of the mass of developers that contributed to Ray, they all live in this building, right? And so it means that the expertise matters to these people who are scaling, and that's the customers we see, the customers that want to scale rather than have it as a small project. I think that's where the real opportunities [will be], and that's where we're focusing.
Having said that, we are going to be partnering with all these cloud providers. They are a pretty significant go-to-market for us, and so we'll have to work through that.
Nishihara: I think it'd be a different story if the challenges that we're trying to solve were easy challenges to solve. But there's so much depth in scaling AI workloads, and the scale is growing every year. The systems and infrastructure challenges are growing harder and harder, and they're going to keep getting harder.
You have a lot of people working with text data today, like with LLMs. But that's all going to be multi-modal data. So think about images and video, which are just vastly bigger than the text. So all of these AI workloads are going to become much more data intensive. That's a new requirement. That's a new challenge for these systems.
You're going to have more accelerators. Today you have Nvidia GPUs, but you're seeing a lot of promise with [AWS] Inferentia, with [Google Cloud's] TPUs, and AMD and so forth. The number of accelerators that developers want to take advantage of is going to explode. There are many things about the systems infrastructure challenges that are getting harder, and so there's a tremendous amount of depth here to add value. And I think that's why we think there's an opportunity to build a big business here, because the challenges are so hard.