Today: Vercel unveils a new serverless computing architecture that's better equipped to manage idle resources, nobody knows what Elon Musk's minions are doing to the federal government's servers, and the latest funding rounds in enterprise tech.
Vercel's serverless infrastructure was designed at a time when speed was the most important goal. AI apps are a little different, and Fluid Compute is an effort to rebuild that infrastructure for the AI era.
Why Vercel overhauled its serverless infrastructure for the AI era
As companies struggle to deploy apps built around large language models, they're also exposing inefficiencies in cloud tools that were designed to run older workloads. After rewriting the infrastructure beneath its serverless computing service last year, Vercel is ready to shift its customers onto a new platform that will make it cheaper to run AI apps.
Fluid Compute is a new architecture for Vercel Functions that was designed to eliminate the idle period when an AI app is waiting for a model to answer a question — which can take seconds or even minutes on computing infrastructure accustomed to operating in milliseconds — and costs real money. In an exclusive interview with Runtime, Vercel co-founder and CEO Guillermo Rauch described Fluid Compute as the natural evolution of serverless computing.
"Fluid Compute sets out to fix serverless for the AI era," Rauch said. It's an acknowledgement that tried-and-true computing infrastructure strategies can change very quickly when something like generative AI comes along, a broader topic that I'll be discussing with Rauch at the HumanX conference in March alongside fellow panelists Andrew Feldman of Cerebras, Robert Nishihara of Anyscale, and Sharon Zhou of Lamini AI.
Vercel, which according to Crunchbase has raised $563 million in funding, is primarily known for its web application development platform. Developers use Vercel's open-source Next.js framework and managed infrastructure services to quickly launch and run cloud apps without having to provision and configure their own hardware.
However, Vercel originally designed the infrastructure that powers its managed computing services to run traditional web apps. Fluid Compute is an effort to rebuild that infrastructure to process AI apps without changing anything about the way non-AI apps run.
AWS introduced the principles behind serverless computing back in 2014 with the launch of Lambda. Apps built around Lambda and other serverless development platforms use functions that execute distinct tasks in response to external triggers, which allows computing resources to spin up and shut down very quickly.
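The function-and-trigger model described above can be sketched in a few lines. This is an illustrative handler in the Lambda style, not code from Vercel or AWS; the `handler` name and event shape are assumptions for the example:

```python
# Minimal sketch of the serverless model Lambda popularized: a function
# runs only in response to an external trigger (here, an HTTP-style
# event dict), holds no state between invocations, and can be spun up
# and torn down quickly because it does one distinct task.
import json

def handler(event, context=None):
    """Entry point invoked once per request; names are illustrative."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```

Because the platform, not the developer, decides when instances of this function start and stop, billing is tied to the time each invocation actually occupies the instance — which is exactly where long model waits become expensive.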
At that time developers were obsessed with speed, having realized that their users and customers wouldn't tolerate sites and apps that ran even 100 milliseconds or so slower than what they expected, and "we optimized the world's compute for that [problem]," Rauch said. Vercel's managed infrastructure runs on AWS and the company works closely with its Lambda team.
But as Vercel's customers started using the serverless platform to build AI apps, they realized they were wasting computing resources while awaiting a response from the model. Traditional servers understand how to manage idle resources, but in serverless platforms like Vercel's "the problem is that you have that computer just waiting for a very long time and while you're claiming that space of memory, the customer is indeed paying," Rauch said.
Fluid Compute gets around this problem by introducing what the company is calling "in-function concurrency," which "allows a single instance to handle multiple invocations by utilizing idle time spent waiting for backend responses," Vercel said in a blog post last October announcing a beta version of the technology. "Basically, you're treating it more like a server when you need it," Rauch said.
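The effect of in-function concurrency can be illustrated with a small sketch. This is not Vercel's implementation — just a minimal asyncio example showing how one instance can overlap the idle time of several invocations that are each waiting on a slow model backend:

```python
# Illustrative only: one process handles five invocations, and the
# 0.1-second "model wait" of each overlaps with the others instead of
# being paid for serially. Total wall time is roughly one wait, not five.
import asyncio
import time

async def call_model(prompt: str) -> str:
    await asyncio.sleep(0.1)  # stand-in for a slow LLM response
    return f"answer to {prompt!r}"

async def invoke(prompt: str) -> str:
    # While this invocation awaits the model, the event loop is free
    # to service other invocations running in the same instance.
    return await call_model(prompt)

async def main() -> list:
    return await asyncio.gather(*(invoke(f"q{i}") for i in range(5)))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
```

Run serially, the five waits would take about half a second; overlapped, they complete in roughly the time of one wait — the same idle capacity Fluid Compute reclaims instead of billing the customer for it.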
Suno was one of Fluid Compute's beta testers, and saw "upwards of 40% cost savings on function workloads," Rauch said. Depending on the app, other customers could see even greater savings without having to change their app's configuration, he said.
Back is the new front
Fluid Compute was designed to work with Node.js and Python applications, which are among the most widely used frameworks and programming languages (respectively) among professional developers surveyed by Stack Overflow. Cloudflare Workers, a rival serverless computing platform, uses a similar technique to deal with idle requests more efficiently, but it is based on a different runtime, and Node.js developers have to implement a few workarounds to get their apps to run.
Rauch is hopeful that Fluid Compute will reduce the number of customers shocked by the size of their Vercel bills after their apps went viral or saw an unexpected surge in demand. That experience has been even more painful for AI app developers, who found they were paying more than they expected to serve their users with a slow app.
"Fluid addresses a huge percentage of those cases," Rauch said. "Developers felt like they weren't in control of that back end becoming slower, and Fluid brings into a world of predictability where you're concerned about what you do control, which is your code and the things that you ship."
The new platform could also make Vercel a more interesting option for larger enterprises that like the principles of serverless computing but need to make sure they're operating as efficiently as possible.
"Typically, Vercel has been seen by many as for front-end workloads." Rauch said. "With Fluid, you can run any kind of back-end workload as long as it's in those runtimes like Node and Python. It's not just the ability to run it, but to do so efficiently."
Tom Krazit has covered the technology industry for over 20 years, focusing on enterprise technology during the rise of cloud computing over the last ten years at Gigaom, Structure, and Protocol.