Newsletter
Putting the servers back in serverless (kinda)
Welcome to Runtime! Today: Vercel unveils a new serverless computing architecture that's better equipped to manage idle resources, nobody knows what Elon Musk's minions are doing to the federal government's servers, and the latest funding rounds in enterprise tech.
(Was this email forwarded to you? Sign up here to get Runtime each week.)
Fluid dynamics
As companies struggle to deploy apps built around large-language models, they're also exposing inefficiencies in cloud tools that were designed to run older workloads. After rewriting the infrastructure beneath its serverless computing service last year, Vercel is ready to shift its customers onto a new platform that will make it cheaper to run AI apps.
Fluid Compute is a new architecture for Vercel Functions that was designed to eliminate the idle period when an AI app is waiting for a model to answer a question — which can take seconds or even minutes on computing infrastructure used to operating in milliseconds — and costs real money. In an exclusive interview with Runtime, Vercel co-founder and CEO Guillermo Rauch described Fluid Compute as the natural evolution of serverless computing.
- "Fluid Compute sets out to fix serverless for the AI era," Rauch said.
- AWS introduced the principles behind serverless computing back in 2014 with the launch of Lambda.
- Apps built around Lambda and other serverless development platforms use functions that execute distinct tasks in response to external triggers, which allows computing resources to spin up and shut down very quickly.
- At that time developers were obsessed with speed, having realized that their users and customers wouldn't tolerate sites and apps that ran even 100 milliseconds or so slower than what they expected, and "we optimized the world's compute for that [problem]," Rauch said.
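The function-as-a-service model described above can be pictured with a minimal handler in the AWS Lambda style (a hypothetical sketch; the event payload shape is an assumption, not taken from any specific deployment):

```python
# Minimal function-as-a-service handler in the AWS Lambda style.
# The platform spins up compute only when an external trigger arrives
# (an HTTP request, a queue message, a file upload), runs the handler,
# and can tear the instance down immediately afterward.
import json

def handler(event, context=None):
    # A distinct task executed in response to a trigger: here, echoing
    # back a greeting built from the (assumed) event payload.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```

Because each invocation is stateless and short-lived, the platform can scale instances up and down very quickly — which is exactly the speed-first optimization Rauch describes.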
Now there's an entirely different set of expectations around apps that work with LLMs, given that concerns about accuracy make it harder for users and developers to trust those apps. "Even the customer wants the back end to be slow," Rauch said, pointing to a new feature in OpenAI's ChatGPT that allows the user to ask it to "use more intelligence" to answer a prompt, which takes longer to run.
- But as Vercel's customers started using the serverless platform to build AI apps, they realized they were wasting computing resources while awaiting a response from the model.
- Traditional servers understand how to manage idle resources, but in serverless platforms like Vercel's "the problem is that you have that computer just waiting for a very long time and while you're claiming that space of memory, the customer is indeed paying," Rauch said.
- Fluid Compute gets around this problem by introducing what the company is calling "in-function concurrency," which "allows a single instance to handle multiple invocations by utilizing idle time spent waiting for backend responses," Vercel said in a blog post last October announcing a beta version of the technology.
- "Basically, you're treating it more like a server when you need it," Rauch said.
Rauch is hopeful that Fluid Compute will reduce the number of customers shocked by the size of their Vercel bills after their apps went viral or saw an unexpected surge in demand. That experience has been even more painful for AI app developers, who found they were paying more than they expected to serve their users with a slow app.
- "Fluid addresses a huge percentage of those cases," Rauch said. "Developers felt like they weren't in control of that back end becoming slower, and Fluid brings into a world of predictability where you're concerned about what you do control, which is your code and the things that you ship."
- The new platform could also make Vercel a more interesting option for larger enterprises that like the principles of serverless computing but need to make sure they're operating as efficiently as possible.
- "Typically, Vercel has been seen by many as for front-end workloads." Rauch said. "With Fluid, you can run any kind of back-end workload as long as it's in those runtimes like Node and Python. It's not just the ability to run it, but to do so efficiently."
Read the rest of the full story on Runtime here.
Pretty dodgy
A group of tech workers under the spell of Elon Musk forced its way into the servers of several U.S. government agencies over the last several days, and nobody seems to have any idea what they are actually doing. Agencies affected as of Tuesday include the Office of Personnel Management, which has records on more than 3 million government employees, and the Department of the Treasury, which is in charge of the money.
Musk's DOGE working group is apparently plugging outside servers into agency networks and in some cases rewriting code bases, according to Wired. 404 Media was able to determine that the group is removing all references to "forbidden words" that describe race and gender from the servers of the Office of Head Start, which was reporting funding problems as of Tuesday afternoon that could impact groups providing day care and other services to young children around the country.
“This has the potential to be the largest breach [of government systems] ever by orders of magnitude and could have consequences for decades,” Jason Kikta, a former U.S. Cyber Command official, told Recorded Future News. Even if the group treats all the sensitive data they have accessed with care — which is impossible to believe — making all those code changes at once could create any number of vulnerabilities for foreign actors to exploit after the minions get bored and move on to the next agency.
Enterprise funding
DataBank raised $250 million in new equity funding as it builds out a network of data centers and managed enterprise computing services.
ElevenLabs scored $180 million in Series C funding to expand its lineup of text-to-speech services.
Finout landed $40 million in Series C funding for its cloud cost-management software, which has taken on new importance as companies experiment with AI apps that follow new cost patterns.
Riot raised $30 million in Series B funding as it expands its employee cybersecurity software to international markets.
Anchor landed $20 million in Series A funding for its billing software, which helps small and medium-sized companies automate their accounts receivable processes.
Tana scored $14 million in Series A funding and launched its "knowledge graph," which creates to-do lists for a company's employees after reading their notes and listening to their conversations.
The Runtime roundup
Google Cloud growth came in lighter than expected, and parent company Alphabet's stock fell nearly 8% in after-hours trading.
Salesforce cut 1,000 jobs but has announced plans to hire around as many people to sell its AI-related services, Bloomberg reported.
SailPoint announced plans to go public — again — after it was acquired in 2022 by Thoma Bravo, which will almost double its investment if SailPoint attains its goal of an $11.5 billion valuation.
AMD missed Wall Street estimates for data-center revenue, but announced plans to accelerate the launch of its next-generation data-center GPU.
Thanks for reading — see you Thursday!