Today: As is tradition, AWS released all the news that won't make the re:Invent keynote ahead of time, the Allen Institute for AI introduced a powerful and truly open-source AI model, and the quote of the week.
This wave of enterprise software is either the dawn of a new era of corporate productivity or the most hyped money pit since the metaverse. ServiceNow's Amit Zavery talks about the impact of generative AI, how SaaS companies should think about AI models, and his decision to leave Google Cloud.
How CoreWeave went all-in on Nvidia to take on Big Cloud
CoreWeave, which started out as a cryptocurrency mining operation, is taking a fresh approach to cloud services: It is focused on delivering the raw ingredients for the generative AI boom at extremely competitive prices.
It's not every day that you come across a startup willing and able to challenge the Big Three cloud infrastructure providers, but as a new AI spring dawns, CoreWeave is taking a shot.
AWS, Microsoft, and Google Cloud have spent the last decade building out an incredible array of cloud computing services and enormous data centers designed to replicate nearly anything potential customers could have done on their own. CoreWeave, which started out as a cryptocurrency mining operation, is taking the exact opposite approach to cloud services: It is focused on delivering the raw ingredients for the generative AI boom at extremely competitive prices.
"When the Big Three are building a cloud region, they're building to serve the hundreds of thousands or millions of what I would call generic use cases for their user base, and in those regions they may only have a small portion of capacity peeled off for GPU compute," said Brian Venturo, chief technology officer at CoreWeave, in a recent interview. "What really should be a first-class workload in those environments is kind of merely like a capacity planning afterthought."
"Now we're in a position (where) we're building for some of the largest AI labs on the planet at scale, and other cloud providers just can't do it as fast as we can," Venturo said. "It's been pretty wild."
CUDA shoulda
Nvidia's GPUs have been the engine for the two biggest tech booms of the last decade: cryptocurrencies and AI.
Way back in 2007, the chip company had the foresight to develop the CUDA programming model to make it easier to write software for its GPUs, which at the time were primarily used for gaming. However, it became clear over the next several years that GPUs were great tools for executing specific types of programs over and over again in parallel, compared to CPUs from Intel and AMD, which were designed to anticipate a wide variety of computing needs.
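To make that contrast concrete, here is a minimal CUDA sketch (an illustration for this newsletter, not code from Nvidia or CoreWeave) of the data-parallel pattern GPUs excel at: thousands of threads each apply the same operation to one array element at once, where a CPU core would walk the array sequentially.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each GPU thread handles one array element; the same instruction
// runs across thousands of threads simultaneously, which is the
// workload shape GPUs are built for.
__global__ void vectorAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;  // ~1M elements
    size_t bytes = n * sizeof(float);

    float *a, *b, *c;
    // Unified memory keeps the example short; production code often
    // manages host/device copies explicitly.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

    // Launch enough 256-thread blocks to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vectorAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);  // expect 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

The same one-thread-per-element structure underlies the matrix operations at the heart of both crypto mining and neural network training, which is why one chip family could power both booms.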
"I'm very convinced that Nvidia's open ecosystem around CUDA and AI Enterprise is a huge moat for the Nvidia platform, just in that so much work has been done on it," Venturo said. "There are so many more developers that are fluent and building on top of Nvidia than there are even on AMD or on (Google's) TPU or AWS's training and accelerator (chips)."
Cryptocurrency miners realized they could use GPUs to get in on the ground floor of the Bitcoin — and later Ethereum — mining boom, and CoreWeave began selling silicon picks and shovels to that frenzy.
"In 2016, we bought our first GPU, plugged it in, sat it on a pool table in a lower Manhattan office overlooking the East River, and mined our first block on the Ethereum network," wrote CoreWeave CEO Michael Intrator in a 2021 blog post outlining the company's history. At the time, Intrator and Venturo worked for a New York-based investment company called Hudson Ridge Asset Management, which bet on natural gas futures and appears to no longer be active.
As CoreWeave began to build a stable of GPU hardware in a garage in suburban New Jersey to expand its mining operation — which at one point was the largest Ethereum mining operation in the U.S. — it started hearing from other companies that wanted access to GPUs but couldn't afford to pay Big Cloud prices, Venturo said.
"We were approached by a friendly associate (who) said, 'Hey, we know you have a lot of compute capacity; I have a friend who needs to run inference for their text adventure game,'" Venturo said. "It became very apparent very quickly that folks didn't have access to scale GPU infrastructure to run what I would call load-following-type workloads," or workloads that can ramp up and down rapidly as demand changes.
Nvidia also took notice, and struck up a partnership with CoreWeave. That relationship helped the startup secure access to the precious GPUs needed for machine-learning training and inference workloads right as the crypto boom faded and Ethereum completed "The Merge," its switch to proof of stake, which made mining compute power irrelevant.
Boomtown
Now with more than 150 employees, CoreWeave is focused on building cloud infrastructure for startups and private AI labs, with three data centers in operation in New Jersey, Las Vegas, and Chicago. The company currently has 1,300 customers, up from about 300 at the same time last year, Venturo said.
CoreWeave's main business consists of renting GPUs by the hour, including the newest Nvidia H100 GPUs (which can be hard to find) but also older versions that cost far less to run per hour. The company will build custom private infrastructure for larger customers, while others rent GPUs (and some traditional compute) on bare-metal servers managed entirely by CoreWeave.
Large AI institutions or multinational enterprise customers probably won't find CoreWeave's infrastructure sufficient for their needs, but Venturo said the company is content serving customers for whom price and responsiveness are paramount. According to an analysis prepared by Andreessen Horowitz, CoreWeave's pricing is well below what customers can expect to pay for GPUs from AWS, Microsoft, Google, and even Oracle, which has been aggressively courting price-conscious AI customers.
That approach began paying off as demand for generative AI technologies and research exploded in the early part of this year.
"Two months ago, a company may not have existed, and now they may have $500 million of venture capital funding. And the most important thing for them to do is secure access to compute; they can't launch their product or launch their business until they have it," Venturo said. "Our organization has been built to move at the same speed, with the same urgency as those folks do."
Tom Krazit has covered the technology industry for over 20 years, focusing on enterprise technology during the rise of cloud computing over the last ten years at Gigaom, Structure, and Protocol.
Despite recent challenges to their hegemony, x86 chips still power the vast majority of cloud and on-premises servers in use today. However, over the decades Intel and AMD tweaked x86 in subtle but incompatible ways to suit their own needs, and Tuesday's agreement is a promise to unify x86.
This week a U.K. regulatory agency published summaries of hearings it conducted this past July with AWS, Microsoft, and Google. Their responses provide an interesting look into how the cloud providers see themselves, their competitors, and the current state of the market.
For years, Oracle tried to convince longtime database customers who wanted to shed their on-premises data centers to run those databases on Oracle's public infrastructure cloud, slamming AWS at every turn. Times have changed.
A generation of cloud architects, developers, and systems engineers has stayed loyal to AWS over nearly two decades in part because of its reputation for supporting anything it launched that customers used to build their infrastructure. That commitment appears to be changing.