The hidden tax on AI infrastructure

Welcome to Runtime! Today: Why cloud storage architectures have an enormous impact on generative AI app performance, a vital component of cybersecurity preparedness is in limbo thanks to a cut in federal funding, and the latest funding rounds in enterprise tech.

(Was this email forwarded to you? Sign up here to get Runtime each week.)


Cache money

Ask most people what they think are the biggest bottlenecks holding back generative AI adoption, and they'll cite lead times on Nvidia's Blackwell GPUs or the lack of quality training data. But for most enterprises, rethinking storage strategies built around non-AI applications could unlock performance improvements and reduce costs without having to stand in line outside a TSMC fab.

Google Cloud rolled out several upgrades to its storage services designed with AI apps in mind last week at Google Cloud Next. Storage services are one of the most fundamental parts of any cloud architecture, and like their computing infrastructure cousins, they are evolving in response to AI workloads, Sachin Gupta, vice president and general manager of the company's Infrastructure and Solutions Group, said in an interview last week.

  • "GPUs and TPUs [Google's version of the GPU] are constrained resources; they're expensive resources, and so effectively using those is incredibly important," Gupta said.
  • Those chips process data extremely quickly but are often forced to sit idle waiting for data stored in another cloud region or on-premises data center, and the meter is always running.
  • This problem affects both training AI models and inference, which can take a very long time by modern application-performance standards when using so-called "reasoning" models (see the cost sketch after this list).
  • "Thinking about the right storage choice can save you a lot of cost and significantly improve both training and inferencing performance," Gupta said.

As long as the speed of light remains constant — which is a bet Runtime is going to take — latency is always going to affect the performance of any cloud workload, and AI workloads are especially sensitive to those delays. But companies can reduce latency by storing data closer to their computing engines, improving "goodput" (the share of throughput that does useful work), and Google's new Rapid Storage and Anywhere Cache services were designed to help companies get around some of the roadblocks created by traditional cloud computing designs.
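
For a sense of the floor physics puts under cross-region latency, consider this quick sketch; the inter-region distance is an illustrative assumption:

    # Light in fiber covers roughly 200 km per millisecond (about 2/3 of c),
    # and real networks add routing and queuing delay on top of this floor.
    FIBER_KM_PER_MS = 200
    DISTANCE_KM = 3_000  # assumed distance between two cloud regions

    round_trip_ms = 2 * DISTANCE_KM / FIBER_KM_PER_MS
    print(f"Physics-only round trip: {round_trip_ms:.0f} ms")  # -> 30 ms

Thirty milliseconds per round trip is nothing for a web page, but it compounds quickly when a training job makes millions of small reads.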

  • Some quick background: cloud customers concerned about reliability often deploy their workloads in availability zones (AZs), which are hardened data centers within a broader computing region, but Google Cloud Storage stores data at the regional level.
  • Data moves quickly between AZs in a given region, but not as quickly as customers running big, expensive AI model-training jobs would like, Gupta said (a minimal client-side sketch follows this list).
  • Rapid Storage will allow those customers to place their data in a new bucket in the AZ where their model is being trained, which Google said "provides industry-leading <1ms random read and write latency, 20x faster data access, 6 TB/s of throughput, and 5x lower latency for random reads and writes compared to other leading hyperscalers."
  • And Anywhere Cache is "the industry’s first consistent zonal cache," according to Blocks and Files, allowing customers to keep their data at the regional level where Google Cloud can "automatically detect the data that we need to cache, so we can reduce your latency by up to 70%," Gupta said.
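
Here is a minimal sketch of that colocation pattern using the standard google-cloud-storage Python client; the bucket name, object path, and region are placeholders, and Rapid Storage's zonal buckets and gRPC data path have their own creation options that are not shown here:

    # Minimal sketch: keep training data in a bucket in the same region as
    # the accelerators so reads don't cross region boundaries. The names and
    # region below are placeholders, not values from Google's announcement.
    from google.cloud import storage

    client = storage.Client()

    # Create the bucket in the region where the training job will run.
    bucket = client.create_bucket("example-training-data", location="us-central1")

    # Subsequent reads stay inside the region, cutting round-trip latency.
    blob = bucket.blob("shards/shard-00000.tfrecord")
    payload = blob.download_as_bytes()
    print(f"fetched {len(payload):,} bytes")

Rapid Storage pushes the same idea down a level by putting the bucket in the specific zone where the GPUs or TPUs sit, while Anywhere Cache leaves the bucket regional and caches hot objects zonally instead.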

We're starting to get a clearer picture of just how many cloud infrastructure concepts need to be rethought for the generative AI era, which is probably one of the main reasons why adoption has been slow. Techniques and strategies built up over 15 years of cloud computing are hard habits to break, and new best practices for building AI apps are just starting to come together.

  • But one difference this time around is that cost is much more top-of-mind among CIOs, who were just starting to feel better about spending money to upgrade their enterprise infrastructure before weeks of tariff chaos threatened to tip the economy into recession later this year.
  • Anything cloud providers can do to reduce operating costs (AWS cut the price of its S3 Express One Zone service, which is similar to Rapid Storage, by 31% the day after Google's storage announcements) will be welcome.
  • "There's just a bunch of stuff you need to think about: Am I getting the best performance, and am I getting the best cost, and am I utilizing my infrastructure correctly?" Gupta said.

Severity: Critical

Citing reduced funding from the federal government, the nonprofit MITRE Corp. confirmed Tuesday that its ability to maintain the widely used CVE (Common Vulnerabilities and Exposures) system for tracking and rating cybersecurity issues will end tomorrow. According to NextGov/FCW, "funding for related programs run by the organization — such as the Common Weakness Enumeration program — will also expire tomorrow," MITRE said in a statement.

It's hard to overstate the chaos that letting the CVE database stagnate or disappear could cause inside cybersecurity organizations, which rely on its updates to keep their companies secure. "Every vulnerability management strategy around the world today is heavily dependent and structured around the CVE system and its identifiers," Ariadne Conill, co-founder and chief distinguished engineer at Edera, said in an email.
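
To see how deep that dependence runs, here is a minimal sketch that looks up a single record by its CVE identifier from NIST's National Vulnerability Database, which builds on MITRE's CVE list; it uses NVD's public CVE API 2.0, with rate limiting and error handling omitted:

    # Look up one vulnerability record by CVE ID via NVD's public CVE API 2.0.
    # Log4Shell is used purely as a well-known example identifier.
    import json
    import urllib.request

    CVE_ID = "CVE-2021-44228"
    url = f"https://services.nvd.nist.gov/rest/json/cves/2.0?cveId={CVE_ID}"

    with urllib.request.urlopen(url) as resp:
        record = json.load(resp)

    vuln = record["vulnerabilities"][0]["cve"]
    print(vuln["id"], vuln.get("vulnStatus"))
    print(vuln["descriptions"][0]["value"][:120])

Scanners, patch-management tools, and vendor advisories all key off those identifiers in the same way; if new IDs stop being assigned, the industry loses its shared vocabulary for naming new vulnerabilities.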

The news comes on the heels of big cutbacks to the budget of CISA, which had done as good a job as any government agency in recent years navigating the intersection of cybersecurity, software development, and national security. It seems likely that some other organization outside the U.S. will take over maintenance of the CVE system, but the next few months could be rocky.


Enterprise funding

Tessell raised $60 million in Series B funding for its managed database service, which helps companies run popular databases like Postgres and MySQL on AWS or Azure.

Portnox landed $37.5 million in Series B funding as it looks to expand the market for its cloud-based zero-trust security software.

Groundcover scored $35 million in Series B funding to take on the observability market with a product based around the open-source eBPF project.

Doss raised $18 million in Series A funding for its new approach to the stodgy ERP market, which involves (of course) AI.

NetRise landed $10 million in Series A funding to build out its software supply-chain security platform, which helps companies create an inventory of their existing assets.


The Runtime roundup

Several crypto exchanges running on AWS went down early Tuesday after "the primary and secondary power was interrupted to the affected EC2 instances" in its Tokyo region, according to DCD.

Meanwhile, Google Cloud confirmed power issues were the cause of a March outage, after the company's uninterruptible power supply inside an availability zone in its Ohio region was somehow interrupted.

HPE is going to have to get used to talking with Elliott Investment Management after the activist shareholder revealed it had taken a $1.5 billion stake in the company, which would make it one of HPE's five largest shareholders.

Microsoft signed infrastructure-improvement deals with a city in Ohio just two months before it canceled plans to build a data center there, according to Bloomberg, which suggests it had a rather abrupt change of heart.


Thanks for reading — see you Thursday!
