AWS previews re:Invent; AI2's latest open-source AI


People walk around the expo hall at AWS re:Invent 2023.
(Credit: AWS)

Welcome to Runtime! Today: As is tradition, AWS releases all the news that won't make the re:Invent keynote ahead of time, the Allen Institute for AI introduces a powerful and truly open-source AI model, and the quote of the week.

(Was this email forwarded to you? Sign up here to get Runtime each week.)


Ship it

AWS "pre:Invent": It's almost time for AWS's largest event of the year, and despite the marathon speeches spread out across four days of AWS re:Invent there are always several announcements the company trickles out in the days leading up to the show that apparently didn't make the keynote cut. The Unofficial AWS News Feed does a great job of rounding up the dozens of updates and enhancements that have been released over the past couple of weeks, but a few stood out.

After years of complaints about its somewhat-hostile user interface, AWS is planning to upgrade the console in 2025 with a cleaner design. And S3 Express One Zone customers will be able to append new data to an existing object without having to overwrite the entire thing, which could have interesting implications for real-time applications or log data.
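
For developers who want to try the append feature, here's a rough sketch of what it could look like with boto3. It assumes the capability is exposed through PutObject's WriteOffsetBytes parameter on an Express One Zone directory bucket; the bucket and key names are placeholders, so treat this as a sketch rather than a reference implementation.

import boto3

# Sketch only: assumes boto3's put_object exposes the append support via a
# WriteOffsetBytes parameter; the bucket and key names are placeholders.
s3 = boto3.client("s3")
bucket = "example-logs--usw2-az1--x-s3"  # hypothetical Express One Zone directory bucket
key = "app/requests.log"

# Find where the existing object ends so the new bytes land after it.
current_size = s3.head_object(Bucket=bucket, Key=key)["ContentLength"]

# Append a new chunk without rewriting the whole object.
s3.put_object(
    Bucket=bucket,
    Key=key,
    Body=b"2024-11-22T10:15:00Z handled request in 12ms\n",
    WriteOffsetBytes=current_size,
)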

Microsoft hardware: Microsoft's last big event of the calendar year focused mainly on AI, because of course it did, but the company also introduced two interesting new additions to its family of custom data-center silicon. While the launch date was not specified, Microsoft Azure customers will soon be able to run workloads on the Azure Boost DPU, which borrows a concept popularized by Nvidia: offloading demanding networking tasks onto a specialized chip.

Microsoft also announced the Azure Integrated Hardware Security Module, which will work behind the scenes. “Azure Integrated HSM will be installed in every new server in Microsoft’s data centers starting next year to increase protection across Azure’s hardware fleet for both confidential and general-purpose workloads,” Microsoft said, according to TechCrunch.

Open for business: So long as Meta plays fast and loose with the concept of "open-source AI," the term doesn't seem like it will carry as much weight in the generative AI era as it did during the cloud buildout over the last decade. But the Allen Institute for AI, or AI2, is committed to developing AI models that researchers can pick apart to understand their inner workings, and unveiled a new model this week that appears to be competitive with some of the best closed-source models.

The OpenScholar model, built in collaboration with the University of Washington, was designed "to help scientists effectively navigate and synthesize scientific literature," AI2's Akari Asai said in a blog post. The model outperformed OpenAI's GPT-4 and Meta's Llama 3.1 70B, and "to our knowledge, this is the first open release of a complete pipeline for a scientific assistant LM, from data to training recipes to model checkpoints," Asai wrote.
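
If you want to kick the tires on an open release like this, the standard Hugging Face transformers pattern is usually enough to load a checkpoint. The repo ID below is a placeholder rather than AI2's official identifier, and this loads only the bare language model, not OpenScholar's full retrieval pipeline.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo ID; check AI2's release notes for the real checkpoint name.
repo_id = "allenai/openscholar-checkpoint-placeholder"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

prompt = "Summarize recent findings on retrieval-augmented scientific assistants."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))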

Howdy partners: They're often overlooked, but any enterprise tech company that has reached a certain size will tell you that partners and resellers are an extremely important part of how they sell their technology to business customers. That's especially true in the generative AI era, which doesn't come as naturally to a lot of businesses as regular application development, and Google Cloud introduced a new service this week to help partners build AI agents for end customers.

AI Agent Space will be a section of the Google Cloud Marketplace that lets companies find AI agents built by Google's partners, which include consulting firms and system integrators. Google will also incentivize partners to create AI agents with "product support, marketing amplification, and co-selling opportunities," it said in a blog post.

AI in the BBQ heartland: After spinning off from Russia's Yandex earlier this year, Nebius is ready to bring its AI infrastructure services to the U.S. It announced plans this week to launch a GPU cluster in Kansas City, Mo., early next year with capacity to run up to 35,000 GPUs.

Nebius enters a crowded market for GPU services in the U.S., well behind specialized providers like CoreWeave and Lambda Labs as well as the Big Three cloud providers. It's betting on a distinctive infrastructure stack built around Nvidia's technology, pairing its hardware with several internally developed managed services for observability and Apache Spark.


A MESSAGE FROM HEROKU


The Twelve-Factor App definition is now open source and looking for your perspective to update the factors, examples, and reference implementations.

Read the blog.


Stat of the week

The best time to start planning for a serious cybersecurity incident was yesterday, because bouncing back takes most companies longer than they think. The average company thinks it will take 5.85 months to recover from an incident by restoring backups and implementing stronger policies, but it actually takes 7.34 months, according to a survey conducted by Fastly.


Quote of the week

"I don't think there's going to be a huge amount of model companies. I think the reality is setting that one, it is expensive to build models, and second, it is harder to even monetize, right?" — ServiceNow's Amit Zavery, asked about looming consolidation among AI model startups, seems to be waiting for the dust to settle.


The Runtime roundup

AWS plowed $4 billion more into Anthropic, which in return declared AWS "our primary cloud and training partner" and pledged to collaborate on the development of future versions of its Trainium AI processor.

AT&T and Broadcom settled their dispute over VMware pricing changes, but details were not released.


A MESSAGE FROM HEROKU

Your apps and services deserve a powerful platform with modern tools and workflows for deployment and management. Find it here with Heroku.

Learn more.


Thanks for reading — Runtime is off for the Thanksgiving holiday week — see you Tuesday, Dec. 3 from AWS re:Invent in Las Vegas!
