Pinecone's new serverless architecture hopes to make the vector database more versatile

Pinecone, one of the leading vector database startups to emerge during the generative AI boom, thinks it has identified some of the roadblocks that companies need to clear before turning their generative AI experiments into production.

Pinecone announced Tuesday that it will begin rolling out the second generation of the serverless architecture for its flagship vector database over the next several months. The new version was designed to automatically make the right configuration decisions for a wider variety of application types, such as recommendation engines and agentic systems, without compromising on speed or cost.

"What we see in the market today is that people use vector databases for very, very different kinds of workloads, and they expect their database to be out-of-the-box responsive and performant for all the different kinds of workloads that you have in front of you," said Pinecone co-founder and CEO Edo Liberty in a recent interview with Runtime. Liberty and I will be discussing how companies are preparing their data for the AI era at the HumanX conference in Las Vegas on March 12th, along with our fellow panelists Shannon Scott of Airwallex, Mike Murchison of Ada, and Barr Moses of Monte Carlo.

The new database comes a little more than a year after Pinecone, which has raised $138 million in funding, introduced the first generation of its serverless architecture. That version relieved Pinecone customers of the burden of having to configure the computing resources needed to handle their workloads, while the goal for the new version was to address another operational challenge through a new system for building indexes, or collections of files within a database.

Point the way

Vector databases store information as vectors, which contain not only the data itself but information about how that particular piece of data relates to other pieces of data in the system. That makes them ideal for apps that tap into large-language models to answer questions, since they are able to quickly analyze similarities between the words in the prompt and the model's training data.
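To make the idea of similarity between vectors concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and document names are invented for illustration; in practice an embedding model produces vectors with hundreds or thousands of dimensions, and production vector databases generally rely on approximate nearest-neighbor indexes rather than the brute-force scan shown here. Nothing below reflects Pinecone's actual API or internals.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(query, items, k=2):
    """Brute-force search: score every stored vector and keep the k closest."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in items.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical toy "embeddings" (assumption: made up for this example).
DOCS = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.2, 0.9, 0.1],
    "returns and refunds": [0.8, 0.3, 0.0],
}
QUERY = [1.0, 0.2, 0.0]  # stands in for an embedded user prompt

print(top_k(QUERY, DOCS, k=2))
```

The two refund-related documents rank ahead of the shipping document because their vectors point in nearly the same direction as the query vector.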

Pinecone customers are using its database for several different types of AI apps, including recommendation engines, keyword search tools, and the agentic systems many are experimenting with amid an enormous push from enterprise software companies, Liberty said. But each one of those apps needs something a little different from a vector database.


For example, recommendation engines need to quickly read information from a database, but when speed is the priority the database tends to focus on serving existing information rather than updating the index with new information, which can leave the index stale as new queries come in. On the other hand, generative AI search tools for retrieving documents inside a company have to be updated constantly as new documents are created, but rebuilding the index that often consumes a lot of computing resources.

Pinecone could handle all those needs, but customers needed to choose the right algorithm and configure the settings themselves to make it work, and most of them aren't really interested in becoming vector database experts. "Can we design something that is adaptive and can be able to actually handle all of it? It took us a very long time to do that, but we have done it," Liberty said.

The new version balances speed and freshness by first focusing on speed, and then automatically detecting when the size of the index has hit a threshold that requires it to be rebuilt. At that point it merges small collections of files into larger ones, which spreads the cost of recreating the index out over time and allows companies to re-index their data more frequently.
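The general pattern described here, absorbing writes cheaply and merging small pieces into larger ones once a threshold is reached, can be sketched in a few lines of Python. This is only an illustration of the idea, loosely in the spirit of log-structured merge designs; it is not Pinecone's implementation, and the class name, parameters, and thresholds are all hypothetical.

```python
class SegmentedIndex:
    """Toy index: new items land in small segments that are merged
    into one larger segment once enough of them accumulate."""

    def __init__(self, segment_capacity=2, merge_threshold=3):
        self.segment_capacity = segment_capacity  # items per small segment
        self.merge_threshold = merge_threshold    # small segments that trigger a merge
        self.buffer = []          # most recent writes, not yet sealed
        self.small_segments = []  # sealed, cheap-to-build segments
        self.large_segments = []  # results of merges
        self.merge_count = 0

    def insert(self, item):
        # Writes stay cheap: append to the buffer, seal it when full.
        self.buffer.append(item)
        if len(self.buffer) >= self.segment_capacity:
            self.small_segments.append(sorted(self.buffer))
            self.buffer = []
            self._maybe_merge()

    def _maybe_merge(self):
        # The threshold check is automatic; the caller never configures a rebuild.
        if len(self.small_segments) >= self.merge_threshold:
            merged = sorted(x for seg in self.small_segments for x in seg)
            self.large_segments.append(merged)
            self.small_segments = []
            self.merge_count += 1

    def contains(self, item):
        # Reads see fresh data because the buffer is searched too.
        if item in self.buffer:
            return True
        return any(item in seg for seg in self.small_segments + self.large_segments)


index = SegmentedIndex(segment_capacity=2, merge_threshold=3)
for value in range(7):
    index.insert(value)
print(index.merge_count, index.large_segments, index.buffer)
```

Because each merge touches only the small segments accumulated since the last one, the rebuild cost is paid in small installments rather than all at once, which is the trade-off the paragraph above describes.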

"With the exact same API, with the exact same interface, you as a user don't have to choose the operating points" in advance or as the application's needs change, Liberty said. "We believe we're the only system that can actually do this now."

Depth versus breadth

While the generative AI hype meter has come back down to earth somewhat since Pinecone raised a $100 million funding round in April 2023, vector databases are still in demand. Only 20% of respondents to Retool's State of AI survey in late 2023 were using vector databases, but by the time the same report came out six months later in June 2024, 63.6% had started kicking the tires.

But that report illustrated a potential problem for Pinecone that enterprise tech buyers and sellers have been wondering about over the last few years. As database vendors of all stripes rushed to add vector capabilities to their databases in response to the AI boom, it wasn't necessarily clear why anyone would want to bet on a standalone vector database run by a startup when they were already using other databases sold and maintained by vendors with decades of enterprise experience.

"Vector is really not a data model, it's a data type. And when you look at databases over time, a lot of these data types are just added to the databases over time with an ability to index and then query those data types," Google's Andi Gutmans rold Runtime last year. Pinecone was the most popular vector database in Retool's 2023 survey, but last year it slipped to third behind database giant MongoDB and the open-source Postgres database.

Pinecone founder and CEO Edo Liberty (Credit: Pinecone)

Not surprisingly, Liberty argued that companies selling older types of databases have been making that argument for decades as new types of databases emerge, only to watch customers flock to the newcomers; AWS currently offers 15 different types of databases, not even counting what's available in the AWS Marketplace.

"A lot of people who go through this motion … at some point they need to scale up and go to production. And when you do that, you run into one or two difficulties; either the performance, cost [or] functionality that you need does not exist in those other vector databases," Liberty said.

Pinecone hopes the new update can help win over customers by providing a balanced approach to those needs. But Liberty also argued that vector databases are here to stay because they have the potential to unlock enormous value from all the unstructured data — emails, documents, and images — that most companies have lying around untouched.

"People used to talk about web scale as being big," said Liberty, a veteran of AWS and Yahoo. "But the ratio between how much stuff you put on the web versus how many emails you've got, how many text messages you wrote, how many pictures you have on your phone, how many meetings you've taken and how many sentences you've spoken and heard, how many documents you signed and saved … all of that data sits somewhere, and now it's actionable."