Preparing for tomorrow’s agentic workforce

The time is now to focus on AI infrastructure, which will enable companies to scale AI and build a future where humans and multiple AI agents successfully work together.


To compete effectively, companies must take a hard look at what they can do to support AI infrastructure. On this episode of the At the Edge podcast, SambaNova Systems cofounder and CEO Rodrigo Liang joins host and McKinsey Senior Partner Lareina Yee to discuss agentic AI, the S-curve of AI value, and why businesses must adopt a hybrid AI model.

The following transcript has been edited for clarity and length.

Rethinking AI infrastructure

Lareina Yee: SambaNova is an ambitious and exciting AI company addressing an enormous market. Can you tell us what you originally saw in the marketplace that inspired you to start SambaNova?

Rodrigo Liang: I have two amazing cofounders, Stanford Professors Kunle Olukotun and Christopher Ré, and the three of us got together and really started thinking about this worldwide transformation we’re going through. If you think about this AI-first, AI-centric world we’re building, it’s ultimately driving a scale of transformation we’ve only seen a few times over the last two or three decades.

So the genesis of SambaNova came from this brainstorming process to see if the computing infrastructure we’re running on was really the most efficient. And the conclusion, based on Stanford research, is that there are significantly better ways to enable AI. That’s when we decided to embark on this journey seven and a half years ago.

Lareina Yee: Seven years ago, there were some of us, myself included, who were super interested in data centers. Today, it’s become a hot topic, and everyone’s talking about infrastructure.

McKinsey calculates a roughly $5 trillion investment needed over the next five years to build all the data center infrastructure, including buildings, software, cooling systems, and energy plants, to power AI’s voracious appetite. How do you think about the dynamics of the cost, the innovation, and the moment we’re in?


Rodrigo Liang: There are three things I think are incredibly important for us to think about as we’re building out the scale.

In the last three years, we’ve already seen an incredible build-out of GPUs [graphics processing units], AI infrastructure, and teraflops [trillions of floating-point operations per second]. Most of this build-out has been for pretraining large models and has been dominated by the largest players in the world. But as you move forward, you’re seeing a world that wants to do inference, test-time computing, and all these different things that require the models we’ve trained.

But as we scale up, we’re now seeing other constraints start to appear, like a lack of sufficient power for these data centers. So people are talking about nuclear power plants and other sources of energy. But then you have to figure out how to get the cooling done as well.

And as you think about energy, you’ll also need to figure out how to update your entire grid to power those gigawatt data centers. And eventually, you’ve got to connect all of that back to where the users are, mainly in large metropolitan areas, which is not where you’re going to put your gigawatt data center.

So, there are a lot of infrastructure challenges we have to figure out, and at SambaNova, we’re very focused on making it all easier. We’re dedicated to figuring out how to deliver the capacity you need at a fraction of the cost and a fraction of the power.

We all need to contribute to the solution because the answer can’t be, “Just build more power plants and build more data centers.” It’s too hard. You will need those, but the core tech also needs to be significantly more efficient.

Bookending the technology stack

Lareina Yee: Tell us about some of that magic sauce around efficiency that SambaNova is working on. If I’m an average layperson thinking about this, how do I understand the important role you play in this ecosystem?

Rodrigo Liang: Think of SambaNova as bookends on the technology stack. On the one hand, we build chips, and on the other, we create API services that allow the best open-source models to be accessed without having to actually invest in all of this custom-model work.

With SambaNova, you can go to cloud.sambanova.ai and use all the best open-source models, with all the benefits and full accuracy, with world-record speeds at a very efficient cost. Because as soon as you actually deploy AI, the cost of infrastructure acquisition, power, networking, and all the things that are required starts adding up.

And if you’re going to go from this world of training to what I think is going to be a tenfold increase in investment for inferencing, you have to be more efficient. You have to make the cost come down. Otherwise, it won’t scale.

Planning for a hybrid model

Lareina Yee: Let’s just fast-forward and assume businesses will figure out how to scale AI. So, if I’m a business leader, how do I plan?

Rodrigo Liang: The companies that win will use AI to provide better services in the market, engaging with customers faster and better and making customization easier. They will also change their operations so AI can give them a significantly better time to market and a significantly better customer experience.

So, your AI solution is going to be a hybrid model. Just like you have cloud and on-premises, you’re going to have general-purpose large language models [LLMs] and custom LLMs. You’re also going to have text, vision, language, and voice models.

When you run a company, you have your own custom methods to accomplish your various operational needs. But to fully embrace hybrid, the data will anchor where your AI models run, whether it’s in cloud A, cloud B, or on-premises.

That’s how we think you should deploy infrastructure. Let the data reign and drive the solution you need because you’re going to be hybrid anyway.
