
Arvind Narayanan

👤 Person
528 total appearances

Podcast Appearances

The Twenty Minute VC (20VC): Venture Capital | Startup Funding | The Pitch
20VC: AI Scaling Myths: More Compute is not the Answer | The Core Bottlenecks in AI Today: Data, Algorithms and Compute | The Future of Models: Open vs Closed, Small vs Large with Arvind Narayanan, Professor of Computer Science @ Princeton

So your training cost increases, your inference cost decreases. But because it's the inference cost that dominates, the total cost is probably going to come down. So total cost comes down. If you have the same workload and you have a smaller model doing it, then the total cost is going to come down.
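The cost arithmetic in this quote can be sketched with a toy calculation. All numbers below are hypothetical assumptions chosen for illustration, not figures from the episode; the point is only that when inference dominates the workload, a smaller model can lower total cost even if it cost more to train.

```python
# Hedged illustration of the cost trade-off described above.
# Every number here is a made-up assumption, not data from the episode.

def total_cost(training_cost, cost_per_query, num_queries):
    """One-time training cost plus inference cost over the whole workload."""
    return training_cost + cost_per_query * num_queries

QUERIES = 10_000_000_000  # hypothetical lifetime workload

# Larger model: cheaper one-time training, pricier per query.
large = total_cost(training_cost=10_000_000, cost_per_query=0.002, num_queries=QUERIES)

# Smaller, longer-trained model: higher training cost, much cheaper per query.
small = total_cost(training_cost=15_000_000, cost_per_query=0.0005, num_queries=QUERIES)

print(f"large: ${large:,.0f}, small: ${small:,.0f}")  # large: $30,000,000, small: $20,000,000
```

Because inference cost scales with the workload while training is paid once, the per-query term swamps the training term at this (assumed) scale, which is the mechanism the quote describes.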

Sure. I think we are still in a period where, you know, these models have not yet quite become commoditized. There's obviously a lot of progress and there's a lot of demand on hardware as well. Hardware cycles are also improving rapidly. But, you know, there's the saying that every exponential is a sigmoid in disguise. So a sigmoid curve is one that looks like an exponential at the beginning.

So imagine the S letter shape. But then after a while, it has to taper off like every exponential has to taper off. So I think that's going to happen both with models as well as with these hardware cycles. We are, I think, going to get to a world where models do get commoditized.
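The "every exponential is a sigmoid in disguise" point can be sketched numerically: a logistic curve tracks an exponential almost exactly at the start, then tapers toward a ceiling. The growth rate and ceiling below are arbitrary assumptions for illustration.

```python
# Sketch of the exponential-vs-sigmoid comparison described above.
# The rate r and the ceiling are arbitrary illustrative choices.
import math

def exponential(t, r=1.0):
    """Pure exponential growth, e^(r*t)."""
    return math.exp(r * t)

def logistic(t, ceiling=1000.0, r=1.0):
    """Logistic (sigmoid) growth: matches the exponential early on,
    then saturates at `ceiling` -- the S shape."""
    return ceiling / (1.0 + (ceiling - 1.0) * math.exp(-r * t))

# Early on the two curves are nearly indistinguishable; later the
# sigmoid flattens while the exponential keeps climbing.
for t in [0, 2, 4, 8, 12]:
    print(t, round(exponential(t), 1), round(logistic(t), 1))
```

At t = 0 both curves start at 1 and stay close for small t; by t = 12 the exponential has run far past the sigmoid, which has flattened just below its ceiling.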

A big part of it is this issue of vibes, right? So you evaluate LLMs on these benchmarks, but then it seems to perform really well on the benchmarks, but then the vibes are off. In other words, you start using it and somehow it doesn't feel adequate. It makes a lot of mistakes in ways that are not captured in the benchmark.

And the reason for that is simply that when there is so much pressure to do well on these benchmarks, developers are intentionally or unintentionally optimizing these models in ways that look good on the benchmarks, but don't look good in real world evaluation.

So when GPT-4 came out and OpenAI claimed that it passed the bar exam and the medical licensing exam, people were very excited slash scared about what this means for doctors and lawyers. And the answer turned out to be approximately nothing. Because it's not like a lawyer's job is to answer bar exam questions all day.

These benchmarks that models are being tested on don't really capture what we would use them for in the real world. So that's one reason why LLM evaluation is a minefield. And there's also just a very simple factor of contamination. Maybe the model has already trained on the answers that it's being evaluated on in the benchmark. And so if you ask it new questions, it's going to struggle.
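One common way contamination is screened for is n-gram overlap between test items and the training corpus. The sketch below is a minimal illustration of that general idea, not the method of any particular benchmark; the function names and the threshold are made up for this example.

```python
# Minimal sketch of n-gram-overlap contamination screening.
# Not any specific benchmark's method; names and threshold are illustrative.

def ngrams(text, n=8):
    """Set of word-level n-grams in `text`."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def looks_contaminated(test_item, training_corpus, n=8, threshold=0.5):
    """Flag a test item if a large share of its n-grams appear
    verbatim in the training data."""
    item_grams = ngrams(test_item, n)
    if not item_grams:
        return False
    corpus_grams = ngrams(training_corpus, n)
    overlap = len(item_grams & corpus_grams) / len(item_grams)
    return overlap >= threshold
```

A test question copied nearly verbatim from the corpus would trigger the flag, while genuinely novel questions would not, which is exactly the gap the quote points at: contaminated benchmark scores overstate performance on anything new.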
