
Dylan Patel

👤 Speaker
3551 total appearances

Appearances Over Time

Podcast Appearances

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Kind of like, yes, adding the human... Designing the perfect Google button. Google's famous for having people design buttons that are so perfect. And it's like, how is AI going to do that? Like, they could give you all the ideas, but...

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And humans are actually very good at reading or judging between two things. This goes back to the core of what RLHF and preference tuning are: it's hard to generate a good answer for a lot of problems, but it's easy to see which one is better. And that's how we're using humans for AI now, judging which one is better. And that's what software engineering could look like.
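
The pairwise idea described here, that judging is easier than generating, is usually formalized as a Bradley-Terry-style loss over a human's "A is better than B" votes. A minimal sketch (function name and numbers are illustrative, not any particular library's API):

```python
import math

def preference_loss(score_chosen: float, score_rejected: float) -> float:
    """Bradley-Terry pairwise loss: push the reward model to score the
    human-preferred answer above the rejected one."""
    # -log(sigmoid(chosen - rejected)); shrinks as the margin grows
    margin = score_chosen - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The human judge never writes the good answer, only picks between two:
# a correct ordering with a wide margin gives a small loss,
# the reversed ordering gives a large one.
small = preference_loss(2.0, 0.0)
large = preference_loss(0.0, 2.0)
```

This is why preference data is cheap to collect relative to demonstrations: the label is a single comparison, not a full reference answer.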

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

The PR review: here are a few options, here are some potential pros and cons. And they're going to be judges.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

I'll explain what a Tulu is. A Tulu is a hybrid camel you get when you breed a dromedary with a Bactrian camel. Back in the early days after ChatGPT, there was a big wave of models coming out, like Alpaca, Vicuna, et cetera, that were all named after various mammalian species. So Tulu, the brand, is multiple years old, which comes from that.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

We've been playing at the frontiers of post-training with open-source code. The first part of this release was in the fall, where we built on Llama's open models, open-weight models, and then added in our fully open code and our fully open data. There's a popular benchmark, Chatbot Arena, and that's generally the metric by which these chat models are evaluated.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And it's humans comparing random models from different organizations. If you looked at the leaderboard in November or December, among the top 60 models, from tens of organizations, none of them had open code or data for just post-training. Even fewer, or none, have pre-training data and code available, but post-training is much more accessible at this time.
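
Leaderboards built from these anonymous head-to-head votes turn pairwise preferences into a ranking. A minimal online Elo update sketches the idea; this is illustrative only, since Chatbot Arena's published rankings are computed with a Bradley-Terry fit over all votes rather than sequential Elo:

```python
def elo_update(r_winner: float, r_loser: float, k: float = 32.0):
    """One rating update from a single human vote in a head-to-head.
    Illustrative sketch, not Chatbot Arena's exact aggregation method."""
    # Probability the eventual winner was expected to win, given ratings
    expected_win = 1.0 / (1.0 + 10 ** ((r_loser - r_winner) / 400.0))
    delta = k * (1.0 - expected_win)  # winner gains what the loser drops
    return r_winner + delta, r_loser - delta

# Hypothetical models starting at equal ratings; one vote moves both.
ratings = {"model_a": 1000.0, "model_b": 1000.0}
ratings["model_a"], ratings["model_b"] = elo_update(
    ratings["model_a"], ratings["model_b"]
)
```

An upset (a low-rated model beating a high-rated one) moves the ratings more than an expected win, which is what makes a few thousand votes enough to separate dozens of models.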

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

It's still pretty cheap and you can do it. And the thing is, how high can we push this number when people have access to all the code and data? So that's kind of the motivation of the project. We draw on lessons from Llama. NVIDIA had a Nemotron model where the recipe for their post-training was fairly open, with some data and a paper.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And it's putting all of these together to try to create a recipe with which people can fine-tune models like GPT-4 to their domain.