
Dylan Patel

👤 Speaker
3551 total appearances


Podcast Appearances

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

For math, it might not be as effective, but if you take a 1 billion parameter model, so something 600 times smaller than DeepSeek, you can boost its grade school math scores very directly with a small amount of this training. That's not to say that this is coming soon.
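
On one plausible reading, the "small amount of this training" is supervised fine-tuning on reasoning traces distilled from a stronger model. Below is a minimal sketch of that step; the model name and the single training example are illustrative placeholders, not DeepSeek's actual data or recipe.

```python
# Minimal sketch (not DeepSeek's actual recipe): supervised fine-tuning of a
# small model on chain-of-thought traces from a stronger teacher. The model
# name and the single example are illustrative placeholders.
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "my-org/tiny-1b-base"  # hypothetical ~1B-parameter base model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL)

# Each example pairs a grade-school math problem with a worked trace.
traces = [
    {"question": "Natalia sold 48 clips in April and half as many in May. "
                 "How many clips did she sell in total?",
     "trace": "April: 48. May: 48 / 2 = 24. Total: 48 + 24 = 72. Answer: 72."},
]

def collate(batch):
    texts = [f"Q: {ex['question']}\nA: {ex['trace']}" for ex in batch]
    enc = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    enc["labels"] = enc["input_ids"].clone()  # standard causal-LM objective
    return enc

loader = DataLoader(traces, batch_size=1, collate_fn=collate)
optim = torch.optim.AdamW(model.parameters(), lr=1e-5)

model.train()
for batch in loader:  # one pass over the (tiny) trace dataset
    loss = model(**batch).loss  # next-token loss over question + trace
    loss.backward()
    optim.step()
    optim.zero_grad()
```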

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Setting up the verification domains is extremely hard and there's a lot of nuance in this, but there are some basic things we have seen before where it's at least plausible that there's a domain and a chance that this works.
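
To make "verification domains" concrete: below is a minimal sketch of a verifiable reward for grade-school math, where a completion's final answer can be checked mechanically. Real verifiers need far more nuance (equivalent forms, units, extraction failures), which is exactly the difficulty described above.

```python
# A minimal sketch of a "verifiable domain": grade-school math, where the
# final numeric answer in a completion can be checked exactly against a
# reference. Real verifiers handle many more edge cases than this.
import re

def extract_answer(completion: str) -> str | None:
    """Pull the last number out of a model completion."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return numbers[-1] if numbers else None

def verify(completion: str, gold: str) -> float:
    """Binary reward: 1.0 if the extracted answer matches the reference."""
    pred = extract_answer(completion)
    return 1.0 if pred is not None and float(pred) == float(gold) else 0.0

print(verify("48 + 24 = 72. The answer is 72.", "72"))  # 1.0
print(verify("The answer is 70.", "72"))                # 0.0
```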

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Something I would say about these reasoning models is that we talked a lot about reasoning training on math and code. What is done is that you take the base model we've talked about, trained on a lot of the internet, and you do this large-scale reasoning training with reinforcement learning.
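
A toy sketch of the loop being described: sample completions from a policy, score them with a verifier, and reinforce the rewarded ones. This uses plain REINFORCE on a stand-in model; production reasoning RL uses PPO/GRPO-style methods on real language models at vastly larger scale, so every name below is illustrative.

```python
# Toy RL-with-verifiable-rewards loop: sample a completion token by token,
# score it with a binary verifier, and push up the log-probability of
# rewarded samples (plain REINFORCE with a constant baseline).
import torch
import torch.nn as nn

VOCAB, MAXLEN = 16, 8

class TinyPolicy(nn.Module):
    """A stand-in for the pretrained base model."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, 32)
        self.rnn = nn.GRU(32, 32, batch_first=True)
        self.head = nn.Linear(32, VOCAB)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)  # logits for the next token

def verifier(sequence) -> float:
    # Placeholder "verifiable reward"; imagine checking a math answer.
    return 1.0 if sum(sequence) % 2 == 0 else 0.0

policy = TinyPolicy()
optim = torch.optim.Adam(policy.parameters(), lr=1e-3)

for step in range(100):
    tokens = torch.zeros(1, 1, dtype=torch.long)  # BOS token
    logps = []
    for _ in range(MAXLEN):  # sample a completion token by token
        logits = policy(tokens)[:, -1]
        dist = torch.distributions.Categorical(logits=logits)
        tok = dist.sample()
        logps.append(dist.log_prob(tok))
        tokens = torch.cat([tokens, tok.unsqueeze(0)], dim=1)
    reward = verifier(tokens[0, 1:].tolist())
    # Centered REINFORCE: reinforce above-baseline samples, suppress others.
    loss = -(reward - 0.5) * torch.stack(logps).sum()
    loss.backward()
    optim.step()
    optim.zero_grad()
```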

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And then what DeepSeek detailed in this R1 paper, which for me is one of the big open questions of how you do this, is that they did... reasoning-heavy but very standard post-training techniques after the large-scale reasoning RL.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

So they did the same things with a form of instruction tuning through rejection sampling, which is essentially heavily filtered instruction tuning with some reward models. And then they did this RLHF, but they made it math-heavy. As for some of this transfer, we looked at this philosophical example early on; one of the big open questions is how much does this transfer?
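
A minimal sketch of rejection sampling as described: generate several candidates per prompt, score them with a reward model, and keep only the top-scoring ones as instruction-tuning data. The `generate` and `reward_model` functions below are stubs standing in for a real language model and a real learned reward model.

```python
# Rejection sampling for instruction tuning: over-generate, score with a
# reward model, keep only the best completions as fine-tuning data.
import random

def generate(prompt: str, n: int) -> list[str]:
    return [f"{prompt} -> candidate {i}" for i in range(n)]  # stub LM

def reward_model(prompt: str, completion: str) -> float:
    return random.random()  # stub score; a real RM is a learned network

def rejection_sample(prompts, n_candidates=8, keep_top=1):
    sft_data = []
    for prompt in prompts:
        candidates = generate(prompt, n_candidates)
        scored = sorted(candidates,
                        key=lambda c: reward_model(prompt, c),
                        reverse=True)
        # "Heavily filtered": only the top-scoring completions survive.
        for best in scored[:keep_top]:
            sft_data.append({"prompt": prompt, "completion": best})
    return sft_data

dataset = rejection_sample(["Prove that 2 + 2 = 4."])
print(dataset)
```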

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

If we bring in domains after the reasoning training, are all the models going to become eloquent writers by reasoning? Is this philosophy stuff going to open up? We don't know from the research how much this will transfer. There are other things about how we can make soft verifiers and things like this. But there is more training after reasoning, which makes it easier to use these reasoning models.
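
The excerpt doesn't define "soft verifiers," but one reasonable reading is a graded score in place of a hard pass/fail check, for instance blending exact-match correctness with a learned judge's rating. The judge below is a stub; a real system might use a reward model or an LLM grader.

```python
# Sketch of a "soft" verifier under the assumption above: a continuous
# score that blends a hard correctness check with a learned judge's
# rating, rather than a binary pass/fail signal.
def exact_match(completion: str, gold: str) -> float:
    return 1.0 if gold in completion else 0.0

def judge_score(completion: str) -> float:
    # Stub for a learned scorer; imagine a reward model's scalar output
    # rating clarity or style, normalized to [0, 1].
    return min(1.0, len(set(completion.split())) / 20)

def soft_verify(completion: str, gold: str, weight: float = 0.7) -> float:
    """Weighted blend: mostly correctness, partly graded quality."""
    hard = exact_match(completion, gold)
    return weight * hard + (1 - weight) * judge_score(completion)

print(soft_verify("The answer is 72 because 48 + 24 = 72.", "72"))
```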

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And that's what we're using right now, and what we're going to talk about with o3-mini and o1. These have gone through these extra techniques that are designed for human preferences after being trained to elicit reasoning.
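
One widely used family of "extra techniques designed for human preferences" is direct preference optimization (DPO). The sketch below implements the standard DPO loss (Rafailov et al., 2023); the excerpt does not say which preference-tuning method any particular lab actually runs at this stage.

```python
# Standard DPO loss: given sequence log-probs of a preferred ("chosen") and
# a dispreferred ("rejected") answer under the policy and a frozen
# reference model, push the policy toward the preferred answer.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()

# Dummy sequence log-probabilities for a batch of two preference pairs.
loss = dpo_loss(torch.tensor([-12.0, -9.5]), torch.tensor([-14.0, -11.0]),
                torch.tensor([-12.5, -10.0]), torch.tensor([-13.5, -10.5]))
print(loss)  # small positive loss; shrinks as the policy favors "chosen"
```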
