We don't yet have the equivalent of move 37, the famous move where DeepMind's AI playing Go stunned Lee Sedol completely. We don't have something that's that level of a focal point, but that doesn't mean the approach to the technology, or the impact of this kind of training, is any different. It's still incredibly new. What do you think that point would be?
What would be move 37 for chain of thought, for reasoning? Scientific discovery, when you use this sort of reasoning model on a problem and it comes up with something we fully don't expect.
All the math and code benchmarks were pretty much solved, except for FrontierMath, which is designed to be questions that are almost impractical for most people, because they're exam-level, open-math-problem-type things. So it's on the math problems that are somewhat reasonable, which is somewhat complicated word problems or coding problems. It's just what Dylan is saying.
The bank account can't lie. Exactly. There's surprising evidence that once you set up the ways of collecting verifiable rewards in a domain, this can work. There's been a lot of research before R1 on math problems, where people approached math with language models just by increasing the number of samples. So you can just try again and again and again, and you look at the number of times the language models get it right. And what we see is that even very bad models get it right sometimes. And the whole idea behind reinforcement learning is that you can learn from very sparse rewards.
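To make the try-again-and-again idea concrete, here is a minimal sketch of repeated sampling against a verifiable answer. Everything in it is a hypothetical stand-in: generate() fakes a model call, and is_correct() is a bare exact-match grader rather than a real math verifier.

```python
import random

def generate(prompt: str) -> str:
    """Hypothetical stand-in for sampling one completion from a model."""
    # A real implementation would call a language model with temperature > 0.
    return random.choice(["42", "41", "I don't know"])

def is_correct(completion: str, answer: str) -> bool:
    """Verifiable reward: exact match against a known final answer."""
    return completion.strip() == answer

def pass_at_k(prompt: str, answer: str, k: int = 64) -> bool:
    """Try again and again: the problem counts as solved if any of k samples is right."""
    return any(is_correct(generate(prompt), answer) for _ in range(k))

# Even a model that is right only, say, 2% of the time per sample
# solves most problems within 64 tries: 1 - 0.98**64 is roughly 0.73.
solved = sum(pass_at_k("What is 6 * 7?", "42") for _ in range(100))
print(f"solved {solved}/100 problems with repeated sampling")
```

That arithmetic is the point being made above: a model that is rarely right per sample still gets it right sometimes, and those occasional successes are exactly the sparse signal RL can learn from.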
So it doesn't... The space of language, the space of tokens, whether you're generating language or tasks for a robot, is so big. I mean, the tokenizer for a language model can have something like 200,000 entries, so at each step it can sample from that big of a space. So if it can generate a bit of a signal, it can climb onto it.
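For scale, here is what one decoding step looks like over a vocabulary that size. This is a minimal numpy sketch, with random logits standing in for what a real model would compute from the context.

```python
import numpy as np

VOCAB_SIZE = 200_000  # roughly the vocabulary size mentioned above

# Hypothetical logits for one decoding step (a real model produces these
# from the context; here they are random just to show the shapes involved).
rng = np.random.default_rng(0)
logits = rng.normal(size=VOCAB_SIZE)

# Softmax turns the logits into a probability distribution over every token.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# One decoding step is one draw from a 200,000-way categorical distribution.
next_token_id = rng.choice(VOCAB_SIZE, p=probs)
print(next_token_id)
```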
That's what the whole field of RL is built around: learning from sparse rewards. And the same thing has played out in math, where very weak models sometimes generate right answers, and you already see research showing that you can boost their math scores. You can do this sort of RL training.
For math, it might not be as effective, but if you take a 1-billion-parameter model, so something 600 times smaller than DeepSeek, you can boost its grade school math scores very directly with a small amount of this training. So it's not to say that this is coming soon.
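A deliberately tiny sketch of what learning from a sparse, verifiable reward looks like: a single categorical policy over a toy four-token answer vocabulary, trained with REINFORCE, where the reward is 1 only when the sampled answer matches the known solution. The toy vocabulary, learning rate, and one-step setup are all illustrative assumptions, not anyone's actual training recipe.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = ["41", "42", "43", "44"]
CORRECT = VOCAB.index("42")

logits = np.zeros(len(VOCAB))  # start from a uniform policy
lr = 0.5

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for step in range(200):
    probs = softmax(logits)
    action = rng.choice(len(VOCAB), p=probs)
    reward = 1.0 if action == CORRECT else 0.0  # sparse, verifiable reward

    # REINFORCE: grad of log pi(action) w.r.t. logits = one_hot(action) - probs
    grad = -probs
    grad[action] += 1.0
    logits += lr * reward * grad  # only correct samples move the policy

print(softmax(logits))  # probability mass concentrates on "42"
```

Even though most samples earn zero reward and trigger no update at all, the occasional correct sample is enough to pull probability mass onto the right answer, which is the sparse-reward dynamic described above.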