Dylan Patel

Speaker
See mentions of this person in podcasts
3551 total appearances

Podcast Appearances

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

We'll get back to chain of thought in a second, which looks like a lot of tokens where the model is explaining the problem. The model will often break down the problem and be like, "Okay, they asked me for this. Let's break down the problem. I'm going to need to do this." And you'll see all of this being generated by the model. It'll come very fast in most user experiences.

These APIs are very fast, so you'll see a lot of tokens, a lot of words, show up really fast. It'll keep flowing on the screen, and this is all the reasoning process. And then eventually the model will change its tone in R1, and it'll write the answer, where it summarizes its reasoning process and writes an answer similar to the first type of model.

But in DeepSeek's case, which is part of why this was so popular even outside the AI community, you can see how the language model is breaking down problems. And then you get this answer on a technical side.

They train the model to do this specifically where they have a section, which is reasoning, and then it generates a special token, which is probably hidden from the user most of the time, which says, okay, I'm starting to answer. So the model is trained to do this two-stage process on its own. If you use a similar model in, say, OpenAI, OpenAI's user interface is...

trying to summarize this process for you nicely by kind of showing the sections that the model is doing. And it'll kind of click through, it'll say breaking down the problem, making X calculation, cleaning the result, and then the answer will come for something like OpenAI.
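
The two-stage output described here (a reasoning section, then a special token, then the answer) could be separated by a client roughly as follows. This is only a sketch under stated assumptions: the transcript does not name the token, so `END_OF_REASONING` is a hypothetical placeholder, and `split_reasoning` and `ModelOutput` are illustrative names rather than part of any real API.

```python
from typing import NamedTuple

# Hypothetical delimiter. The transcript only says the model emits
# "a special token" before the answer; the real token is not given.
END_OF_REASONING = "<|begin_answer|>"


class ModelOutput(NamedTuple):
    reasoning: str  # the chain-of-thought section
    answer: str     # the final answer shown to the user


def split_reasoning(raw: str, delimiter: str = END_OF_REASONING) -> ModelOutput:
    """Split raw model text into (reasoning, answer) at the first delimiter."""
    reasoning, sep, answer = raw.partition(delimiter)
    if not sep:
        # No delimiter found: treat the whole text as the answer.
        return ModelOutput("", raw.strip())
    return ModelOutput(reasoning.strip(), answer.strip())


raw_text = (
    "Okay, they asked me for this. Let's break down the problem."
    "<|begin_answer|>The answer is 4."
)
output = split_reasoning(raw_text)
print(output.reasoning)
print(output.answer)
```

A chat interface could then show `output.answer` directly and place `output.reasoning` behind a collapsible control, or summarize it into short step labels as the speaker describes.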

Yeah, so if you're looking at the screen here, what you'll see is a screenshot of the DeepSeek chat app. And at the top is "Thought for 151.7 seconds" with the dropdown arrow. Underneath that, if we were in an app that we were running, the dropdown arrow would have the reasoning.

It's going to have pages and pages of this. It's almost too much to actually read, but it's nice to skim as it's coming.
