
Dylan Patel

👤 Speaker
3551 total appearances

Appearances Over Time

Podcast Appearances

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

It doesn't necessarily... There are things called system prompts, which, when you're querying a model, are a piece of text that is shown to the model but not to the user. A fun example: your system prompt could be "Talk like a pirate." No matter what the user says to the model, it'll respond like a pirate. In practice, what they are is... "You are a helpful assistant."
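The split described here can be sketched in a few lines. This is a minimal illustration of the widely used role/content message convention for chat models; no real API is called, and `build_request` is a hypothetical helper, not any provider's function.

```python
# Minimal sketch of how a system prompt sits alongside user messages in a
# typical chat-model request. The system message is part of what the model
# sees, but a chat UI would only display the user and assistant turns.

def build_request(system_prompt: str, user_message: str) -> list[dict]:
    """Assemble the message list sent to the model."""
    return [
        {"role": "system", "content": system_prompt},  # hidden from the user
        {"role": "user", "content": user_message},     # what the user typed
    ]

messages = build_request("Talk like a pirate.", "What's the weather today?")
```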

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

You should break down problems. If you don't know about something, don't tell them. Your date cutoff is this. Today's date is this. It's a lot of really useful context for how to answer a question well.
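The ingredients the speaker lists can be strung together as a template. This is a hypothetical sketch of that kind of prompt; the exact wording is an assumption, not any lab's real system prompt.

```python
from datetime import date

def make_system_prompt(knowledge_cutoff: str, today: date) -> str:
    """Assemble a system prompt from the kinds of context mentioned:
    problem-solving guidance, honesty about unknowns, the training
    cutoff, and today's date. Wording here is illustrative only."""
    return (
        "You are a helpful assistant.\n"
        "Break problems down into steps.\n"
        "If you don't know something, say so rather than guessing.\n"
        f"Your knowledge cutoff is {knowledge_cutoff}.\n"
        f"Today's date is {today.isoformat()}."
    )

prompt = make_system_prompt("2023-10", date(2025, 1, 1))
```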

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Yes, which I think is great. And there's a lot of research that goes into this, and one of your previous guests, Amanda Askell, is probably the most knowledgeable person, at least in the combination of execution and sharing. She's the person who should talk about system prompts and the character of models.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And you could use this for bad things. We've done tests like: what if I tell the model to be a dumb model? Do evaluation scores go down? And we'll see this behavior where it can sometimes say, oh, I'm supposed to be dumb.
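The kind of A/B test described can be sketched as a tiny harness: run the same eval questions under two system prompts and compare scores. Here `query_model` is a stand-in stub (an assumption for illustration), not a real model call, and the scoring is exact-match only.

```python
def query_model(system_prompt: str, question: str) -> str:
    # Placeholder stub: a real harness would call an actual model here.
    # The stub mimics the behavior described in the quote.
    if "dumb model" in system_prompt:
        return "I'm supposed to be dumb, so I won't answer properly."
    return "42"

def run_eval(system_prompt: str, qa_pairs: list[tuple[str, str]]) -> float:
    """Fraction of questions answered exactly correctly under a given
    system prompt."""
    correct = sum(
        query_model(system_prompt, q).strip() == a for q, a in qa_pairs
    )
    return correct / len(qa_pairs)

qa = [("What is 6 * 7?", "42")]
baseline = run_eval("You are a helpful assistant.", qa)
degraded = run_eval("You are a dumb model.", qa)
```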

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And sometimes it doesn't affect math abilities as much, but the quality, as a human would judge it, would drop through the floor. Let's go back to post-training, specifically RLHF, around Llama 2. Too much safety prioritization was baked into the model weights. That makes the model refuse things in a really annoying way for users. It's not great. It caused a lot of...

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

...awareness to be attached to RLHF, that it makes the models dumb, and it stigmatized the word. It did, in AI culture. And as the techniques have evolved, that's no longer the case: all these labs have very fine-grained control over what they get out of the models through techniques like RLHF. Although different labs are definitely at different levels. On one end of the spectrum is Google...

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And the important thing to say is that no matter how you want the model to behave, these RLHF and preference-tuning techniques also improve performance. So on things like math evals and code evals, there is something innate to these so-called contrastive loss functions. We could start to get into RL here.
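One concrete example of such a contrastive loss is the DPO (Direct Preference Optimization) objective, sketched below for a single preference pair. This is an illustrative scalar version, assuming summed log-probabilities of a chosen and a rejected response under the policy and a frozen reference model; a real implementation would operate on batched tensors.

```python
import math

def dpo_loss(
    logp_chosen: float,       # log prob of preferred response under the policy
    logp_rejected: float,     # log prob of dispreferred response under the policy
    ref_logp_chosen: float,   # same responses under the frozen reference model
    ref_logp_rejected: float,
    beta: float = 0.1,        # strength of the KL-style constraint
) -> float:
    """Contrastive preference loss for one (chosen, rejected) pair.
    The loss shrinks as the policy prefers the chosen response more
    strongly than the reference model does."""
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # -log sigmoid(beta * margin)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

With a positive margin (policy already prefers the chosen response relative to the reference) the loss falls below the zero-margin value of `-log 0.5 ≈ 0.693`, which is what drives the contrastive update.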
