Aman Sanger

👤 Speaker

1050 total appearances

Appearances Over Time

Podcast Appearances

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

Like one interesting proof of concept for the learning this knowledge directly in the weights is with VS Code. So we're in a VS Code fork and VS Code, the code is all public. So these models in pre-training have seen all the code. They've probably also seen questions and answers about it, and then they've been fine-tuned and RLHFed to be able to answer questions about code in general.

6858.905 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

Like one interesting proof of concept for the learning this knowledge directly in the weights is with VS Code. So we're in a VS Code fork and VS Code, the code is all public. So these models in pre-training have seen all the code. They've probably also seen questions and answers about it, and then they've been fine-tuned and RLHFed to be able to answer questions about code in general.

6858.905 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

So when you ask it a question about VS Code, sometimes it'll hallucinate, but sometimes it actually does a pretty good job at answering the question. And I think like this is just by, it happens to be okay at it. But what if you could actually like specifically train or post train a model such that it really was built to understand this code base?

6882.508 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

So when you ask it a question about VS Code, sometimes it'll hallucinate, but sometimes it actually does a pretty good job at answering the question. And I think like this is just by, it happens to be okay at it. But what if you could actually like specifically train or post train a model such that it really was built to understand this code base?

6882.508 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

So when you ask it a question about VS Code, sometimes it'll hallucinate, but sometimes it actually does a pretty good job at answering the question. And I think like this is just by, it happens to be okay at it. But what if you could actually like specifically train or post train a model such that it really was built to understand this code base?

6882.508 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

It's an open research question, one that we're quite interested in. And then there's also uncertainty of like, do you want the model to be the thing that end to end is doing everything, i.e. it's doing the retrieval and its internals and then kind of answering the question, creating the code?

6905.151 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

It's an open research question, one that we're quite interested in. And then there's also uncertainty of like, do you want the model to be the thing that end to end is doing everything, i.e. it's doing the retrieval and its internals and then kind of answering the question, creating the code?

6905.151 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

It's an open research question, one that we're quite interested in. And then there's also uncertainty of like, do you want the model to be the thing that end to end is doing everything, i.e. it's doing the retrieval and its internals and then kind of answering the question, creating the code?

6905.151 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

Or do you want to separate the retrieval from the frontier model where maybe, you know, you'll get some really capable models that are much better than like the best open source ones in a handful of months? Yeah. And then you'll want to separately train a really good open source model to be the retriever, to be the thing that feeds in the context to these larger models.

6918.236 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

Or do you want to separate the retrieval from the frontier model where maybe, you know, you'll get some really capable models that are much better than like the best open source ones in a handful of months? Yeah. And then you'll want to separately train a really good open source model to be the retriever, to be the thing that feeds in the context to these larger models.

6918.236 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

Or do you want to separate the retrieval from the frontier model where maybe, you know, you'll get some really capable models that are much better than like the best open source ones in a handful of months? Yeah. And then you'll want to separately train a really good open source model to be the retriever, to be the thing that feeds in the context to these larger models.

6918.236 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

Is this... Yeah, I mean, there are many possible ways you could try doing it. There's certainly no shortage of ideas. It's just a question of going in and trying all of them and being empirical about which one works best. One very naive thing is to try to replicate what's done with VS Code and these frontier models.

6948.751 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

Is this... Yeah, I mean, there are many possible ways you could try doing it. There's certainly no shortage of ideas. It's just a question of going in and trying all of them and being empirical about which one works best. One very naive thing is to try to replicate what's done with VS Code and these frontier models.

6948.751 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

Is this... Yeah, I mean, there are many possible ways you could try doing it. There's certainly no shortage of ideas. It's just a question of going in and trying all of them and being empirical about which one works best. One very naive thing is to try to replicate what's done with VS Code and these frontier models.

6948.751 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

So let's continue pre-training, some kind of continued pre-training that includes general code data, but also throws in a lot of the data of some particular repository that you care about.

6969.429 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

So let's continue pre-training, some kind of continued pre-training that includes general code data, but also throws in a lot of the data of some particular repository that you care about.

6969.429 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

So let's continue pre-training, some kind of continued pre-training that includes general code data, but also throws in a lot of the data of some particular repository that you care about.

6969.429 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

And then in post-training, meaning in, let's just start with instruction fine-tuning, you have like a normal instruction fine-tuning data set about code, but you throw in a lot of questions about code in that repository. So you could either get ground truth ones, which might be difficult, or you could do what you kind of hinted at or suggested using synthetic data, i.e.,

6979.498 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

And then in post-training, meaning in, let's just start with instruction fine-tuning, you have like a normal instruction fine-tuning data set about code, but you throw in a lot of questions about code in that repository. So you could either get ground truth ones, which might be difficult, or you could do what you kind of hinted at or suggested using synthetic data, i.e.,

6979.498 View full episode →

Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

And then in post-training, meaning in, let's just start with instruction fine-tuning, you have like a normal instruction fine-tuning data set about code, but you throw in a lot of questions about code in that repository. So you could either get ground truth ones, which might be difficult, or you could do what you kind of hinted at or suggested using synthetic data, i.e.,

6979.498 View full episode →

← Previous Page 42 of 53 Next →

Report any issue