
Aman Sanger

Speaker
1050 total appearances


Podcast Appearances

Lex Fridman Podcast
#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

Yeah. Verification, it feels like it's best when you know for a fact that it's correct. And then it wouldn't be using a language model to verify. It would be using tests or formal systems. Or running the thing, too.
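The kind of "ground-truth" verification described here can be sketched as actually executing a model's generated code against known tests, rather than asking a language model to judge it. Everything below (the `verify_candidate` helper, the toy candidates) is illustrative, not from any real system:

```python
def verify_candidate(src, func_name, tests):
    """Compile candidate source, then check it against input/output pairs.

    tests is a list of ((args...), expected) pairs. Any crash or wrong
    answer counts as a failed verification.
    """
    namespace = {}
    try:
        exec(src, namespace)  # run the generated code to define the function
        fn = namespace[func_name]
        return all(fn(*args) == expected for args, expected in tests)
    except Exception:
        return False

# A correct and an incorrect "generation" for a toy task:
good = "def add(a, b):\n    return a + b\n"
bad  = "def add(a, b):\n    return a - b\n"
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
```

The point of the sketch is that the verdict comes from running the program, so it is correct by construction whenever the test suite is.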

Yeah, yeah.

I think that's the category that is most likely to result in massive gains.

Yeah, so RLHF is when the reward model you use is trained from labels you've collected from humans giving feedback. I think this works if you have the ability to get a ton of human feedback for the kind of task you care about.
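The reward-modeling step described here can be sketched as fitting a scorer so that the response a human preferred scores higher than the one they rejected (a Bradley-Terry-style objective). The features and synthetic preferences below are made up purely for illustration:

```python
import math
import random

def train_reward_model(pairs, dim, lr=0.1, epochs=200):
    """Fit a linear reward from human preference pairs.

    pairs: list of (winner_features, loser_features); the loss per pair is
    -log sigmoid(reward(winner) - reward(loser)).
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for win, lose in pairs:
            margin = sum(wi * (a - b) for wi, a, b in zip(w, win, lose))
            # Gradient of -log sigmoid(margin) w.r.t. margin is -(1 - sigmoid),
            # so the descent step pushes the winner's score up:
            grad_coef = 1.0 / (1.0 + math.exp(margin))
            for i in range(dim):
                w[i] += lr * grad_coef * (win[i] - lose[i])
    return w

def reward(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

# Synthetic "human labels": labelers always prefer a larger first feature.
random.seed(0)
pairs = []
for _ in range(100):
    a = [random.random(), random.random()]
    b = [random.random(), random.random()]
    pairs.append((a, b) if a[0] > b[0] else (b, a))

w = train_reward_model(pairs, dim=2)
```

After training, the learned weights recover the labelers' preference: the first feature gets a positive weight, so preferred-style responses score higher.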

RLAIF is interesting because you're depending on the constraint that verification is actually a decent bit easier than generation. Because it feels like, okay, what are you doing? Are you using this language model to look at the language model's outputs and then improve the language model?

But no, it actually may work if the language model has a much easier time verifying some solution than it does generating it. Then you could perhaps get this kind of recursive loop. I don't think it's going to look exactly like that. The other thing we kind of do is a little bit of a mix of RLAIF and RLHF, where usually the model is actually quite correct.
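The verification-easier-than-generation asymmetry can be sketched as a best-of-N loop: sample many candidates from a generator and keep only those a cheap verifier accepts. `generate` and `verify` below are toy stand-ins for a model and a checker, chosen only so the loop is runnable:

```python
import random

def generate(rng):
    """Stand-in generator: proposes a candidate 'solution' (a number)."""
    return rng.randint(0, 100)

def verify(candidate):
    """Stand-in verifier: much simpler than generation, it only checks a
    property of the candidate rather than constructing one."""
    return candidate % 7 == 0

def best_of_n(n, seed=0):
    """Sample n candidates and return the ones that pass verification."""
    rng = random.Random(seed)
    return [c for c in (generate(rng) for _ in range(n)) if verify(c)]
```

As long as the verifier is reliable, the accepted candidates are correct even though the generator is noisy; those accepted samples are what a recursive training loop would feed back in.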

And this is in the case of Cursor Tab: picking between two possible generations, which is the better one. And then it just needs a little bit of human nudging, on the order of only 50 to 100 examples, to align that prior the model has with exactly what you want. It looks different than normal RLHF, where you're usually training these reward models on tons of examples.
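The inference-time step described here reduces to a two-way comparison: given exactly two candidate generations, return whichever a preference scorer ranks higher. In the setup above, that scorer would be the model's prior lightly tuned on tens of human comparisons; the `toy_score` below is a placeholder for it:

```python
def pick_better(candidate_a, candidate_b, score):
    """Choose between exactly two generations using a preference scorer."""
    return candidate_a if score(candidate_a) >= score(candidate_b) else candidate_b

# Toy stand-in scorer: pretend the human comparisons preferred shorter
# completions, so the score is just negative length.
toy_score = lambda text: -len(text)
```

Because only a binary pick is needed per example, a handful of comparisons is enough to nudge the scorer, rather than the large labeled datasets typical of full RLHF reward models.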