Aman Sanger
And then you can use that synthetic data to train a model that's really good at detecting bugs. The last category, I think, is the main one the big labs seem to be doing for synthetic data, which is producing text with language models that can then be verified easily.
An extreme example of this: if you have a verification system that can detect whether text is Shakespeare-level, and then you have a bunch of monkeys typing on typewriters, you can eventually get enough training data to train a Shakespeare-level language model.
And this is very much the case for math, where verification is actually really, really easy for formal languages. What you can do is have an okay model generate a ton of rollouts, choose the ones that actually proved the ground-truth theorems, and then train on those further.
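A minimal sketch of that loop, assuming hypothetical `generate_proofs` (the model sampling candidate proofs) and `formally_verifies` (an exact checker such as a wrapped Lean or Coq verifier) helpers, neither of which comes from the conversation itself:

```python
# Rejection-sampling sketch: keep only rollouts the formal checker accepts.
# `generate_proofs` and `formally_verifies` are hypothetical stand-ins.
from typing import Callable, List, Tuple

def collect_verified_rollouts(
    theorems: List[str],
    generate_proofs: Callable[[str, int], List[str]],   # sample k candidate proofs per theorem
    formally_verifies: Callable[[str, str], bool],      # exact check: (theorem, proof) -> accepted?
    rollouts_per_theorem: int = 64,
) -> List[Tuple[str, str]]:
    """Return (theorem, proof) pairs that the formal verifier accepts."""
    verified: List[Tuple[str, str]] = []
    for theorem in theorems:
        for proof in generate_proofs(theorem, rollouts_per_theorem):
            if formally_verifies(theorem, proof):
                verified.append((theorem, proof))
    return verified

# The surviving pairs become fine-tuning data for the next iteration of the
# model, which should then prove more theorems per batch of rollouts.
```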
There are similar things you can do for code with LeetCode-like problems, where you have some set of tests such that passing them means the problem has actually been solved. You can do the same thing: verify that the output passes the tests, and then train the model on the outputs that passed.
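The same filter sketched for code, under the assumption that each problem ships a test suite; `sample_solutions` and `run_tests` (a sandboxed runner) are hypothetical helpers, not anything named in the conversation:

```python
# Keep only candidate programs that pass the problem's full test suite.
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Problem:
    prompt: str   # natural-language problem statement
    tests: str    # test harness the solution must pass

def filter_passing_solutions(
    problems: List[Problem],
    sample_solutions: Callable[[str, int], List[str]],  # sample k candidate programs per prompt
    run_tests: Callable[[str, str], bool],              # (solution_code, tests) -> all tests passed?
    samples_per_problem: int = 32,
) -> List[Tuple[str, str]]:
    """Return (prompt, solution) pairs whose solutions pass every test."""
    kept: List[Tuple[str, str]] = []
    for problem in problems:
        for candidate in sample_solutions(problem.prompt, samples_per_problem):
            if run_tests(candidate, problem.tests):
                kept.append((problem.prompt, candidate))
                break  # one verified solution per problem is enough for training data
    return kept
```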
I think it's going to be a little tricky getting this to work in all domains, or just in general. Having a perfect verifier feels really, really hard for open-ended, miscellaneous tasks you give the model, or for longer-horizon tasks, even in coding.
Yeah. Verification feels like it's best when you know for a fact that it's correct. And then it wouldn't be a language model doing the verifying; it would be tests or formal systems. Or actually running the thing, too.
Yeah, yeah.