Nick Heiner
๐ค SpeakerAppearances Over Time
Podcast Appearances
So if it's way too hard, there's nothing for the model to learn.
Like, because it can't even, you know, to take back to the golfing example, um,
Let's say that you're on like some, I don't know, this is like the limits of my golf knowledge, but I feel like on Pebble Beach, like there's some part where like you have the option to hit the ball like totally over the water.
And let's say that you're not, like you're a total beginner golfer and nothing you do gets over the water.
Like you're just, it's, you can try a hundred times.
You're not going to get it once.
You're not learning from that because you never have the successful attempt to say, oh, that was the thing I needed to do.
So much the same way.
If it's too hard, you're not going to learn.
And if it's too easy, you're not going to learn because you're not pushing yourself.
So we have to test it on real models today and calibrate it.
And then as the models evolve, we need to evolve with them and continue to make the environments harder.
And the way that we test all this is by doing our own training runs.
So we train with open weights models and we see the improvements that our environments have.
And then...
You know, what you really want to see is not just that improves on your own environment, but that it improves on other benchmarks as well.
So that there's like some transferability of like we're teaching general skills and not just like some niche thing that maybe isn't realistic.
Yeah.
So I think you can get quite far without anything in the physical world.
Cool.