Grant Harvey
Podcast Appearances
And then I guess just to close the loop on this, what is the difference between pre-training and post-training and reinforcement learning?
So what would the ultimate RL environment look like?
Would it be something like OpenClaw where the agent controls an entire computer?
Or would it be like an actual synthetic world, like a synthetic golf course, for example?
And you have something like Google's Genie.
Or is it robots navigating the real world?
And that's the ultimate environment.
What do you think?
Oh, wow.
Well, Surge builds RL environments like CoreCraft, right?
It's the simulated e-commerce company where AI agents work customer support.
Why did you focus on this area?
And you released a benchmark to go with it as well.
Is that right?
That makes sense.
And which labs are using Surge right now?
Wow.
Yeah.
That's really interesting.
The longer running the task is.