Grant Harvey
Podcast Appearances
Yeah.
That makes sense.
What about harnesses, I guess?
Because we were hearing a lot about harnesses in the context of agents and benchmarking, and we're wondering...
Yeah, if it's worth companies building their own versus something off the shelf.
I mean, do you have any insight into that and your perspective there?
So why are eval sets so hard to create?
I was going to say, you should hire me then because I feel like I'm a very lazy, clever person.
So I feel like I would try to find the easiest way to do stuff.
Technically it passes, yeah.
It did what you told it.
I know which one you're talking about.
I like that.
What's your recommendation for someone who, you know, is maybe starting to build their own evals and assess things? Like, what do you do, or what's your perspective on the best way to eval or write evals, I guess?
Yeah, I love it.
Guilty of that for sure.
Oh, well, this leads me to ask another question, which is, do RL environments eventually replace benchmarks, like, in terms of agentic settings?
Like, what's your take there?
So you're benchmarking it and you're saying, hey, this is what we're seeing and this is where you really need some help.
And then that's where you kind of... You need some law and some creativity.