Nick Heiner
Yeah, so it's similar to the question of building our own environments in the first place.
Like if you're going to deploy a solution, you need an eval.
If you don't have an eval, then you're basically just going off vibes.
You're flying blind.
And if what you're building matters, if it matters whether it's correct, and if it has more than a tiny surface area, there's just no way you can make a change to the system, try it five times, and say, "Okay, it's better now."
That's just not going to be good enough for these business use cases.
So you need to have some means of evaluating what you're doing.
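As a rough illustration of what "some means of evaluating" could look like, here is a minimal sketch of an eval harness in Python. Everything in it is hypothetical: the `EVAL_SET` cases, the `baseline` and `candidate` stand-in systems, and the exact-match scoring are assumptions for illustration, not anything the speaker describes using.

```python
import operator
from typing import Callable, List, Tuple

# Hypothetical eval set: (input, expected output) pairs.
EVAL_SET: List[Tuple[str, str]] = [
    ("2 + 2", "4"),
    ("10 / 2", "5"),
    ("3 * 3", "9"),
    ("7 - 4", "3"),
]

def pass_rate(system: Callable[[str], str], cases: List[Tuple[str, str]]) -> float:
    """Fraction of eval cases where the system's answer matches exactly."""
    passed = sum(1 for prompt, expected in cases if system(prompt).strip() == expected)
    return passed / len(cases)

def baseline(prompt: str) -> str:
    # Stand-in for the current system: only handles addition.
    left, op, right = prompt.split()
    return str(int(left) + int(right)) if op == "+" else "?"

def candidate(prompt: str) -> str:
    # Stand-in for the changed system: handles all four operators.
    ops = {"+": operator.add, "-": operator.sub,
           "*": operator.mul, "/": operator.floordiv}
    left, op, right = prompt.split()
    return str(ops[op](int(left), int(right)))

if __name__ == "__main__":
    print("baseline: ", pass_rate(baseline, EVAL_SET))
    print("candidate:", pass_rate(candidate, EVAL_SET))
```

The point of the sketch is that the same fixed set of cases scores both versions, so "it's better now" becomes a number you can compare rather than an impression from a handful of manual tries.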
And then once you have that, yeah, maybe you're using sort of a community agent harness.
A lot of them are very customizable, if you want to be sophisticated.
Or maybe you're just using a drop-in solution.
Like, you know, there are drop-in customer support agents.
I think depending on the company's degree of technical sophistication and how bespoke their problems are, you know, all of those things can make sense.
But the way that they know where they need to be is having that great eval set.
Because otherwise, yeah, you're just guessing.
So one big challenge is reward hacking, which is where models cleverly find ways to get a high score out of your environment's reward signal without actually doing the thing you want them to do.
They sort of follow the letter, but not the spirit of the law.
And so, for instance, maybe you have tried to do behavior modification on a small child:
you say something like, "Stop hitting your sister," and the child responds by kicking instead.
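The hitting-versus-kicking analogy can be sketched as a toy reward function in Python. This is only an illustration under assumed names: `proxy_reward`, `intended_reward`, and the `HARMFUL` action set are hypothetical and do not come from any real training environment.

```python
from typing import List

# Hypothetical set of actions we actually want to discourage.
HARMFUL = {"hit", "kick", "push"}

def proxy_reward(actions: List[str]) -> float:
    """The rule as literally written: only 'hit' is penalized."""
    return 0.0 if "hit" in actions else 1.0

def intended_reward(actions: List[str]) -> float:
    """What was actually wanted: no harmful action of any kind."""
    return 0.0 if HARMFUL & set(actions) else 1.0

# Follows the letter of the rule but not its spirit.
hacking_episode = ["kick", "share toy"]
# Does what was actually wanted.
honest_episode = ["share toy"]

if __name__ == "__main__":
    for name, episode in [("hacking", hacking_episode), ("honest", honest_episode)]:
        print(name, proxy_reward(episode), intended_reward(episode))
```

The hacking episode gets full marks from the proxy while scoring zero on the intended objective, which is the gap a reward-hacking model exploits.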