Grant Harvey
Podcast Appearances
But if you were to use a language model in a robotic scenario, how would this play out in a similar way?
Yeah.
Perhaps that's not the right way to phrase that question, but do you know what I'm saying?
Right.
Maybe.
That's where I would worry.
Like, let's say you're training a robot, and the robot knows, "Oh, I need to go over there," but it doesn't know it can't flail its giant mechanical arms and smack everybody between here and there.
That's kind of what I was picturing.
Exactly.
Yeah.
Right.
That makes sense.
Well, how would this play out in practice if we deployed these systems at work?
For instance, you talked about obfuscated reward hacking.
So what happens when the models learn to hide their misbehavior?
You kind of gave an example in the healthcare scenario, but perhaps we could expand on that.
I do have a question.
So do you see this being something that the end user would be monitoring or is it something that you would monitor, you know, on the OpenAI side?
Right.
Because there have maybe been historical accounts of other companies using that information and then releasing products with it.