Corey Knowles
Podcast Appearances
That's a mouthful.
Are there things that you're doing in an effort to get the model to show more of its work?
You can't assess what you can't measure.
Yeah, right.
Yeah.
What are the warning signs maybe that a system's reasoning is becoming less monitorable over time?
Is there a way to recognize that that's happening?
I guess that makes sense.
Something that interested me was that you tested whether penalizing bad thoughts during training would stop that kind of misbehavior.
And I wondered if you could tell us a little about what you found there.
It's not a flawless process.
Yeah.
That makes sense.
I have a question I've wanted to ask a safety researcher for quite some time.
And it is: does the concept of open source AI concern you in any way, as someone who's focused on safety and vulnerability?
And why or why not?
That's a thing I have wondered for so long, and I really appreciate you sharing your personal opinion there.
Because I always thought, like, at its core, I try to be a good open source promoter.
And at the same time, I'm like, you know, you kind of lose the button, though.
You don't have the off switch anymore.