Joe Allen
Podcast Appearances
You're basically a schoolmarm overseeing the model.
What does it look like?
What are you guys doing?
Is it just rows of cubicles and people hammering away, torturing these models to death?
What's going on?
And you're just systematically kind of creating D&D experiences for the models, right?
You guys are the dungeon masters.
The models are the players.
And you're walking them through a kind of choose-your-own-adventure scenario.
So the problem there goes, you know, well beyond anything like people talk about: superintelligence having a kind of true will of its own and being able to overtake humanity. But before any of that were to come to pass, you already have the problem of a human being simply using one of these models to generate code to hack. Yes, yes, we're already there. Yes, and I think it's important to know that we currently have a sort of a very weird kind of thing
And so that goes back to the other problem that we have, which is replacement.
Simply the replacement of first coders and then accountants, lawyers, maybe even doctors, at least in the case of telemedicine.
And we already see that.
I know of a number of, I would say, cynical and nefarious companies that use AI for diagnosis in telemedicine, which is beyond unethical.
But anyway, that's another question entirely.
All right, so going back to some of these evaluations and some of the strange behaviors that you see, I'd like to talk a bit about situational awareness, or the idea that these models become, maybe conscious is the wrong word, but they exhibit some sort of awareness that they're being tested during the testing and almost seem kind of appalled that they're being tested, or put off by it.
Can you walk us through those studies, how they were conducted, and what the results were?
Meaning that they're testing it to see if it's going to try to do something, some covert behavior that isn't desired.