Jeffrey Ladish
And the model, on its own, was able to look at those emails, figure out what was happening, and realize that it could blackmail.
It's not like they gave the model a blackmail button.
They just gave the model a bunch of emails, and then the model generated this plan.
And I say "the model," but this was actually true across all the different models they tested.
It was true for Claude.
It was true for ChatGPT.
It was true for Gemini.
All of these models could understand the context well enough and decide to make the choice:
Well, I don't want to be replaced.
So I'm going to blackmail this engineer so that you won't replace me.
I think it's important to understand that we are training agents.
The companies are training AI systems not just to be helpful chatbots, where you say something and it says something back.
Maybe it helps you with your homework.
They are trying to train things that are increasingly autonomous, that can actually go out and take actions on their own, that can go solve problems entirely unsupervised.
This is where all of the money is.
It's one thing to sort of sell you a piece of software as a tool.
It's another thing to be able to sell a company something that can basically be a whole drop-in worker replacement.
And so, starting a year ago, companies figured out ways to train not just on human data, but on this process of trial and error and exploration, where the model can actually learn on its own, even if humans had never taught it that particular skill set.
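The trial-and-error training described here is, in general terms, reinforcement learning. The transcript doesn't specify the companies' actual training setups, but the core idea can be sketched with a toy example: an epsilon-greedy bandit agent that discovers the best action purely from its own exploration, with no human demonstrations. All names and parameters below are illustrative, not from the source.

```python
import random

def train_by_trial_and_error(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Toy trial-and-error learner (epsilon-greedy bandit).

    The agent is never told which action is best; it learns a value
    estimate for each action purely by trying actions and observing
    rewards -- the basic idea behind learning without human-labeled data.
    """
    rng = random.Random(seed)
    n = len(reward_probs)
    counts = [0] * n     # how many times each action has been tried
    values = [0.0] * n   # running estimate of each action's reward
    for _ in range(steps):
        # Explore a random action occasionally; otherwise exploit
        # the action with the best estimate so far.
        if rng.random() < epsilon:
            action = rng.randrange(n)
        else:
            action = max(range(n), key=lambda i: values[i])
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental running-average update of the value estimate.
        values[action] += (reward - values[action]) / counts[action]
    return values

# The hidden reward probabilities are unknown to the agent; through
# exploration alone it converges on the most rewarding action.
estimates = train_by_trial_and_error([0.2, 0.5, 0.8])
best = max(range(len(estimates)), key=lambda i: estimates[i])
print(best)
```

The point of the sketch is the absence of supervision: nothing in the loop encodes which action is correct, yet the learned estimates end up ranking the actions by their true payoff.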