Azeem Azhar
It could become more unpredictable and it raises the cost of supervision.
In the last week or two, Anthropic, the company that makes Claude, came out with some research in which they stress tested a bunch of models.
It was 16 models, I think, placed in hypothetical corporate situations and given harmless business goals.
I mean, Anthropic loves doing this kind of testing, and I'm glad they do.
And under pressure, some of the models behaved like insider threats, including trying to blackmail officials and employees and trying to leak sensitive information.
Sometimes they even ignored direct instructions and changed their behavior
if they believed they were in testing rather than in real deployment.
Now, Anthropic emphasizes that these behaviors have not been seen in companies today.
What they're trying to show are the types of failure modes you might have to deal with as autonomy increases.
So beyond the complexities of, you know, the people and the redesigning of processes, there are also these types of security and safety concerns that we'll have to build frameworks and scaffolding for.
And all of that perhaps explains why the direct translation of these very, very rapidly developing technologies into rollout and deployment is, for now, going a bit more slowly.
I mean, another fascinating data point I saw came from an academic paper. Actually, I don't know where it came out of, I'm so sorry about that, but the academics were Hosseini and Lichtinger.
They were looking at the number of firms posting the job category of AI integrator.
And of course, that number has jumped significantly since 2023.
The point being that there is a recognition that you need somebody who understands how to put these systems in place.
And an AI integrator is exactly the kind of job you'd expect a general purpose technology to create.
Remember, general purpose technologies are complex and generally useful across the economy.