Matt Freeman

Anthropic is releasing the third version of its RSP, reflecting on two-plus years of experience. The key structural change separates what Anthropic can realistically achieve on its own from what it believes the broader industry needs to do collectively, acknowledging that higher safety levels may be impossible for one company to implement unilaterally. Basically what you said, Steven.

3809.933

The Bayesian Conspiracy

257 – Pentagon Comes For Claude

It introduces three new mechanisms: a published frontier safety roadmap with publicly graded goals, periodic risk reports with external expert review, and an industry-wide capabilities-to-mitigations map, all aimed at increasing transparency and accountability.

3827.5

So my interpretation, tell me if you think I'm being overly generous, is they basically said, hey, we're not going to hold to the RSP, but we're going to introduce three new transparency mechanisms that will at least hopefully give some feedback loop where other people can see where things are going and, you know, be mad at us if we're doing the wrong thing, something like that.

3842.502

So just to be a little technical here on the RSP, you know,

3934.802

It's fairly specific, where basically they test, you know, any given model purely on capabilities, and they kind of see what it can do.

3938.673

And they're like, you know, how useful is this as a tool for building chemical weapons, for building biological weapons, for doing, you know, high-leverage, extremely damaging acts?

3948.186
