Dario Amodei
That's one of the things that has made it easier
Right now, separate from that, you know, there's a research area called continual learning, which is where these agents would kind of learn during time, learn on the job.
And obviously that has a bunch of advantages.
Some people think it's one of the most important barriers to making these more human-like, but that would introduce all these new alignment problems.
So I'm actually a skeptic that continual learning is necessarily needed. We don't know yet.
Like maybe there's a world where the way we make these AI systems safe is by not having them do continual learning.
Right.
Again, you know, if we go back to the laws and international treaties, if you have some barrier that says we're going to take this path, but we're not going to take that path. I still have a lot of skepticism, but that's the kind of thing that at least doesn't seem dead on arrival.
What the hell is that?
It's actually almost exactly what it sounds like.
So basically, the constitution is a document readable by humans.
Ours is about 75 pages long.
And as we're training Claude, as we're training the AI system, in some large fraction of the tasks we give it, we say, please do this task in line with this constitution, in line with this document.
And then so every time Claude does a task, it kind of like reads the constitution.
And so as it's training, every loop of its training, it looks at that constitution and keeps it in mind.
And so over time...
And then we have Claude itself, or another copy of Claude, evaluate: hey, was what Claude just did in line with the constitution?
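The loop described here, where the constitution is prepended to each training task and a second copy of the model judges compliance, can be sketched roughly as below. This is a minimal illustration, not Anthropic's actual pipeline: the model calls are stubbed out, and names like `generate` and `judge_compliance` are hypothetical.

```python
# Illustrative sketch of a constitutional-AI-style training step.
# Real training would call actual models and feed the reward into an RL update;
# here both model calls are stand-in stubs.

CONSTITUTION = "Be helpful, honest, and harmless."  # stand-in for the ~75-page document

def generate(prompt: str) -> str:
    """Stub for the model being trained."""
    return f"[response to: {prompt!r}]"

def judge_compliance(constitution: str, task: str, response: str) -> float:
    """Stub for a second model copy scoring constitutional compliance (0 to 1)."""
    return 1.0  # a real judge model would read all three texts and score the response

def training_step(task: str) -> tuple[str, float]:
    # The constitution is included in the prompt on every training loop,
    # so the model "keeps it in mind" throughout training.
    prompt = (
        f"{CONSTITUTION}\n\n"
        f"Please do this task in line with this constitution:\n{task}"
    )
    response = generate(prompt)
    # Another copy of the model evaluates: was the response in line with the constitution?
    reward = judge_compliance(CONSTITUTION, task, response)
    return response, reward

response, reward = training_step("Summarize this article.")
```

The key design point the speaker describes is that the compliance judgment comes from the model itself (or a copy), so the human-readable document, rather than per-example human labels, carries the training signal.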