David Duvenaud
And they're sort of like affording many of the same dangers in the long run.
And so they're trying to say, okay, well, yeah, let's try to bind.
Let's try to like come up with some creative mechanism to bind ourselves to only do the good thing.
It's not clear what that mechanism could look like though.
Absolutely, yeah.
Yeah, I guess one of the other things that we're trying to flesh out is actually the details of, wait, wait, wait, how do we actually lose control if we have the aligned AGIs?
And I've tried to make this pitch, but obviously we've just started thinking about this.
And so actually me and my co-authors have been working with a math scholar, Gideon Fuderman, who's been trying to
make a more detailed case of like, here are all the avenues by which we end up in a situation where we actually could shut down the AI or modify it or whatever, but we still end up building institutions that don't serve us.
And so trying to flesh that out, I feel like there should be a lot more work in that direction.
Yeah, so I think it's basically beyond their scope to address, but it's not beyond their scope to monitor and sort of like help us understand what's happening.
So, I mean, I really like the Anthropic Economic Index or like where they're trying to say, like, what jobs are people actually doing and how are these models being used?
And I think more of that and like for more companies and more expansive is going to actually help people understand what's happening and just these dynamics a little bit better.
I will say that all the people that I talk to, in general, people are sympathetic to this.
I mean, some people are kind of a bit head in the sand or dismissive, but I think a lot of people there are just like, oh yeah, that's a huge problem.
It's not really clear how a single company can deal with it.
And they end up doing this thing where the RSP, or whatever safety commitments, address these very acute catastrophic risks.
And then there's just a whole bunch of slower, more systemic ways that things go wrong, and it's not clear what to do about those. Like, if everyone's getting AI girlfriends and boyfriends, how is a single company supposed to address that?
Right.