David Duvenaud
๐ค SpeakerAppearances Over Time
Podcast Appearances
This question doesn't really answer this question of like, do we feel like we've been hacked by our current standards or did we actually successfully build like even better beings that we're supporting?
Sure.
So I guess the different companies have different names for this.
Like I think OpenAI talks about the model spec and I thought I talked about constitutional AI and the idea being that a lot of the alignment that happens for the AI is you ask it to follow some principles like, you know, be nice to the user, don't do something illegal and practice like producing outcomes that follow this constitution.
This also, you know, the system prompt matters a lot.
It's just like what values are being loaded into the AI.
Yeah, and I mean, I think there was an executive order like a few weeks ago against like Woke AI, right?
And I think like people are already recognizing that this is an important cultural battleground.
I mean, part of the way that I got started on this whole journey was asking, you know, people at different labs, like, okay, so we build the Aligned AGI.
How does it end up that it's running the country or the world or something?
It seems like the government's going to ask us for the version of the AI that never refuses to like do one of its orders.
And we end up being forced one way or another to align the AI to the government.
I'm hoping not to hyperstition this into being, but I feel like it's just obvious that this is going to become one of the most important cultural and political battlegrounds that people fight over.
Yes, that's maybe the only time we'll expect to see actually user-aligned AIs is when people actually physically own them.
Right.
Or like Ireland famously competes on the low corporate tax rates.
Right.
So we could imagine like some future Ireland saying, oh, you get to write your own system prompt in our country or something like that.
So that is another reason for hope is that people will demand this kind of freedom.
But I mean, even the like Ireland of the future will probably have a bit in there like don't conspire against Ireland or whatever.