David Duvenaud
So one initiative that's happening: one of my co-authors is Deger, who is the CEO of Metaculus, and he and some other people are trying to make a gradual disempowerment index.
And I think there's just a lot of work that we can do in trying to operationalize these claims of, like, humans won't be able to advocate for their own interests, or this lever of power will be even more disconnected from human interests than it has been.
I think these are very vague claims, and a lot of them are very hard to operationalize, because you have to define what it means for a group to want something and talk about these counterfactuals.
So this is like a very hard problem.
But I think some of the most basic groundwork that needs to be done at this point is to clarify what we're even talking about.
Yeah.
So actually, I had the exact same thought.
And that leads me to one of the projects that I'm working on, like the actual technical projects that I'm working on, which is me and a few people, including Alec Radford, who's one of the creators of GPT and who's now sort of unemployed and just doing fun research projects, trying to train a historical LLM: an LLM that's only trained on data up to, let's say, 1930, and then maybe 1940, 1950.
And the idea being that, as you said, it's hard to operationalize these questions. Like, I don't know, what fraction of humans are employed?
It might not really matter or be the right question to ask.
What we'd rather ask is something more like, what is the future newspaper headline?
Or given a leader, what's their Wikipedia page or something like that?
It's more like freeform sort of things.
And the cool thing is that with LLMs, you can query them to predict this sort of thing, right?
Like, write me a newspaper headline from 2030 or whatever.
I mean, they're not going to do a good job unless they have a lot of scaffolding and specific training.
But we can validate that kind of scaffolding on historical data using these historical LLMs.
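To make the validation idea concrete, here is a minimal sketch of a backtest, not the project's actual method. It assumes a dated corpus, a hypothetical `forecast` callable standing in for a historical LLM plus its scaffolding, and a crude word-overlap score; the names `Document`, `split_by_cutoff`, `jaccard`, and `backtest` are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Document:
    year: int  # publication year
    text: str  # e.g. a newspaper headline


def split_by_cutoff(
    docs: List[Document], cutoff_year: int
) -> Tuple[List[Document], List[Document]]:
    """Split a corpus into what the model may see and what is held out."""
    seen = [d for d in docs if d.year <= cutoff_year]
    held_out = [d for d in docs if d.year > cutoff_year]
    return seen, held_out


def jaccard(a: str, b: str) -> float:
    """Crude word-overlap score between two headlines (0.0 to 1.0)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def backtest(
    forecast: Callable[[str, List[Document]], str],
    docs: List[Document],
    cutoff_year: int,
    target_year: int,
) -> float:
    """Score one forecast against real headlines from after the cutoff.

    `forecast` is a hypothetical stand-in for the historical LLM and its
    scaffolding; it only ever receives documents dated up to the cutoff.
    """
    seen, held_out = split_by_cutoff(docs, cutoff_year)
    actual = [d.text for d in held_out if d.year == target_year]
    if not actual:
        raise ValueError(f"no held-out documents for {target_year}")
    prompt = f"Write me a newspaper headline from {target_year}."
    predicted = forecast(prompt, seen)
    # Best match against any real headline from the target year.
    return max(jaccard(predicted, headline) for headline in actual)
```

In practice the word-overlap score would be far too crude for freeform text like headlines or biographies; it is only a placeholder for whatever comparison the scaffolding is ultimately validated with. The point of the sketch is the split: the forecaster never sees anything after the cutoff, so its predictions can be checked against what actually happened.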