Rob Wiblin
Yeah, I think it's a bit nebulous what standard I hold myself to.
But I think for my research projects, when I think about timelines, or how AI could lead to takeover, or how quickly the world could change if we had AGI, I think I can often, with months of effort, get to the point where I can anticipate, have a reasonable response to, and have a reasonable back-and-forth about a very wide range of intelligent criticisms for why my conclusion might be totally wrong and totally off base.
I feel like I know what the skeptics that are more doomy than me will say, and I know what the skeptics that are less doomy than me will say, and I could have an intelligent conversation that goes for a long while with either side.
And that is a standard I aspired to get to with why we supported certain grants.
And I could do that with some of our grants, but I wanted the program to get to the point where, if somebody came to me and said, "Interpretability just hasn't actually seen much success over the last four years. What do you make of that?", I'd kind of be at reflective equilibrium on my answers to questions like that, and I'd be able to say something that went a bit beyond, "Yes, but on the outside view, we should support a range of things."
And that is something that I think emotionally is unsatisfying to me if it's a big element of my work.
I think it just takes a long time.
I think there's two things.
It just takes a lot of effort.
And then the other thing is that even if you put in that effort, you don't want to fully back your own inside view.
And then I think I wouldn't endorse that either.
And so it's this one-two punch, where just developing your views about exactly how interpretability or adversarial robustness or control or corrigibility fits into everything is a ton of work.
You have to talk to a ton of people.
You have to write up a bunch of stuff.
And in the meantime, you're not getting money out the door while you're doing all this stuff, right?