David Duvenaud

Why 'Aligned AI' Could Still Kill Democracy | David Duvenaud, ex-Anthropic team lead

So the idea is you train a model only on data up to 1930, then you ask what likelihood it would give to a headline in 1940, or to some other free-form text.

80,000 Hours Podcast

And you can evaluate their likelihoods on this text in the past, and then you can also use the same scaffolding on a model trained up to 2025 and ask it to predict headlines in 2035. Or you can iterate on your scaffolding by seeing how well it does on past data.
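A minimal sketch of the backtest described here, assuming a toy add-one-smoothed unigram model as a stand-in for a real language model. The corpus, the headlines, and the function names are all invented for illustration; nothing here is the speaker's actual pipeline.

```python
import math
from collections import Counter

def train_unigram(texts):
    """Count word frequencies in the training texts (toy stand-in for an LLM)."""
    return Counter(word for text in texts for word in text.lower().split())

def log_likelihood(counts, text, vocab_size):
    """Add-one smoothed log-probability of `text` under the unigram counts."""
    total = sum(counts.values())
    return sum(
        math.log((counts[word] + 1) / (total + vocab_size))
        for word in text.lower().split()
    )

# Hypothetical pre-cutoff corpus and post-cutoff headlines (made up).
pre_cutoff = [
    "markets fall as banks close",
    "new radio sets sold in record numbers",
]
later_headlines = [
    "banks close as markets fall again",
    "television sets sold in record numbers",
]

counts = train_unigram(pre_cutoff)
# Vocabulary covers training words plus any unseen words in the headlines.
vocab = set(counts) | {w for h in later_headlines for w in h.lower().split()}

# Score each held-out headline; higher (less negative) means less surprising.
scores = [log_likelihood(counts, h, len(vocab)) for h in later_headlines]
mean_score = sum(scores) / len(scores)
```

The same scoring loop could be pointed at a model with a later cutoff. Comparing `mean_score` across scaffolding variants on past data is one way to iterate before trusting forward predictions.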

So that's been the huge slog so far: constantly finding different sources of unintentional data poisoning, mislabeled data, and things like that.

So, I mean, the LLMs can help you, because there's sort of a chicken-and-egg problem.

Once you have an LLM that has a rough idea of what sort of thing happened at what time, then when it sees some reference to genetic engineering in some 1930s data, it's like, oh, no one used that phrase at this point.

And then you can use that to help clean the data more.
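A rough sketch of that cleaning step, assuming a small lookup table stands in for the model's sense of when a phrase came into use. The table, its years, and the example documents are hypothetical placeholders, not researched dates or real data.

```python
# Hypothetical lexicon: term -> earliest year we assume it was in use.
# The years are illustrative placeholders, not verified first-use dates.
FIRST_SEEN = {"genetic engineering": 1950, "internet": 1970}

def find_anachronisms(text, claimed_year, first_seen=FIRST_SEEN):
    """Return terms in `text` whose assumed first-use year is after `claimed_year`."""
    lowered = text.lower()
    return sorted(
        term
        for term, year in first_seen.items()
        if term in lowered and year > claimed_year
    )

# Made-up documents with the year each one claims to come from.
docs = [
    ("Farmers discuss crop yields at county fair", 1931),
    ("Editorial warns of genetic engineering risks", 1934),
]

# Split the corpus into documents that pass the check and ones to review.
clean = [(t, y) for t, y in docs if not find_anachronisms(t, y)]
flagged = [(t, y) for t, y in docs if find_anachronisms(t, y)]
```

In the chicken-and-egg loop, the flagged documents would be re-dated or dropped, the model retrained on the cleaner corpus, and the check repeated.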

But I think this is an Achilles heel of this approach.

Yeah, there's also actually another technical problem: data poisoning just through the questions you ask.

So if you are just doing Metaculus-style questions, like, is there going to be a war between India and Pakistan this year?

It's actually hard, because when you tune your scaffolding by going back, most of the questions you ask, you're asking because something happened, right?

So imagine a future person comes back and asks me if I'm worried about, I don't know, Lithuania invading Canada.

I'd be like, well, I wasn't until you asked me, right?

Yeah, so it's easy to unintentionally poison, or rather incentivize, your model to be the opposite of the "nothing ever happens" guy, to just say, yes, whatever you're asking about, there was a 1% chance it happened.
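One way to see this incentive is with a scoring rule. The sketch below assumes the Brier score as the tuning metric and uses invented outcome counts; the transcript does not say which metric or question sets are actually used.

```python
def brier(prob, outcomes):
    """Mean squared error of a constant forecast `prob` against 0/1 outcomes."""
    return sum((prob - o) ** 2 for o in outcomes) / len(outcomes)

# Hypothetical world: 100 possible events, 5 of which actually happened.
all_outcomes = [1] * 5 + [0] * 95

# Hindsight-selected question set: every event that happened gets asked about,
# but only a handful of the non-events do.
asked_outcomes = [1] * 5 + [0] * 5

calibrated = 0.05  # matches the true base rate
eager = 0.50       # says "maybe" to whatever is asked

# Lower Brier score is better.
score_on_asked = {"calibrated": brier(calibrated, asked_outcomes),
                  "eager": brier(eager, asked_outcomes)}
score_on_all = {"calibrated": brier(calibrated, all_outcomes),
                "eager": brier(eager, all_outcomes)}
```

On the hindsight-selected set the eager forecaster scores better, while on the full set the calibrated one does. Tuning scaffolding against the first set therefore rewards exactly the overeager behavior described above.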

How do you avoid that?

Well, I guess I'll say that's one nice thing about the open-ended, just-generate-text approach, because then you have to normalize over all possible newspaper headlines.

So that actually already guards against this sort of validation poisoning problem.

But then that has its own problem, because the likelihood is very sensitive to style.

Maybe there's a new nickname for the president in the future, and one model guesses it or thinks it's plausible and another one doesn't, and that ends up dominating the likelihood.
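A small numeric illustration of how one stylistic token can dominate, assuming the headline's likelihood is the sum of per-token log-probabilities. The token list and probabilities are invented for the example.

```python
import math

# Hypothetical per-token probabilities two models assign to the same headline.
tokens = ["<nickname>", "signs", "trade", "deal"]
model_a = [0.05, 0.30, 0.20, 0.40]     # finds the new nickname plausible
model_b = [0.00001, 0.32, 0.21, 0.41]  # slightly better elsewhere, misses the nickname

def total_log_likelihood(probs):
    """Sum of log-probabilities, i.e. the log-likelihood of the whole headline."""
    return sum(math.log(p) for p in probs)

# Positive gap means model A is preferred on this headline.
gap = total_log_likelihood(model_a) - total_log_likelihood(model_b)

# Break the gap down token by token to see where it comes from.
per_token_gap = [math.log(a) - math.log(b) for a, b in zip(model_a, model_b)]
gap_without_nickname = sum(per_token_gap[1:])
```

Here `gap` is large and positive even though `gap_without_nickname` is slightly negative. Model B is marginally better on every substantive token, yet it loses the comparison on the nickname alone.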
