David Duvenaud
So the idea is you train a model only on data up to 1930, then you ask what likelihood it would give to a headline from 1940, or to some other free-form text.
And you can evaluate its likelihoods on this text from the past.
Then you can use the same scaffolding on a model trained up to 2025 and ask it to predict headlines in 2035. So you can iterate on your scaffolding by seeing how well it does on past data.
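The backtesting loop described above can be sketched in a few lines. This is a minimal illustration, not the speaker's actual setup: a smoothed unigram model stands in for the language model, the names `train_unigram` and `backtest` are hypothetical, and the example sentences are made-up placeholders rather than real archival text.

```python
import math
from collections import Counter

def train_unigram(corpus):
    """Fit an add-one-smoothed unigram model (a toy stand-in for an LLM)."""
    counts = Counter(w for doc in corpus for w in doc.lower().split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves probability mass for unseen words

    def log_prob(text):
        # Sum of per-word log-probabilities under the smoothed model.
        return sum(math.log((counts[w] + 1) / (total + vocab))
                   for w in text.lower().split())
    return log_prob

def backtest(log_prob, held_out):
    """Mean per-word log-likelihood on text dated after the training cutoff."""
    n_words = sum(len(t.split()) for t in held_out)
    return sum(log_prob(t) for t in held_out) / n_words

# Made-up placeholder text, not real archival data.
pre_cutoff = ["markets fall as banks close", "new treaty signed in paris"]
post_cutoff = ["banks close as markets fall again"]

score = backtest(train_unigram(pre_cutoff), post_cutoff)
print(f"mean log-likelihood per word: {score:.3f}")
```

Under these assumptions, iterating on the scaffolding means changing how the model is prompted or combined, rerunning `backtest` on text from after the cutoff, and keeping the variant with the higher score.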
So that's been the huge slog so far: constantly finding different sources of unintentional data poisoning, mislabeled data, and things like that.
So, I mean, LLMs can help you, because there's sort of a chicken-and-egg problem.
Once you have an LLM that has a rough idea of what sort of thing happened at what time, then when it sees some reference to genetic engineering in some 1930s data, it's like, oh, no one used that phrase at this point.
And then you can use that to help clean the data more.
But I think this is an Achilles heel of this approach.
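As a rough illustration of the cleaning step, the sketch below flags supposedly old documents that contain phrases seen only in newer text. It is a deliberately crude stand-in for the LLM described above: it compares word pairs between a trusted pre-cutoff sample and a known post-cutoff sample. The function names and all example sentences are hypothetical.

```python
def bigrams(text):
    """Return the set of adjacent word pairs in a text."""
    words = text.lower().split()
    return set(zip(words, words[1:]))

def flag_anachronisms(candidates, trusted_old, known_new):
    """Flag candidate documents containing word pairs that appear in
    known post-cutoff text but never in trusted pre-cutoff text."""
    old = set().union(*(bigrams(d) for d in trusted_old))
    new = set().union(*(bigrams(d) for d in known_new))
    suspicious = new - old
    flagged = []
    for doc in candidates:
        hits = bigrams(doc) & suspicious
        if hits:
            flagged.append((doc, sorted(hits)))
    return flagged

# Made-up placeholder documents.
trusted_old = ["the farmer sold wheat at market",
               "engineering of the new bridge began"]
known_new = ["genetic engineering of crops advanced"]
candidates = ["report on genetic engineering of wheat",  # labeled as old, looks new
              "the farmer sold wheat"]

for doc, hits in flag_anachronisms(candidates, trusted_old, known_new):
    print(doc, hits)
```

The chicken-and-egg problem shows up directly: the filter is only as good as the `trusted_old` sample, which itself has to be free of mislabeled text.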
Yeah, there's also actually another technical problem of data poisoning, just through the questions you ask.
So if you are just doing Metaculus-style questions, like, is there going to be a war between India and Pakistan this year?
It's actually hard because when you tune your scaffolding to go back, most of the questions you ask about, you're asking because something happened, right?
So it's like, imagine a future person comes back and asks me if I'm worried about, I don't know, Lithuania invading Canada.
I'd be like, well, I wasn't until you asked me, right?
Yeah, so it's easy to unintentionally poison your model, or rather incentivize it to be the opposite of the "nothing ever happens" guy, to just say, yes, whatever you're asking about, there was a 1% chance it happened.
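A small simulation makes the selection effect concrete. All numbers here are assumptions chosen for illustration: each candidate event has a 1% chance of happening, and a question writer working in hindsight is far more likely to ask about events that did happen.

```python
import random

rng = random.Random(0)       # fixed seed so the run is repeatable
P_EVENT = 0.01               # assumed true base rate of any candidate event
P_ASK_IF_HAPPENED = 0.9      # hindsight: events that happened usually get asked about
P_ASK_IF_NOT = 0.01          # non-events are rarely turned into questions
N = 100_000

happened_all = asked = happened_asked = 0
for _ in range(N):
    happened = rng.random() < P_EVENT
    p_ask = P_ASK_IF_HAPPENED if happened else P_ASK_IF_NOT
    happened_all += happened
    if rng.random() < p_ask:
        asked += 1
        happened_asked += happened

true_rate = happened_all / N            # base rate over all candidate events
asked_rate = happened_asked / asked     # base rate among questions actually asked
print(f"true rate: {true_rate:.3f}, rate among asked questions: {asked_rate:.3f}")
```

With these assumed parameters, the expected rate among asked questions is 0.009 / (0.009 + 0.0099), close to one half, even though the underlying base rate is 1%. A scaffold tuned on such questions is rewarded for saying yes far too often.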
How do you avoid that?
Well, I guess I'll say that's one nice thing about the open-ended, just-generate-text approach, because then you have to normalize over all possible newspaper headlines.
So that actually already guards against this sort of validation poisoning problem.
But then that has its own problem, because the likelihood is very sensitive to style.
Maybe there's a new nickname for the president in the future, and if one model guesses it or thinks it's plausible and another one doesn't, that ends up dominating the likelihood.
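The arithmetic behind that sensitivity is easy to see with made-up numbers. In the sketch below, the per-token probabilities are hypothetical: model B is slightly better on every ordinary word, but it assigns almost no probability to the nickname, and that single token decides the comparison.

```python
import math

# Hypothetical per-token probabilities two models assign to the same headline.
tokens  = ["<nickname>", "signs", "trade", "deal"]
model_a = [0.05, 0.20, 0.10, 0.30]   # finds the nickname plausible
model_b = [1e-6, 0.25, 0.12, 0.30]   # better on ordinary words, never guessed the nickname

def log_likelihood(probs):
    """Total log-likelihood of a sequence from its per-token probabilities."""
    return sum(math.log(p) for p in probs)

ll_a = log_likelihood(model_a)
ll_b = log_likelihood(model_b)

# Contribution of each token to the gap between the two models.
per_token_gap = [math.log(a) - math.log(b) for a, b in zip(model_a, model_b)]

print(f"log-likelihood A: {ll_a:.2f}, B: {ll_b:.2f}")
for tok, gap in zip(tokens, per_token_gap):
    print(f"{tok:>12}: {gap:+.2f}")
```

Model A wins overall even though model B is at least as good on three of the four tokens, because the nickname term alone outweighs everything else in the sum.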