Andrew Ilyas
And so I think it's a very open question what the right way to do it is.
I think the way that I'm partial to is to try to understand or distill some general principles that underlie the behavior of all machine learning models.
Or at least all machine learning models in a broad class.
And I think that's a very ambitious goal and it's unclear if it's even possible.
But I think increasingly it's been getting a lot of attention and I think that's good.
Yeah, absolutely.
So the idea is that I just talked about this pipeline.
If we forget for a second about the data collection process and we just assume that you have a data set, clearly changing that data set is going to change model predictions in some way.
And so what we were asking is: can we, without thinking about the mechanistic details of the learning algorithm itself, sort of black-box that away and think of machine learning as just a map directly from training data set to prediction?
And so what data modeling looks like is you hold a single prediction fixed.
So let's say you pick your favorite image in the test set and you're just interested in studying your model's prediction on that specific test example.
Your goal in data modeling is to predict, as a function of which training data you train on, how your model is going to behave on that test example.
And so what that looks like is basically a surrogate supervised learning problem where your inputs are data sets and your outputs are model behavior on a specific test example.
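The surrogate problem described above can be sketched in a few lines. This is a toy illustration, not the authors' actual setup: the "model behavior" here is just the mean label of the included training subset, so that the demo runs instantly; in real datamodeling each sample would correspond to a full training run of a neural network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for "train on a subset, evaluate on one fixed test example":
# the model's output is just the mean label of the included training points.
n_train = 100
labels = rng.normal(size=n_train)

def model_output(mask):
    """'Train' on the included examples; return behavior on a fixed test point."""
    return labels[mask].mean()

# Sample many random training subsets and record (subset, output) pairs.
n_subsets = 2000
masks = rng.random((n_subsets, n_train)) < 0.5   # which examples are included
outputs = np.array([model_output(m) for m in masks])

# Fit a linear datamodel: predict the output from the 0/1 inclusion vector.
X = np.column_stack([masks.astype(float), np.ones(n_subsets)])  # add bias term
theta, *_ = np.linalg.lstsq(X, outputs, rcond=None)

# theta[i] estimates how much including training example i moves the prediction.
preds = X @ theta
r2 = 1 - np.var(outputs - preds) / np.var(outputs)
print("R^2 of the linear datamodel:", r2)
```

The fitted coefficient for each training example then serves as an estimate of that example's importance to this one test prediction.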
Yeah, absolutely.
I think it's useful to go back even before the era of deep learning: there was this field of influence functions in statistics that basically came from the need to estimate how your parameters would have come out had you weighted your data differently.
So people had what they called M-estimation tasks, and they wanted to know (mostly as a theoretical tool): if you dropped some of the samples, how would your estimator change?
And so, conditioned on a bunch of pretty strict assumptions, statistics came up with really rigorous tools with theoretical guarantees for answering exactly that kind of question, which looks a lot like the data modeling task, actually.
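For a concrete instance of the question those tools answer, here is a minimal sketch using the simplest possible estimator, a sample mean (my choice for illustration, not an example from the conversation): the exact change from dropping one sample versus the first-order influence-function approximation.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
n = len(x)
theta_hat = x.mean()  # the estimator: a plain sample mean

# Exact leave-one-out change in the estimator for each sample i.
exact = np.array([np.delete(x, i).mean() - theta_hat for i in range(n)])

# Influence-function approximation: for the mean, the influence of point x_i
# is (x_i - theta_hat), so dropping it shifts the estimate by about
# -(x_i - theta_hat) / n.
approx = -(x - theta_hat) / n

gap = np.abs(exact - approx).max()
print("max gap between exact and approximate LOO shift:", gap)
```

For well-behaved estimators like this one the approximation is essentially exact; the interesting (and, for a while, open) question was how well it holds up for deep networks, where none of the supporting assumptions apply.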
And so there was this brilliant paper in 2017 from Pang Wei Koh and Percy Liang at Stanford that basically took these statistical tools and applied them directly to deep neural networks, and found that they don't actually work so badly, even though in deep neural networks none of the assumptions that the statisticians cared about hold.
And so, you know, that was a huge first step.