Andrew Ilyas
published experiments from the social sciences.
And they found that by removing just a couple of data points from these surveys' analyses, they could flip the conclusions.
And so we were basically able to do the same thing in the context of deep learning by using data models.
So figuring out which exact training examples you need to drop to flip a model's prediction on a given test example.
So I'd say those were sort of all of the original applications of data models.
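To make the example-dropping idea concrete, here is a minimal sketch of how an already-estimated linear data model could be used to predict which training examples to remove to flip a single prediction. The function and variable names (examples_to_flip, theta, margin) are illustrative assumptions, not from the original work.

```python
import numpy as np

def examples_to_flip(theta, margin, max_removals=None):
    """Greedy sketch: find a small set of training examples whose removal
    is predicted (by a linear data model) to flip one test prediction.

    theta  : data model weights for this test example; theta[i] is the
             estimated contribution of training example i to the test
             example's correct-class margin.
    margin : the model's current margin on the test example (positive
             means it is currently classified correctly).
    """
    # Removing example i is predicted to change the margin by -theta[i],
    # so drop the examples that push the margin up the most.
    order = np.argsort(-theta)  # most helpful training examples first
    removed, predicted_margin = [], margin
    for i in order:
        if theta[i] <= 0 or (max_removals and len(removed) >= max_removals):
            break
        removed.append(int(i))
        predicted_margin -= theta[i]
        if predicted_margin < 0:  # data model predicts the prediction flips
            return removed, predicted_margin
    return None, predicted_margin  # no small enough set found under this budget
```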
I think the ones that I'm more excited about now, as I was mentioning earlier, are more of the flavor of: you have some property that you want from your model.
And that property can be written as a function of the dataset that the model is trained on, via the model's predictions.
So, for example, dataset selection or even machine teaching.
If you think about dataset selection, you can write it as: I want to maximize, over datasets, the performance of the model trained on that dataset.
And that's obviously a very expensive problem to solve directly, because inside this maximization you have the entire model training process.
But if you can take that model training process and replace it with a data model prediction, then it becomes a much more tractable optimization problem.
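As a rough illustration of why the surrogate helps: with a linear data model the objective decomposes over training examples, so the maximization collapses to a top-k selection. A minimal sketch, assuming you have already estimated a weight matrix theta_per_test (the names and setup here are assumptions for illustration, not from the papers):

```python
import numpy as np

def select_training_set(theta_per_test, k):
    """Sketch of data-model-guided dataset selection.

    theta_per_test : array of shape (n_test, n_train); row j is the linear
                     data model for test example j, so theta_per_test[j, i]
                     estimates how much training example i contributes to
                     the model's performance on test example j.
    k              : size of the training subset we are allowed to keep.

    The expensive objective  max over subsets S of performance(train(S))
    is replaced by its linear surrogate, the sum of theta[j, i] over the
    chosen examples, which is maximized exactly by keeping the k examples
    with the largest total predicted contribution across the test set.
    """
    total_contribution = theta_per_test.sum(axis=0)   # shape (n_train,)
    keep = np.argsort(-total_contribution)[:k]        # top-k training examples
    return np.sort(keep)
```

The point of the sketch is that the expensive inner training loop never runs during the selection itself; it only appears when estimating the data models up front.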
Yeah, so I think one very nice thing that's a little bit specific to these linear data models is the fact that after you estimate them, what you have is sort of a vector whose length is the size of whatever pool of data you're sampling datasets from.
And the nice thing about these vectors is that each index in the vector corresponds to a single training example.
And the value at that index is exactly how important that training example was to the prediction in question.
And so what that means is that you can actually estimate these vectors for one learning algorithm.
And you can estimate them for another learning algorithm.
And the resulting representations that you get mean the same thing by default.
So you can component-wise compare these representations.
And we actually had some follow-up work led by a great student in our lab where we used this exact property to compare different learning algorithms, for example.
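To make the component-wise comparison concrete, here is a minimal sketch of one way two algorithms' data model vectors could be compared for the same test example; the specific comparisons (cosine similarity, largest per-example differences) are illustrative choices, not the method used in the follow-up work.

```python
import numpy as np

def compare_algorithms(theta_a, theta_b, top=10):
    """Sketch of comparing two learning algorithms through their data models
    for the same test example.

    theta_a, theta_b : data model weight vectors estimated for the same test
                       example under learning algorithms A and B. Index i in
                       both vectors refers to the same training example, so
                       the representations are directly comparable.
    """
    # Overall agreement between the two algorithms on this test example.
    cosine = float(theta_a @ theta_b /
                   (np.linalg.norm(theta_a) * np.linalg.norm(theta_b) + 1e-12))

    # Training examples the two algorithms weight most differently.
    diff = theta_a - theta_b
    most_divergent = np.argsort(-np.abs(diff))[:top]
    return cosine, most_divergent
```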