Andrew Ilyas

Speaker
638 total appearances

Podcast Appearances

Machine Learning Street Talk (MLST)
Adversarial Examples and Data Modelling - Andrew Ilyas (MIT)

Yeah, that's right.

So at least in our paper, we tried up to 300 million parameter models and got some pretty promising results.

I think we would like to do a lot more work on trying to understand where the limits of this are.

The TRAK estimator for data models is definitely more efficient, in the sense that if you train with 1,000 less models, it'll work about as well, but its limits are not going to be as good as what you could get with this regression-based estimator.

And so I think trying to understand the limits, and also trying to understand when or where it does or doesn't work, is a great direction for future work.

Yeah, so I think part of the nice thing that we did in the TRAK paper is crystallize some of our thoughts from the original data models paper about how we should actually go about evaluating these broad clouds of data attribution methods.

And so I think one broad problem in general with data attribution is that if I come up to you with some method for some data model and you come up to me with some other method for assigning value to data points, there's not really a great way of deciding whose is better.

And so I think one thing that we tried to put forward in the TRAK paper, and this is almost taken directly from the original data models paper, is that the way we should evaluate these things is by using this correlation that I was talking about earlier between predicted model outputs and true model outputs.
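The evaluation described here can be sketched roughly as follows. This is a toy illustration, not the procedure from either paper: the function name `attribution_correlation`, the use of a rank correlation, and the simulated "true outputs" (a hidden linear function standing in for actually retraining a model on each subset) are all assumptions made for the example.

```python
import numpy as np

def rank(x):
    # Position of each entry in sorted order (ties not treated specially,
    # which is fine for the continuous toy data below).
    return np.argsort(np.argsort(x))

def attribution_correlation(scores, masks, true_outputs):
    """Rank correlation between predicted and observed model outputs.

    scores:       (n_train,) attribution scores for one test example.
    masks:        (n_subsets, n_train) 0/1 matrix; row j marks training subset j.
    true_outputs: (n_subsets,) output on the test example after training on
                  subset j (simulated here, not real training runs).
    """
    predicted = masks @ scores  # additive prediction from the scores
    return np.corrcoef(rank(predicted), rank(true_outputs))[0, 1]

rng = np.random.default_rng(0)
n_train, n_subsets = 50, 200
masks = (rng.random((n_subsets, n_train)) < 0.5).astype(float)

# Hypothetical stand-in for "train on the subset, record the output".
hidden_weights = rng.normal(size=n_train)
true_outputs = masks @ hidden_weights + 0.1 * rng.normal(size=n_subsets)

# Two hypothetical attribution methods: one close to the truth, one random.
good_scores = hidden_weights + 0.1 * rng.normal(size=n_train)
random_scores = rng.normal(size=n_train)

good = attribution_correlation(good_scores, masks, true_outputs)
bad = attribution_correlation(random_scores, masks, true_outputs)
print(f"informative scores: {good:.2f}, random scores: {bad:.2f}")
```

Under this kind of metric, two attribution methods can be compared directly: whichever one's scores better predict what happens when the training set changes gets the higher correlation.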

And so once you do this sort of quantitative evaluation of data attribution methods, it sort of allows you to see this existing trade-off that was present in the literature between fast methods that were not super predictive of model predictions or of model behavior and extremely slow methods like the regression-based estimator I was talking about that were very predictive of model behavior.

And so you can sort of now view this goal of data attribution as like trying to trace out better and better Pareto frontiers of this tradeoff between efficacy and speed.
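As a small sketch of what tracing out a Pareto frontier means here: given a compute cost and a predictiveness score for each method, the frontier is the set of methods that no other method beats on both axes. The method names and numbers below are made-up placeholders for illustration only, not measurements of any real method.

```python
def pareto_frontier(methods):
    """Return names of methods not dominated on (cost, score).

    methods: list of (name, cost, score); lower cost and higher score are better.
    """
    frontier = []
    best_score = float("-inf")
    # Walk from cheapest to most expensive; a method stays on the frontier
    # only if it is more predictive than every cheaper method.
    for name, cost, score in sorted(methods, key=lambda m: (m[1], -m[2])):
        if score > best_score:
            frontier.append(name)
            best_score = score
    return frontier

# Placeholder values: (name, compute cost, correlation with true outputs).
methods = [
    ("A", 1.0, 0.10),     # fast, weakly predictive
    ("B", 10.0, 0.30),
    ("C", 20.0, 0.25),    # dominated by B: slower and less predictive
    ("D", 1000.0, 0.50),  # slow, highly predictive
]
frontier = pareto_frontier(methods)
print(frontier)
```

A new method improves the frontier if it lands above it, meaning it is more predictive than anything at the same or lower cost.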

Yeah, so that's exactly this correlation that I was talking about.

So what that looks like is the thing you're evaluating is some function that maps from a test data point to a vector of scores, one for each training data point.

So this is, broadly, what was recognized at the time as a data attribution method.

And there are a bunch of these.

Like you were saying, there's the Shapley value.

There's the influence function.

There's a whole bunch of these.

And the neat thing about all of these is that they all have an interpretation as a linear data model.
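One way to picture the linear data model reading: treat the vector of per-training-point scores as the weights of a linear function of the training subset's 0/1 indicator vector. The sketch below fits such weights by ordinary least squares on simulated (subset, output) pairs. Plain least squares, the synthetic outputs, and all variable names are assumptions for illustration, not the estimator used in the papers discussed.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_subsets = 20, 500

# Each row is the 0/1 indicator of one random training subset.
masks = (rng.random((n_subsets, n_train)) < 0.5).astype(float)

# Simulated outputs on one fixed test example (a stand-in for retraining
# a model on each subset and recording its output).
hidden_weights = rng.normal(size=n_train)
outputs = masks @ hidden_weights + 0.05 * rng.normal(size=n_subsets)

# Fit output ~ weights . mask + bias by least squares.
design = np.hstack([masks, np.ones((n_subsets, 1))])
coef, *_ = np.linalg.lstsq(design, outputs, rcond=None)
weights, bias = coef[:-1], coef[-1]

def predict(mask):
    """Predicted output when training on the subset marked by `mask`."""
    return mask @ weights + bias

# The fitted weights play the role of attribution scores: one per training point.
print(weights.shape, float(np.max(np.abs(weights - hidden_weights))))
```

Any method that returns one score per training point can be plugged into `predict` in the same way, which is what makes a shared, correlation-based evaluation possible.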