Andrew Ilyas
Yeah, that's right.
So at least in our paper, we tried models with up to 300 million parameters and got some pretty promising results.
I think we would like to do a lot more work on trying to understand where the limits of this are.
The TRAK estimator for datamodels is definitely more efficient, in the sense that if you train with 1,000x fewer models, it'll work about as well. But its limits are not going to be as good as what you could get with the regression-based estimator.
And so I think trying to understand the limits, and also when and where it does or doesn't work, is a great direction for future work.
Yeah, so I think part of the nice thing that we did in the TRAK paper is crystallize some of our thoughts from the original datamodels paper about how we should actually go about evaluating this broad class of data attribution methods.
And so I think one broad problem with data attribution in general is that if I come to you with some method for assigning value to data points, and you come to me with some other method, there's not really a great way of deciding whose is better.
And so one thing that we tried to put forward in the TRAK paper, and this is taken almost directly from the original datamodels paper, is that the way we should evaluate these things is by using this correlation that I was talking about earlier between predicted model outputs and true model outputs.
And once you do this sort of quantitative evaluation of data attribution methods, it lets you see a trade-off that already existed in the literature: between fast methods that were not very predictive of model behavior, and extremely slow methods, like the regression-based estimator I was talking about, that were very predictive of model behavior.
And so you can now view the goal of data attribution as trying to trace out better and better Pareto frontiers of this trade-off between efficacy and speed.
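To make that evaluation concrete, here is a minimal sketch of what the correlation check might look like in code. The function name, array shapes, and the choice of Spearman rank correlation are illustrative assumptions, not the papers' exact recipe (the TRAK paper evaluates methods with a metric in this spirit, the linear datamodeling score):

```python
import numpy as np
from scipy.stats import spearmanr

def lds(scores, masks, outputs):
    """Rank correlation between datamodel-predicted and true model outputs.

    scores:  (n_train,)           attribution scores for one test example
    masks:   (n_subsets, n_train) 0/1 mask of which training points each
                                  retrained model actually saw
    outputs: (n_subsets,)         each retrained model's output on that
                                  same test example
    """
    # A linear datamodel predicts a model's output as the sum of the
    # scores of the training points it was trained on.
    predicted = masks @ scores
    return spearmanr(predicted, outputs).statistic

# Toy usage with synthetic data, just to show the shapes involved.
rng = np.random.default_rng(0)
n_train, n_subsets = 500, 100
scores = rng.standard_normal(n_train)
masks = rng.integers(0, 2, size=(n_subsets, n_train))
outputs = masks @ scores + 0.1 * rng.standard_normal(n_subsets)
print(lds(scores, masks, outputs))  # near 1.0 when scores predict behavior well
```

A method's score on an evaluation like this, plotted against its compute cost, is what traces out the efficacy-versus-speed Pareto frontier described above.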
Yeah, so that's exactly this correlation that I was talking about.
So what that looks like is the thing you're evaluating is some function that maps from a test data point to a vector of scores, one for each training data point.
So this is, broadly, what was recognized at the time as a data attribution method.
And there are a bunch of these.
Like you were saying, there's the Shapley value.
There's the influence function.
There's a whole bunch of these.
And the neat thing about all of these is that they all have an interpretation as a linear data model.
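As a sketch of that shared interface and its linear-datamodel reading, consider the following. Everything here is hypothetical: the constant scorer just stands in for any real method, whether influence functions, Shapley values, or TRAK.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train = 1_000

# Hypothetical stand-in for any data attribution method: map a test
# example to one score per training example. A real implementation
# (influence functions, Shapley values, TRAK, ...) would go here.
def attribute(test_example):
    return rng.standard_normal(n_train)  # placeholder scores

# The shared linear-datamodel interpretation: for a training subset S,
# encoded as a 0/1 inclusion mask, the predicted model output on the
# test example is the sum of the scores of the points in S.
scores = attribute("some test input")
mask = rng.integers(0, 2, size=n_train)
predicted_output = float(scores @ mask)
```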