Andrew Ilyas
And so I think that was really cool, both because it's this very distilled model of bias in the data collection process that doesn't have any of the complexity of deep learning, and because it allows you to actually prove stuff.
And how did it work?
The algorithm itself is quite simple.
You can actually just write down a loss function that captures this self-selection mechanism.
What we showed is that if you minimize this alternative loss function instead of the standard one, you'll recover the actual ground-truth parameters of what matters for hunting and fishing, in my example.
And then the technical challenges come in proving that you can actually take gradients of this loss function, that gradient descent will actually converge, and things like that.
We assume we're doing linear regression and we have this like very well-defined sort of structural model that we're sampling from.
So we have very strong assumptions on what the generative process of the data is.
We assume there's no model misspecification or anything like that.
We're really just trying to see whether we can efficiently recover the parameters underlying this generative process that we've assumed.
And so, obviously, there's tons of work to be done along that axis: robustness to model misspecification, less restrictive settings, more complex models.
But I think it's a really interesting sort of first stab at this question of whether we can efficiently recover the truth when the data we're collecting from people is biased, for whatever reason.
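As a rough illustration of the kind of self-selection-aware loss being described here, a minimal sketch, assuming Gaussian noise, that each person's chosen activity (hunting vs. fishing) is observed, and that we fit by generic numerical optimization. The setup, variable names, and use of scipy are hypothetical, not the actual method from the paper:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def self_selection_loss(flat_w, X, y, idx, k, sigma=1.0):
    """Negative log-likelihood under self-selection (illustrative sketch).

    Person i picks activity idx[i] because it gives them the largest outcome,
    and we only observe that outcome y[i]. With Gaussian noise, the likelihood
    of seeing y[i] from activity j while every other activity would have
    scored no higher is
        phi((y - w_j.x)/sigma) * prod_{m != j} Phi((y - w_m.x)/sigma).
    """
    W = flat_w.reshape(k, -1)                    # one weight vector per activity
    z = (y[:, None] - X @ W.T) / sigma           # standardized residual under each activity
    rows = np.arange(len(y))
    ll = norm.logpdf(z[rows, idx])               # density of the chosen activity's outcome
    logcdf = norm.logcdf(z)
    logcdf[rows, idx] = 0.0                      # exclude the chosen activity from the product
    ll += logcdf.sum(axis=1)                     # all other activities scored at most y
    return -ll.mean()

# Toy data: two "activities" with different true weight vectors.
rng = np.random.default_rng(0)
n, d, k = 5000, 3, 2
W_true = rng.normal(size=(k, d))
X = rng.normal(size=(n, d))
scores = X @ W_true.T + rng.normal(size=(n, k))  # noisy outcome of each activity
idx = scores.argmax(axis=1)                      # each person self-selects their best activity
y = scores.max(axis=1)                           # and we only observe that outcome

res = minimize(self_selection_loss, rng.normal(size=k * d),
               args=(X, y, idx, k), method="L-BFGS-B")
print(res.x.reshape(k, d))                       # should approach W_true
```

Fitting ordinary least squares separately within each group would be biased here, since each group's observations are selected on having the largest outcome; the corrected objective accounts for that selection.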
Yeah, especially outside of the language model setting, in more classical machine learning settings where you're, say, a bank or something and you're collecting people's data.
Yeah, so I would say the biggest limitations right now are exactly this: you don't really know what's going to happen when the model is misspecified, or when your model of how people are self-selecting is misspecified.
So if you're a bank and you assume that someone's going to strategize in this way or that way, and you try to account for it in your machine learning algorithm, then if your conception of how they report their data is wrong, it's unclear what parameters you're actually going to end up with.
Yeah, absolutely.
I think in general, this is exactly why we had to study this very restrictive sort of data generating process, where really your only goal is to recover the true parameters of this data generating process.
I think when you get to sort of this machine learning regime, we don't really have a data generating process anymore, or not one that we can write down at least.
And so it becomes a much trickier question of like, okay, we're going to do some accounting for self-selection or something like that, but we also want to account for