Andrew Ilyas

And I think disentangling between those two things requires looking beyond performance into things like trying to understand where in the training data predictions are coming from, trying to understand what exactly are the steps of a problem that were present in the training data versus what's new.

Machine Learning Street Talk (MLST)

Adversarial Examples and Data Modelling - Andrew Ilyas (MIT)

And for what's new, where is that new behavior actually coming from?

And so one particularly interesting direction, I think, is trying to understand from a data perspective where exactly these surprising, amazing advancements of large language models are coming from, because they're clearly not

I think it's slightly reductionist now to say that they're copy-pasting from the training data, but that doesn't mean that they're reasoning, or that anything crazy is happening.

It could just be that they've learned a good enough abstraction to copy-paste from the training data in a much more abstract way.

And then you get into this philosophical discussion of, like, is that what humans are doing?

And I don't want to think about that.

Yeah, I think I totally agree.

And that's a big driving force, I think, behind our work.

I'm very curious to see what happens when you apply the methods you're developing to these large language models and the ones that we claim are doing reasoning and things like that.

I think the only thing stopping us right now is just not having quite fast enough methods to actually scale.

I think there's a really nice paper out of Anthropic that tried to do exactly this for Claude, I believe, to understand where in the training data different behaviors are coming from.

But even then, these methods are so expensive that I don't think they were able to do, for example, actual data counterfactuals.
