Tina Eliassi-Rad
Podcast Appearances
And what are the uncertainties in terms of the predictions that I am outputting?
Yeah, so basically you create a bunch of data, and you get buy-in from the community that these are good data sets to test a machine learning or AI model on. And then there's a leaderboard, and you want to be number one, right? And so you hack the systems that exist, or you hack your own system, you create your own, to be number one as much as possible.
And that's basically what is going on. And I like this metaphor: my colleague Barabási said it's like there are two camps. There's a toolbox, a finite toolbox, right? And the machine learning, the AI people, the engineers, put tools into that toolbox. And because it's finite, it's very competitive: my tool beats your tool, even if it's by one percent, where it's not clear if that one percent is statistically significant or not. And I may be king for only 30 seconds, because another tool comes in, right? And then there are the scientists on the other end, who just open the toolbox and say, okay, well, what is good for whatever prediction task I want to do? And then they pick a tool out of it.
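The "one percent that may not be statistically significant" point can be made concrete with a quick back-of-the-envelope test. The sketch below uses hypothetical numbers (a 91% vs. 90% accuracy gap on a 2,000-example test set, not anything from the conversation) and a simple two-proportion z-test:

```python
import math

# Hypothetical numbers: model A scores 91.0% and model B 90.0%
# on the same test set of n examples. Are they distinguishable?
n = 2000
acc_a, acc_b = 0.91, 0.90

# Two-proportion z-test (a simplification: it ignores that the two
# models are evaluated on the *same* examples; a paired test such as
# McNemar's would be the more careful choice).
p_pool = (acc_a * n + acc_b * n) / (2 * n)
se = math.sqrt(p_pool * (1 - p_pool) * (2 / n))
z = (acc_a - acc_b) / se

# Two-sided p-value from the normal CDF.
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
print(f"z = {z:.2f}, p = {p_value:.3f}")
# With these numbers z ≈ 1.08 and p ≈ 0.28: a one-point gap on 2,000
# examples is nowhere near conventional significance thresholds.
```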
And so a lot of this benchmark hacking, or state-of-the-art hacking, happens on the engineering side, the AI and machine learning side, the computer science side, because you want your tool in that finite toolbox.
It is a very big problem. I mean, there are multiple angles to this. One is, for example, that because of all the hype, oftentimes people on the engineering side don't talk about the assumptions they have made or the technical limitations of their system. Because of that, we have this reproducibility problem.
So not even a replicability problem, but a reproducibility problem, which is just the code: can I reproduce your results from your code as you have it, even with your training data, even with how you broke it up into these different folds? Which is a very, very low bar to pass.
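To illustrate what reproducing "how you broke it up into folds" takes in practice, here is a minimal sketch, assuming a scikit-learn workflow (the speaker names no specific stack): the seed and the exact split procedure have to be pinned and reported, or the experiment cannot be recreated.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng_seed = 42  # must be reported, or the split cannot be recreated

# Toy stand-in data: 100 examples, 5 features, balanced labels.
X = np.random.default_rng(rng_seed).normal(size=(100, 5))
y = np.array([0, 1] * 50)

# shuffle=True with a fixed random_state makes the folds deterministic;
# omit random_state and every run of the "same" experiment differs.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=rng_seed)
for fold, (train_idx, test_idx) in enumerate(skf.split(X, y)):
    print(f"fold {fold}: first test indices {test_idx[:5]}")
```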
But that doesn't happen, because there are lots of assumptions being made, et cetera. Then there's this notion that we are living through this era of big models: I want a model that has many, many parameters, even if I don't need all those parameters. Or, for example, maybe I do care about interpretability. That is, I want to know what the model is actually doing.
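The tradeoff being described can be sketched in code. The dataset and models below are stand-ins, not anything from the conversation: a small linear model whose behavior you can read off its weights, versus a larger ensemble that may buy a point or two of accuracy at the cost of that transparency.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

small = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
big = GradientBoostingClassifier(n_estimators=500)

# Compare mean cross-validated accuracy of the two models.
for name, model in [("logistic", small), ("boosted", big)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")

# The linear model exposes one coefficient per feature, i.e. "what the
# model is actually doing"; the ensemble does not, at least not directly.
small.fit(X, y)
print(small.named_steps["logisticregression"].coef_[0][:5])
```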
But because, again, of that one or two percentage points on the prediction side, you let go of it and you go with the bigger models. But yes, it's a big, big problem. For me, the lowest bar would be that we require this, at least with federal funding, and in some of the service that I do for the federal government, I've been pushing this.