Tina Eliassi-Rad
Podcast Appearances
And we were like, well, can we write stories for these people, in a way, and then feed them to what is the heart of these large language models, a transformer model, which is basically just the architecture of a neural network that learns association weights within some context window.
And that's what we did. But, for example, ChatGPT goes online and gobbles up all this bad data that people have put in, all the misogynistic, sexist data. We didn't do that. We had very good data from the Department of Statistics, and we created our own artificial symbolic language.
And then we fed that artificial symbolic language for these six million people into a transformer model. And then we were able to predict life events. And so one of them that caught the media's eye was, will somebody between the ages of 35 and 65 pass away in the next four years? And we picked that age range because that's a harder age range to predict for.
Like, if you're over 65, then it's easier to predict whether you're going to pass away in the next four years. And if you're younger than 35, it's also easy, right, you're unlikely to pass away. And so that's one of the things. The other prediction task was like, will you leave Denmark? You know, so then you can predict for that.
But it had similar technology to these large language models, which is, like, you have this one step, what they call pretraining, where you just learn, based on the data that you have, what's likely to happen next. And then you fine-tune it for whatever prediction task you have.
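As a rough illustration of "learn, based on the data that you have, what's likely to happen next", here is a minimal sketch. It is not the transformer described in the conversation: it swaps in simple next-token counts (a bigram model) as a stand-in, and the event sequences and token names are invented for this example.

```python
from collections import Counter, defaultdict

# Hypothetical event sequences; the token names are invented for illustration
# and are not the vocabulary used in the actual study.
sequences = [
    ["SECTOR=office", "INCOME=mid", "SECTOR=office", "INCOME=high"],
    ["SECTOR=electrician", "INCOME=mid", "SECTOR=electrician", "INCOME=mid"],
    ["SECTOR=office", "INCOME=mid", "MOVE=abroad"],
]

def fit_next_token_counts(seqs):
    """Count how often each token is immediately followed by each other token."""
    counts = defaultdict(Counter)
    for seq in seqs:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return counts

def next_token_probs(counts, token):
    """Turn the raw follow-up counts for one token into probabilities."""
    followers = counts.get(token, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {nxt: c / total for nxt, c in followers.items()}

counts = fit_next_token_counts(sequences)
probs = next_token_probs(counts, "SECTOR=office")
print(probs)
```

A real pretraining step would learn these associations over a whole context window rather than only the previous token, and fine-tuning would then adapt the learned model to a specific question such as a four-year outcome.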
It's a logical encoding, because the data that the Department of Statistics has in Denmark is all tables. So it is not like this kind of sequence. So then you could say, like, Tina was born in Copenhagen in December, blah, blah, blah, right? And we could generate natural language, but that's difficult. Why would we do that?
So then we generated a vocabulary for this artificial symbolic language. And that was actually a lot of the intellectual property of the work: okay, well, how do you take these tables and then create this artificial symbolic language that you can then give to a transformer model?
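To make the idea of turning tables into a symbolic language concrete, here is a minimal sketch under stated assumptions. The table names, column names, token format, and special tokens are all hypothetical. The actual encoding used in the work is not described in this conversation.

```python
from collections import defaultdict

# Hypothetical registry tables; columns and values are invented for illustration.
labor_rows = [
    {"person_id": 1, "year": 2010, "sector": "electrician", "income_decile": 4},
    {"person_id": 1, "year": 2011, "sector": "office", "income_decile": 5},
    {"person_id": 2, "year": 2010, "sector": "office", "income_decile": 7},
]
health_rows = [
    {"person_id": 1, "year": 2010, "visit_type": "gp"},
]

def row_to_tokens(table_name, row):
    """Turn one table row into symbolic tokens such as 'LABOR_SECTOR=office'."""
    return [
        f"{table_name.upper()}_{col.upper()}={val}"
        for col, val in row.items()
        if col not in ("person_id", "year")
    ]

def build_sequences(tables):
    """Group rows by person, order them by year, and flatten into token lists."""
    events = defaultdict(list)
    for table_name, rows in tables.items():
        for row in rows:
            events[row["person_id"]].append((row["year"], table_name, row))
    sequences = {}
    for pid, evs in events.items():
        evs.sort(key=lambda e: (e[0], e[1]))  # by year, then table name
        tokens = []
        for year, table_name, row in evs:
            tokens.append(f"YEAR={year}")
            tokens.extend(row_to_tokens(table_name, row))
        sequences[pid] = tokens
    return sequences

def build_vocab(sequences):
    """Assign an integer id to every distinct token, after two special tokens."""
    vocab = {"[PAD]": 0, "[UNK]": 1}
    for tokens in sequences.values():
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab

sequences = build_sequences({"labor": labor_rows, "health": health_rows})
vocab = build_vocab(sequences)
encoded = {pid: [vocab[t] for t in toks] for pid, toks in sequences.items()}
print(sequences[1])
print(encoded[1])
```

The integer sequences in `encoded` are the kind of input a transformer model consumes. The design choice sketched here is to skip natural-language sentences entirely and map each table cell directly to one token.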
Well, the thing that we found, which was very interesting, I think: the accuracy of the model was about 78%, et cetera. And I think that's why people were showing a lot of interest in it. But to me, that wasn't really the takeaway.
The takeaway actually was that labor data is a very good indication of whether somebody in that age range is going to pass away in the next four years or not, because health data is very noisy and inconsistent. So even in Denmark, where they have universal health care, it's not like everybody goes to the doctor all the time and you have good data for them.
And then the other stuff was basically just which sector you were working in, right? So if you're, like, an electrician, it's not a very good thing, right, as opposed to, like, an office worker. So the labor data was actually much more helpful than the health data.