Ilya Shumailov

👤 Person
87 total appearances

[Chart: Appearances Over Time]

Podcast Appearances

Short Wave
When AI Cannibalizes Its Data

Large language models are statistical beasts that learn from examples of human-written text and learn to produce text similar to the text the model was taught on.

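That "statistical beast" idea can be shown in miniature. The sketch below is a hypothetical toy, a bigram model rather than a real LLM: it learns word-to-word statistics from example text, then samples new text from those statistics, so its output is statistically similar to what it was trained on, which is the same learning loop in its simplest form.

```python
import random
from collections import defaultdict

# Toy stand-in for "learning from examples of human-written text":
# count which word follows which, then sample from those counts.
corpus = ("the model learns from human text and "
          "the model produces text like the text it has seen").split()

transitions = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev].append(nxt)

def generate(start="the", length=12):
    word, out = start, [start]
    for _ in range(length - 1):
        # Follow the learned statistics; fall back to any word if stuck.
        word = random.choice(transitions[word]) if transitions[word] else random.choice(corpus)
        out.append(word)
    return " ".join(out)

print(generate())
```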

Short Wave
When AI Cannibalizes Its Data

If you were to sample data from the internet randomly today, I'm sure you'd find that a bigger proportion of it is generated by machines. But this is not to say that the data itself is bad. The main question is how much of this data is generated…

Short Wave
When AI Cannibalizes Its Data

Quite a lot of these models, especially back at the time, were relatively low quality. So there are errors and there are biases, systematic biases, inside of those models. And thus you can imagine the case where, rather than learning useful contexts and useful concepts, you actually learn things that don't exist. They are purely hallucinations.

Short Wave
When AI Cannibalizes Its Data

In simple theoretical setups we consider, you're guaranteed to collapse.

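That guarantee is easy to reproduce in the simplest such setup. The sketch below is a minimal illustration, assuming a one-dimensional Gaussian fitted by maximum likelihood to finite samples from the previous generation's model (my choice of toy, not necessarily the paper's exact construction): because each finite-sample fit underestimates the variance on average, the estimated spread drifts toward zero and the distribution collapses.

```python
import random
import statistics

random.seed(0)

def fit_and_resample(samples, n):
    """Fit a Gaussian by maximum likelihood, then draw n samples from the fit."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)  # ML estimate of the standard deviation
    return [random.gauss(mu, sigma) for _ in range(n)], sigma

# Generation 0: "real" data from a standard normal distribution.
data = [random.gauss(0.0, 1.0) for _ in range(20)]

for gen in range(1, 51):
    data, sigma = fit_and_resample(data, 20)
    if gen % 10 == 0:
        print(f"generation {gen:2d}: fitted sigma = {sigma:.3f}")

# The fitted sigma tends to shrink generation after generation;
# in the limit the fit collapses to a point mass and all diversity is lost.
```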

Short Wave
When AI Cannibalizes Its Data

So there are three primary sources of error that we still have. The very first one is basically just data-associated errors, and usually those are questions along the lines of: do we have enough data to approximate a given process? So if some things happen very infrequently in your underlying distribution, your model may get a wrong perception that…

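The data-associated error he describes, rare events disappearing from a finite sample, compounds across generations. The sketch below is a hypothetical discrete example (the event names and sample sizes are my own, not from the episode): a 1%-probability outcome is often never observed in 100 draws, the frequency-count fit then assigns it probability zero, and once each generation trains only on the previous model's output, that zero is permanent.

```python
import random
from collections import Counter

random.seed(1)

OUTCOMES = ["common", "rare"]
probs = {"common": 0.99, "rare": 0.01}  # the true underlying process
data = random.choices(OUTCOMES, [probs[o] for o in OUTCOMES], k=100)

for gen in range(1, 11):
    counts = Counter(data)
    # Frequency-count fit: an outcome never seen gets probability zero.
    est = {o: counts[o] / len(data) for o in OUTCOMES}
    print(f"generation {gen:2d}: estimated P(rare) = {est['rare']:.2f}")
    # The next generation learns only from the current model's output.
    data = random.choices(OUTCOMES, [est[o] for o in OUTCOMES], k=100)
```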

Short Wave
When AI Cannibalizes Its Data

…that some things are impossible. "Wait, what do you mean, they are impossible?" Like, an example I've seen on Twitter was: if you Google for a baby peacock, you'll discover pictures of birds that look relatively realistic, but they are not peacocks at all. They are completely generated, and you will not find a real picture. But if you try learning anything from it, of course you're…

Short Wave
When AI Cannibalizes Its Data

…I've got to be absorbing this bias.
