Sebastian Siemiatkowski
Podcast Appearances
Kind of the same information over and over again, but on Wikipedia, it's just one article.
How do they do that?
And I realized that when you train the model, if I tell it that Harry is not only running a fantastic podcast but also runs a VC, and you tell it only once when you train it, it will forget it.
It will ignore that information.
But if you tell it enough times, it will remember it.
And then when you go and ask it, it will know that information.
But it's not storing it twice.
Because if it's getting the same information that it already knows, it doesn't move the tokens.
So it's automatically compressing all the information.
This is why you can take the whole freaking internet, all human knowledge, and compress it down to a few hundred gigabytes.