Yann LeCun
Podcast Appearances
for a number of reasons. The first is that there are a number of characteristics of intelligent behavior. For example, the capacity to understand the world, understand the physical world, the ability to remember and retrieve things, persistent memory, the ability to reason, and the ability to plan. Those are four essential characteristics of intelligent systems or entities, humans, animals.
LLMs can do none of those, or they can only do them in a very primitive way. They don't really understand the physical world. They don't really have persistent memory. They can't really reason, and they certainly can't plan. If you expect the system to become intelligent just without having the possibility of doing those things, you're making a mistake.
That is not to say that autoregressive LLMs are not useful. They're certainly useful. Or that they're not interesting, or that we can't build a whole ecosystem of applications around them. Of course we can. But as a path towards human-level intelligence, they're missing essential components. And then there is another tidbit or fact that I think is very interesting.
Those LLMs are trained on enormous amounts of text, basically the entirety of all publicly available text on the internet, right? That's typically on the order of 10 to the 13 tokens. Each token is typically two bytes. So that's two times 10 to the 13 bytes as training data. It would take you or me 170,000 years to just read through this at eight hours a day.
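The text-side arithmetic can be checked with a few lines of Python. This is only a sketch of the figures quoted above: the token count, bytes per token, and reading time come from the transcript, while the 365-day year and the variable names are assumptions made here for illustration.

```python
# Figures quoted in the transcript.
TOKENS = 1e13            # "on the order of 10 to the 13 tokens"
BYTES_PER_TOKEN = 2      # "each token is typically two bytes"
READING_YEARS = 170_000  # quoted reading time
HOURS_PER_DAY = 8        # quoted reading schedule

# Assumption (not in the transcript): a 365-day year.
DAYS_PER_YEAR = 365

# Total size of the training text in bytes.
text_bytes = TOKENS * BYTES_PER_TOKEN

# Reading rate implied by finishing all tokens in the quoted time.
reading_seconds = READING_YEARS * DAYS_PER_YEAR * HOURS_PER_DAY * 3600
tokens_per_second = TOKENS / reading_seconds

print(f"training text: {text_bytes:.1e} bytes")
print(f"implied reading rate: {tokens_per_second:.1f} tokens per second")
```

Under these assumptions the quoted 170,000 years corresponds to reading a little under six tokens per second, which is in the range of ordinary reading speed.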
So it seems like an enormous amount of knowledge that those systems can accumulate. But then you realize it's really not that much data. If you talk to developmental psychologists, they tell you a four-year-old has been awake for 16,000 hours in his or her life, and the amount of information that has reached the visual cortex of that child in four years is about 10 to the 15 bytes.
And you can compute this by estimating that the optic nerve carries about 20 megabytes per second, roughly. And so 10 to the 15 bytes for a four-year-old versus two times 10 to the 13 bytes for 170,000 years' worth of reading. What that tells you is that through sensory input, we see a lot more information than we do through language.
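The comparison can be reproduced as a back-of-the-envelope calculation. This is a rough sketch using only the numbers quoted in the transcript; treating a megabyte as 10^6 bytes is an assumption made here, and the variable names are illustrative.

```python
# Visual input, using the figures quoted in the transcript.
AWAKE_HOURS = 16_000                # hours a four-year-old has been awake
OPTIC_NERVE_BYTES_PER_SEC = 20e6    # "about 20 megabytes per second" (assuming 1 MB = 10**6 bytes)

# Text input, using the figures quoted in the transcript.
TOKENS = 1e13                       # "10 to the 13 tokens"
BYTES_PER_TOKEN = 2                 # "each token is typically two bytes"

# Bytes reaching the visual cortex over 16,000 waking hours.
visual_bytes = AWAKE_HOURS * 3600 * OPTIC_NERVE_BYTES_PER_SEC

# Bytes of training text.
text_bytes = TOKENS * BYTES_PER_TOKEN

# How many times larger the visual stream is than the text corpus.
ratio = visual_bytes / text_bytes

print(f"visual input: {visual_bytes:.2e} bytes")
print(f"training text: {text_bytes:.2e} bytes")
print(f"visual / text: {ratio:.1f}x")
```

With these inputs the visual estimate comes out slightly above 10 to the 15 bytes, consistent with the rounded figure in the transcript, and the visual stream is a few dozen times larger than the text corpus.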
And that despite our intuition, most of what we learn and most of our knowledge is through our observation and interaction with the real world, not through language. Everything that we learn in the first few years of life and certainly everything that animals learn has nothing to do with language.