
Yann LeCun

👤 Speaker
See mentions of this person in podcasts
1102 total appearances
Voice ID

Voice Profile Active

This person's voice can be automatically recognized across podcast episodes using AI voice matching.

Voice samples: 1
Confidence: Medium

Appearances Over Time

Podcast Appearances

Lex Fridman Podcast
#416 – Yann LeCun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI

So yeah, that's what a lot of people are working on. So the short answer is no. And the more complex answer is you can use all kinds of tricks to get an LLM to basically digest visual representations of images, or video, or audio for that matter. And a classical way of doing this is you train a vision system in some way.

Lex Fridman Podcast
#416 – Yann LeCun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI

And we have a number of ways to train vision systems, either supervised, semi-supervised, self-supervised, all kinds of different ways. That will turn any image into a high-level representation, basically a list of tokens that are really similar to the kind of tokens that a typical LLM takes as input. And then you just feed that to the LLM in addition to the text.
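A minimal sketch of the pipeline described here: a trained vision system emits patch features, a learned projection maps them into the LLM's token-embedding space, and the resulting "visual tokens" are simply concatenated with the text tokens. All sizes and the random "learned" projection are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: the vision encoder emits 16 patch features of
# dimension 512; the LLM's token embeddings have dimension 768.
patch_features = rng.normal(size=(16, 512))   # output of a trained vision system
W_proj = rng.normal(size=(512, 768)) * 0.02   # learned projection (random stand-in)

# Project each patch feature into the LLM's embedding space, so each
# patch behaves like a "visual token".
visual_tokens = patch_features @ W_proj       # shape (16, 768)

# Text tokens already live in that embedding space.
text_tokens = rng.normal(size=(10, 768))      # e.g. an embedded prompt

# "Feed that to the LLM in addition to the text": concatenate both
# along the sequence axis and hand the result to the language model.
llm_input = np.concatenate([visual_tokens, text_tokens], axis=0)
print(llm_input.shape)
```

The projection is the only new trainable piece; the rest of the LLM sees a longer sequence and cannot tell visual tokens from text tokens.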

Lex Fridman Podcast
#416 – Yann LeCun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI

And you just expect the LLM, during training, to kind of be able to use those representations to help make decisions. I mean, there's been work along those lines for quite a long time. And now you see those systems, right?

Lex Fridman Podcast
#416 – Yann LeCun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI

I mean, there are LLMs that have some vision extension, but they're basically hacks in the sense that those things are not like trained end-to-end to handle, to really understand the world. They're not trained with video, for example. They don't really understand intuitive physics, at least not at the moment.

Lex Fridman Podcast
#416 – Yann LeCun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI

We're not going to be able to do this with the type of LLMs that we are working with today. And there's a number of reasons for this. But the main reason is the way LLMs are trained: you take a piece of text, you remove some of the words in that text, you mask them, you replace them with blank markers, and you train a giant neural net to predict the words that are missing.
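The masking procedure he describes can be sketched in a few lines: some words are replaced by a blank marker, and exactly those words become the training targets. The marker string and the masking rate are arbitrary choices for illustration.

```python
import random

random.seed(0)

text = "the cat sat on the mat".split()
mask_prob = 0.3  # hypothetical masking rate

# Replace some words with a blank marker; the masked-out words
# become the prediction targets for the network.
inputs, targets = [], []
for word in text:
    if random.random() < mask_prob:
        inputs.append("[BLANK]")
        targets.append(word)       # the net must recover this word
    else:
        inputs.append(word)
        targets.append(None)       # nothing to predict at this position

print(inputs)
print([t for t in targets if t is not None])
```

Training then consists of showing the net the `inputs` sequence and penalizing it when its guesses at the `[BLANK]` positions differ from `targets`.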

Lex Fridman Podcast
#416 – Yann LeCun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI

And if you build this neural net in a particular way so that it can only look at words that are to the left of the one it's trying to predict, then what you have is a system that basically is trying to predict the next word in a text, right? So then you can feed it a text, a prompt, and you can ask it to predict the next word. It can never predict the next word exactly.
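The "can only look at words to the left" constraint is usually implemented as a causal attention mask: a lower-triangular boolean matrix where row i marks the positions token i may attend to. A toy version for a 5-token sequence:

```python
import numpy as np

# Causal mask for a 5-token sequence: position i may only attend to
# positions j <= i (words to its left, plus itself).
n = 5
causal_mask = np.tril(np.ones((n, n), dtype=bool))

# Row i shows which positions token i can see; because no row can see
# to its right, the net at position i can only use the prefix to
# predict token i + 1 -- which is exactly next-word prediction.
print(causal_mask.astype(int))
```

Feeding a prompt then amounts to filling the first rows and asking for the prediction at the last visible position.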

Lex Fridman Podcast
#416 – Yann LeCun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI

And so what it's going to do is produce a probability distribution over all the possible words in your dictionary. In fact, it doesn't predict words, it predicts tokens that are kind of subword units. And so it's easy to handle the uncertainty in the prediction there, because there is only a finite number of possible words in the dictionary, and you can just compute a distribution over them.
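Because the vocabulary is finite, the uncertainty he mentions is representable exactly: the net's raw scores (logits) are pushed through a softmax to give a probability for every token. A toy vocabulary and made-up logits:

```python
import numpy as np

# Hypothetical tiny vocabulary of subword tokens.
vocab = ["the", "cat", "sat", "mat", "##s"]

# The net's raw scores (logits) for the next token -- invented here.
logits = np.array([2.0, 0.5, 1.0, -1.0, 0.0])

# Softmax turns scores into a probability distribution over the
# finite vocabulary; subtracting the max is for numerical stability.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

print(dict(zip(vocab, probs.round(3))))
```

This is the key contrast with video or images: there is no finite dictionary of possible next frames, so the same trick does not directly apply.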

Lex Fridman Podcast
#416 – Yann LeCun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI

Then what the system does is that it picks a word from that distribution. Of course, there's a higher chance of picking words that have a higher probability within that distribution. So you sample from the distribution to actually produce a word. And then you shift that word into the input. And so that allows the system now to predict the second word, right?
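The sample-and-shift loop described above can be sketched as follows, with a random stand-in for the trained model's distribution. The vocabulary and the fake `next_token_probs` function are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat", "."]

def next_token_probs(tokens):
    # Stand-in for a trained LLM: returns a made-up distribution over
    # the vocabulary, given the tokens produced so far.
    logits = rng.normal(size=len(vocab))
    p = np.exp(logits - logits.max())
    return p / p.sum()

tokens = ["the"]          # the prompt
for _ in range(4):
    probs = next_token_probs(tokens)
    # Sample: higher-probability tokens are picked more often.
    choice = rng.choice(len(vocab), p=probs)
    # Shift the sampled token into the input, then predict again.
    tokens.append(vocab[choice])

print(" ".join(tokens))
```

Each pass through the loop is one "shift that word into the input" step: the generated token becomes part of the context for predicting the next one.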