
Yann LeCun

👤 Person
1086 total appearances

Podcast Appearances

Lex Fridman Podcast
#416 – Yann LeCun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI

Very similar way, but you have to have this way of preventing collapse, of ensuring that there is high energy for things you don't train it on. And currently it's very implicit in LLMs. It's done in a way that people don't realize is being done, but it is being done. It's due to the fact that when you give a high probability to a word,

automatically you give low probability to other words, because you only have a finite amount of probability to go around; they have to sum to one.
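The normalization being described can be sketched with a small softmax example (hypothetical logit values, plain Python): the probabilities always sum to one, so raising one word's score necessarily lowers every other word's probability.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability, then normalize.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

# Toy vocabulary scores (made-up values for illustration).
p = softmax([2.0, 1.0, 0.5])
print(sum(p))  # sums to 1 (up to floating point)

# Boost the first word's score: the other words' probabilities
# must drop, since the total mass is fixed at one.
boosted = softmax([4.0, 1.0, 0.5])
print(boosted[1] < p[1], boosted[2] < p[2])
```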

So when you minimize the cross entropy or whatever, when you train your LLM to predict the next word, you're increasing the probability your system will give to the correct word, but you're also decreasing the probability it will give to the incorrect words. Now,
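The pull-up/push-down effect in the quote falls out of the calculus: the gradient of the cross-entropy loss with respect to the logits is softmax(logits) minus the one-hot target, so gradient descent raises the correct word's score and lowers all the others. A minimal sketch with toy values:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def xent_grad(logits, target_idx):
    # d(cross-entropy)/d(logit_i) = p_i - 1[i == target]
    p = softmax(logits)
    return [p_i - (1.0 if i == target_idx else 0.0)
            for i, p_i in enumerate(p)]

# Hypothetical logits; the correct next word is index 0.
grad = xent_grad([2.0, 1.0, 0.5], target_idx=0)

# Gradient descent moves logits opposite the gradient:
# negative component for the correct word (probability rises),
# positive components for the incorrect words (probabilities fall).
print(grad[0] < 0, grad[1] > 0, grad[2] > 0)
```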

Indirectly, that gives a high probability to sequences of words that are good and low probability to sequences of words that are bad, but it's very indirect. It's not obvious why this actually works at all, because you're not doing it on a joint probability of all the symbols in a sequence.

You're just doing it by factorizing that probability in terms of conditional probabilities over successive tokens.
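The factorization in question is the chain rule, p(w1…wn) = Π p(wt | w&lt;t): the model is trained only on the per-token conditionals, and the joint probability of a whole sequence emerges as their product. A toy sketch with made-up conditional probabilities:

```python
import math

# Hypothetical next-token conditionals, keyed by (prefix, token).
# An autoregressive LM is trained on exactly these conditionals,
# never on the joint probability of the sequence directly.
cond = {
    (("<s>",), "the"): 0.6,
    (("<s>", "the"), "cat"): 0.3,
    (("<s>", "the", "cat"), "sat"): 0.5,
}

def joint_log_prob(tokens):
    # log p(sequence) = sum over t of log p(token_t | prefix_t)
    logp, prefix = 0.0, ("<s>",)
    for t in tokens:
        logp += math.log(cond[(prefix, t)])
        prefix = prefix + (t,)
    return logp

lp = joint_log_prob(["the", "cat", "sat"])
# The product of the conditionals recovers the joint: 0.6 * 0.3 * 0.5
print(math.isclose(math.exp(lp), 0.6 * 0.3 * 0.5))
```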

So we've been doing this with JEPA architectures, basically. The joint embedding, JEPA. So there, the compatibility between two things is: here's an image or a video, and here's a corrupted, shifted, or transformed version of that image or video, or a masked version. And then the energy of the system is the prediction error of the representation,

the predicted representation of the good thing versus the actual representation of the good thing. So you run the corrupted image through the system, predict the representation of the good, uncorrupted input, and then compute the prediction error. That's the energy of the system. So this system will tell you this is a good representation
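That energy computation can be sketched schematically (all functions below are toy stand-ins, not Meta's actual JEPA code): encode the corrupted input, predict the clean input's representation from it, and take the prediction error in representation space as the energy.

```python
# Toy stand-ins: in a real JEPA these would be learned networks.
def encoder(x):
    # Map an "image" (a list of floats) to a representation.
    return [2.0 * v for v in x]

def predictor(rep):
    # Predict the clean input's representation from the corrupted
    # input's representation (identity here, for illustration).
    return list(rep)

def energy(clean, corrupted):
    # Energy = squared prediction error between the predicted
    # representation and the actual representation of the clean input.
    pred = predictor(encoder(corrupted))
    target = encoder(clean)
    return sum((p - t) ** 2 for p, t in zip(pred, target))

clean = [1.0, 0.0, 1.0]
slightly_corrupted = [0.9, 0.1, 1.0]
very_corrupted = [0.0, 1.0, 0.0]

# Compatible pairs get low energy; incompatible pairs get high energy.
print(energy(clean, slightly_corrupted) < energy(clean, very_corrupted))
```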
