Yann LeCun
Podcast Appearances
If this is a good image and this is a corrupted version, it will give you zero energy if one of those two things is effectively a corrupted version of the other. It gives you a high energy if the two images are completely different.
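To make the idea concrete, here is a minimal sketch of such an energy function, with a toy encoder standing in for the trained network a real energy-based or joint-embedding model would use (everything below is illustrative, not LeCun's actual architecture): the energy is near zero when one image is a lightly corrupted copy of the other, and much larger when the two images are unrelated.

```python
import numpy as np

def encode(img: np.ndarray) -> np.ndarray:
    """Toy stand-in for a learned encoder: 4x4 average pooling.
    In a real model this would be a trained neural network."""
    h, w = img.shape
    return img.reshape(h // 4, 4, w // 4, 4).mean(axis=(1, 3)).ravel()

def energy(x: np.ndarray, y: np.ndarray) -> float:
    """Energy = squared distance between the two embeddings:
    low when y is essentially a corrupted copy of x, high when the images are unrelated."""
    ex, ey = encode(x), encode(y)
    return float(np.sum((ex - ey) ** 2))

rng = np.random.default_rng(0)
clean = rng.random((32, 32))                                # a "good" image
corrupted = clean + 0.05 * rng.standard_normal((32, 32))    # lightly corrupted copy
unrelated = rng.random((32, 32))                            # a completely different image

print("energy(clean, corrupted):", energy(clean, corrupted))   # small
print("energy(clean, unrelated):", energy(clean, unrelated))   # much larger
```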
And we know it does because then we use those representations as input to a classification system.
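A common form of that check is a linear probe: freeze the pretrained encoder and train only a linear classifier on top of its representations. The sketch below shows the pattern only; the "encoder" is a fixed random projection and the data are synthetic, both purely illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical frozen encoder learned with self-supervision; here just a fixed
# random projection plus a nonlinearity so the sketch runs end to end.
W_frozen = rng.standard_normal((1024, 64)) / np.sqrt(1024)

def encode(images: np.ndarray) -> np.ndarray:
    """Map raw inputs (N, 1024) to representations (N, 64); weights stay frozen."""
    return np.tanh(images @ W_frozen)

# Toy labelled data standing in for a downstream task.
X_raw = rng.random((200, 1024))
y = (X_raw.mean(axis=1) > 0.5).astype(int)

# Linear probe: the only thing trained is a linear classifier on the frozen features.
features = encode(X_raw)
clf = LogisticRegression(max_iter=1000).fit(features, y)
print("probe accuracy on training data:", clf.score(features, y))
```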
I don't hate reinforcement learning, and I think it should not be abandoned completely, but I think its use should be minimized because it's incredibly inefficient in terms of samples. And so the proper way to train a system is to first have it learn good representations of the world and world models from mostly observation, maybe a little bit of interaction.
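A minimal illustration of learning a world model from passive observation: fit a predictor of the next state from logged (state, action, next-state) triples, with no rewards involved. The linear dynamics below are invented for the sketch; they are not any particular system LeCun describes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unknown dynamics the agent only gets to *observe*: s' = A s + B a + noise.
A_true = np.array([[1.0, 0.1], [0.0, 1.0]])
B_true = np.array([[0.0], [0.1]])

def step(s, a):
    return A_true @ s + B_true @ a + 0.01 * rng.standard_normal(2)

# Collect passive observations (logged trajectories, no task reward involved).
states, actions, next_states = [], [], []
s = np.zeros(2)
for _ in range(500):
    a = rng.uniform(-1, 1, size=1)
    s_next = step(s, a)
    states.append(s); actions.append(a); next_states.append(s_next)
    s = s_next

X = np.hstack([np.array(states), np.array(actions)])   # (500, 3) inputs [s, a]
Y = np.array(next_states)                               # (500, 2) targets s'

# Fit the world model s' ≈ [s, a] @ W by least squares -- pure prediction, no rewards.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
print("learned [A | B]^T:\n", W.round(2))
```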
Yeah, now there's two things you can use. If you've learned a world model, you can use the world model to plan a sequence of actions to arrive at a particular objective. You don't need RL unless the way you measure whether you succeed might be inexact. Your idea of, you know, whether you were going to fall from your bike.
might be wrong, or whether the person you're fighting with MMA was going to do something and then do something else. So there's two ways you can be wrong. Either your objective function does not reflect the actual objective function you want to optimize, or your world model is inaccurate. So the prediction you were making about what was going to happen in the world is inaccurate.
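A toy sketch of planning with a learned world model instead of RL: sample candidate action sequences, roll each one out inside the model, and keep the sequence whose imagined outcome best matches the objective. The dynamics, horizon, and cost below are illustrative assumptions; either of the two failure modes above (a mis-specified objective or an inaccurate model) would make the chosen plan wrong.

```python
import numpy as np

rng = np.random.default_rng(0)

def world_model(s, a):
    """Hypothetical learned dynamics model: predicts the next state from (state, action)."""
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.0], [0.1]])
    return A @ s + B @ a

def plan(s0, goal, horizon=20, n_candidates=256):
    """Random-shooting planner: no RL, just search through the model's predictions."""
    best_cost, best_actions = np.inf, None
    for _ in range(n_candidates):
        actions = rng.uniform(-1, 1, size=(horizon, 1))
        s = s0.copy()
        for a in actions:                       # imagine the rollout inside the model
            s = world_model(s, a)
        cost = np.sum((s - goal) ** 2)          # objective: end up near the goal
        if cost < best_cost:
            best_cost, best_actions = cost, actions
    return best_actions, best_cost

s0 = np.zeros(2)
goal = np.array([1.0, 0.0])
actions, cost = plan(s0, goal)
print("planned first action:", actions[0], "predicted final cost:", float(cost))
```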
So if you want to adjust your world model, or your objective function, while you are operating in the world, that is basically in the realm of RL. This is what RL deals with to some extent, right? So, adjust your world model. And the way to adjust your world model, even in advance, is to explore parts of the space where you know that your world model is inaccurate.
That's called curiosity, basically, or play, right? When you play, you kind of explore parts of the state space that you don't want to visit for real because it might be dangerous, but you can adjust your world model without killing yourself, basically. So that's what you want to use RL for. When it comes time to learn a particular task,
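A toy sketch of the curiosity idea: treat the world model's own prediction error as an intrinsic reward, so exploration (or play) is drawn toward exactly the parts of the state space where the model is wrong. The 1-D dynamics and the un-modelled "bump" are made up for illustration.

```python
import numpy as np

# Hypothetical 1-D world: the agent's learned model is accurate except in the
# region s > 2, where an un-modelled effect kicks in.
def real_step(s, a):
    bump = 0.5 if s > 2.0 else 0.0
    return s + 0.5 * a + bump

def model_step(s, a):
    return s + 0.5 * a                      # the agent's (imperfect) world model

# Intrinsic "curiosity" reward: how wrong the model's prediction turned out to be.
def curiosity_reward(s, a):
    return abs(real_step(s, a) - model_step(s, a))

# Sweep the state space during "play": the reward is zero where the model is already
# right and large exactly where the model needs fixing -- where exploration should go.
for s in np.linspace(0.0, 4.0, 9):
    print(f"s = {s:3.1f}   curiosity reward = {curiosity_reward(s, a=0.0):.2f}")
```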