Brandon Tseng
These autonomous systems are doing the exact same thing.
And just like we learn from our experiences, these things are learning from their experiences.
Yeah, so one of the principal ways we do that is a methodology called reinforcement learning.
And so this was pioneered, there's probably academic papers, but I would say it became pretty famous with AlphaGo and AlphaStar, where AlphaGo, you know, beat a world champion at the game of Go, right?
Reinforcement learning is really a good learning methodology any time the number of variables or outcomes is massive, right?
And there are, like, multiple trillions of outcomes, right?
I couldn't tell you what the number of outcomes of the game of Go is, but,
you know, it's almost uncountable.
Yeah.
Right.
Same thing with the game of StarCraft, which was AlphaStar.
The number of moves is uncountable in terms of the number of scenarios.
That's where you find something like reinforcement learning really shine, right?
In the real world, the number of moves that you have is uncountable.
And so what you're doing is, you basically give it a goal.
And then behaviors that you believe are positive, you reinforce those behaviors.
You reward those behaviors.
And then things that are negative, you don't reward those behaviors.
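
[Editor's sketch] The loop described here (give the system a goal, reward the behaviors that help, give nothing for the ones that don't) can be illustrated with a toy example. What follows is a minimal tabular Q-learning sketch and not the speaker's actual system: the five-cell corridor, the function names (`step`, `train`, `greedy_policy`), and every parameter value are assumptions chosen purely for illustration.

```python
import random

N_STATES = 5            # hypothetical corridor with cells 0..4
GOAL = N_STATES - 1     # the "goal" we give the agent: reach the last cell
ACTIONS = (-1, +1)      # step left or step right


def step(state, action):
    """Move one cell, clamp at the walls, and reward only reaching the goal."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0  # unhelpful moves earn nothing
    return next_state, reward, next_state == GOAL


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error (epsilon-greedy Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore: try a random move
            else:
                best = max(q[state])  # exploit: best known move, random tie-break
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            nxt, reward, done = step(state, ACTIONS[a])
            # Reinforce: nudge the estimate toward reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best learned action for each non-goal cell."""
    return [ACTIONS[row.index(max(row))] for row in q[:GOAL]]


q_table = train()
print(greedy_policy(q_table))
```

After training, the learned policy should favor stepping right in every cell, because only the moves that lead toward the goal ever get reinforced. Real-world systems face vastly larger state spaces than a lookup table can hold, which is why work such as AlphaGo pairs this kind of reward signal with neural networks.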