Nick Heiner

👤 Speaker
529 total appearances

Podcast Appearances

The Neuron: AI Explained
Inside the Secret Labs Where AI Learns to Work

So the first is supervised fine-tuning, which is basically teaching by demonstration.

So the analogy here is you're learning to golf and you do it by watching a thousand hours of golf on YouTube.

And then you just try to figure out what they're doing.
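
As a rough illustration of "teaching by demonstration", here is a minimal sketch, not anything described in the episode. It assumes a toy policy that picks one of three clubs and a made-up list of expert choices (`demos`), and it fits the policy by gradient descent on the negative log-likelihood of those demonstrations. The function names and the data are hypothetical.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sft_train(demonstrations, num_actions, lr=0.5, epochs=200):
    """Fit action probabilities so the demonstrated actions become likely."""
    logits = [0.0] * num_actions
    for _ in range(epochs):
        probs = softmax(logits)
        grad = [0.0] * num_actions
        for action in demonstrations:
            for a in range(num_actions):
                # Gradient of -log p[action] with respect to logit a.
                grad[a] += probs[a] - (1.0 if a == action else 0.0)
        logits = [l - lr * g / len(demonstrations) for l, g in zip(logits, grad)]
    return softmax(logits)

# Hypothetical demonstrations: which club the expert picked
# (0 = driver, 1 = iron, 2 = putter). Assumed data for illustration only.
demos = [1, 1, 1, 0, 1, 1, 2, 1]
policy = sft_train(demos, num_actions=3)
print([round(p, 3) for p in policy])
```

The learned policy ends up mirroring how often the expert chose each club, which is the sense in which the learner only copies what it watched.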

Then there's reinforcement learning from human feedback, which is when you golf, you know, you have an instructor, you're at the driving range, you take two shots, and the coach tells you, OK, the first one was better.

And they don't necessarily even tell you what was better about it.

They just tell you one was better than the other.

And you sort of try slightly different things every time, and you start to converge on what the best thing to do is.
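
The "one was better than the other" signal can be sketched as pairwise preference learning. The example below is an assumption-laden illustration, not the speaker's method: it uses a Bradley-Terry style model (a common choice for preference data) to learn one score per shot from hypothetical coach verdicts, and it covers only the scoring step, not the full RLHF loop. The function name and the comparison data are made up.

```python
import math

def fit_preference_scores(num_items, comparisons, lr=0.1, epochs=500):
    """Learn one score per item from (winner, loser) pairs."""
    scores = [0.0] * num_items
    for _ in range(epochs):
        for winner, loser in comparisons:
            # Modeled probability that the winner is preferred.
            p = 1.0 / (1.0 + math.exp(-(scores[winner] - scores[loser])))
            # Gradient ascent on the log-likelihood of the observed verdict.
            scores[winner] += lr * (1.0 - p)
            scores[loser] -= lr * (1.0 - p)
    return scores

# Hypothetical verdicts over three shots: each tuple is (better, worse).
# The coach never says why, and is not perfectly consistent.
verdicts = [(0, 1), (0, 1), (1, 0), (1, 2), (1, 2), (2, 1), (0, 2)]
scores = fit_preference_scores(3, verdicts)
ranking = sorted(range(3), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Even with noisy verdicts and no explanation attached, the scores settle into an ordering, which is the "start to converge" part of the analogy.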

And then reinforcement learning environments take it a step further.

With RLHF, you're at the driving range and you're limited by the availability of the coach. To say what that actually is: you have humans looking at two responses from a model and choosing, you know, thumbs up, thumbs down.

But that requires humans, right?

Like you have to spend millions of hours to do that.

In a reinforcement learning environment, you're sent out on the golf course by yourself.

And you get feedback from the environment, like, okay, the ball went close to the target.

Right.

And in that way, you're able, again, to sort of self-teach in a sense, because you keep trying different things and then you keep getting that feedback of what worked and what didn't.

And yeah, you do that for a million hours and then all of a sudden you're a world-class golfer.
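
The "sent out on the course by yourself" idea can be sketched with a toy environment that scores each attempt automatically, with no human in the loop. Everything here is hypothetical: `GolfEnv`, its one-line "physics", and the learner, which is plain random hill climbing rather than the policy-gradient style methods typically used to train large models.

```python
import random

class GolfEnv:
    """Hypothetical environment: reward rises as the ball lands nearer the target."""

    def __init__(self, target=150.0):
        self.target = target

    def step(self, power):
        distance = 2.0 * power  # toy physics, assumed for illustration
        return -abs(distance - self.target)

def self_teach(env, trials=2000, seed=0):
    """Try slightly different swings and keep whatever the environment scores higher."""
    rng = random.Random(seed)
    power = 10.0
    best_reward = env.step(power)
    for _ in range(trials):
        candidate = power + rng.gauss(0.0, 5.0)
        reward = env.step(candidate)
        if reward > best_reward:
            power, best_reward = candidate, reward
    return power, best_reward

power, reward = self_teach(GolfEnv())
print(round(power, 2), round(reward, 2))
```

Because the reward comes from the environment itself, the number of trials is limited by compute rather than by how many hours a coach has available.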

Yes.

And that is exactly what they're doing: they are collecting your user feedback.

And so, it's actually somewhat funny.

You know, we've had experts in our network who spend a lot of time, you know, going into a lot of detail on these responses to assess which ones are better, and they get paid to do it.