Nick Heiner

The Neuron: AI Explained

Inside the Secret Labs Where AI Learns to Work

So the first is supervised fine-tuning, which is basically teaching by demonstration.

316.07

So the analogy here is you're learning to golf and you do it by watching a thousand hours of golf on YouTube.

321.376

And then you just try to figure out what they're doing.

329.265
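
[Editor's illustration, not part of the spoken transcript.] The "teaching by demonstration" idea can be sketched as a toy model that is nudged toward whatever the expert did. Everything below is an assumption made for illustration: the three-action setup, the `demos` list, and the `sft_train` helper are hypothetical, and real supervised fine-tuning works on text tokens with a neural network rather than three logits.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sft_train(demonstrations, num_actions, lr=0.5, epochs=200):
    """Raise the probability of each demonstrated action (maximum likelihood)."""
    logits = [0.0] * num_actions
    for _ in range(epochs):
        for action in demonstrations:
            probs = softmax(logits)
            # Gradient of log p(action) w.r.t. the logits is one_hot(action) - probs.
            for i in range(num_actions):
                target = 1.0 if i == action else 0.0
                logits[i] += lr * (target - probs[i])
    return softmax(logits)

# Hypothetical demonstrations: the expert mostly chooses action 2.
demos = [2, 2, 2, 1, 2, 2, 0, 2, 2, 2]
sft_policy = sft_train(demos, num_actions=3)
print(sft_policy)
```

The learner never finds out why the expert chose action 2; it only copies the observed behavior, which is the limitation the golf analogy points at.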

Then there's reinforcement learning from human feedback, which is when you golf, you know, you have an instructor, you're at the driving range, you take two shots and the coach tells you, OK, the first one was better.

332.367

And they don't necessarily even tell you what was better about it.

345.469

They just tell you one was better than the other.

347.232

And you sort of try slightly different things every time, and you start to converge on what the best thing to do is.

349.416
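
[Editor's illustration, not part of the spoken transcript.] The "coach only says which shot was better" signal is commonly modeled with a pairwise preference (Bradley-Terry style) objective. The sketch below assumes that formulation; the speaker does not name a specific method. The `prefs` data and the `fit_reward_model` helper are hypothetical, and this covers only the step of learning a reward from comparisons, not the later step of training a model against that reward.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_reward_model(preferences, num_items, lr=0.1, epochs=500):
    """Learn one scalar score per item from (winner, loser) comparisons."""
    rewards = [0.0] * num_items
    for _ in range(epochs):
        for winner, loser in preferences:
            # Modeled probability that the winner is preferred over the loser.
            p = sigmoid(rewards[winner] - rewards[loser])
            # Gradient ascent on log p: push the winner up and the loser down.
            rewards[winner] += lr * (1.0 - p)
            rewards[loser] -= lr * (1.0 - p)
    return rewards

# Hypothetical coach judgments over three shots: (better_shot, worse_shot).
prefs = [(0, 1), (0, 2), (1, 2), (0, 1)]
rewards = fit_reward_model(prefs, num_items=3)
print(rewards)
```

The comparisons never say what made a shot better, yet the fitted scores still recover an ordering of the shots, which is all the learner needs in order to know which direction to move in.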

And then reinforcement learning environments take it a step further.

357.697

And so at the driving range, you're limited by the availability of the coach, which, you know, to sort of say what it actually is, means you have humans looking at two responses from a model and choosing, you know, thumbs up, thumbs down.

361.841

But that requires humans, right?

377.394

Like you have to spend millions of hours to do that.

378.955

With the reinforcement learning environment, you're sent out on the golf course by yourself.

383.319

And you get feedback from the environment, like, okay, the ball went close to the target.

387.143

Right.

394.193

And in that way, you're able, again, to sort of self-teach in a sense, because you keep trying different things and then you keep getting that feedback of what worked and what didn't.

395.695

And yeah, you do that for a million hours and then all of a sudden you're a world-class golfer.

404.306
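
[Editor's illustration, not part of the spoken transcript.] The "out on the course by yourself" setup can be sketched as a loop in which the environment, not a person, scores each attempt. The `GolfEnv` class, its toy physics, and the `train` helper are all hypothetical. The learner here uses simple random-search hill climbing, swapped in for brevity; production systems typically use policy-gradient style methods instead.

```python
import random

class GolfEnv:
    """Toy environment: the reward is higher the closer the ball lands to the target."""

    def __init__(self, target=150.0):
        self.target = target

    def step(self, power):
        landing = 2.0 * power  # assumed toy physics: distance is proportional to power
        return -abs(landing - self.target)  # automatic feedback, no human involved

def train(env, episodes=2000, seed=0):
    """Try slightly different swings and keep whichever the environment scores best."""
    rng = random.Random(seed)
    best_power = 10.0
    best_reward = env.step(best_power)
    for _ in range(episodes):
        candidate = best_power + rng.gauss(0.0, 5.0)
        reward = env.step(candidate)
        if reward > best_reward:
            best_power, best_reward = candidate, reward
    return best_power, best_reward

best_power, best_reward = train(GolfEnv())
print(best_power, best_reward)
```

Because the score is computed by the environment, the number of practice attempts is limited by compute rather than by how many hours a human coach can spare.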

Yes.

415.358

And that is exactly what they're doing: they are collecting your user feedback.

415.859

And so, it's actually somewhat funny.

419.905

You know, we've had experts in our network who spend a lot of time, you know, going into a lot of detail on these responses to assess which ones are better, and they get paid to do it.

423.511