Menu
Sign In Search Podcasts Charts People & Topics Add Podcast API Pricing

Illia Polosukhin

👤 Person
552 total appearances

Appearances Over Time

Podcast Appearances

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

Well, it's all kind of half made up and half is from experience.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

They were trying to do something.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

It didn't work.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

They were changing a bunch of stuff until it worked.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

And now they're not going to go and redo everything, figuring out if other options work.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

They're just going to keep whatever worked.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

Yeah.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

And so like figuring out how to like go away from that.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

And so RL is even worse.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

RL is like literally, you know, we have no idea, but you know, hopefully like this reward function works, you know, we run it, it works great, you know, ship the paper, ship the model.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

So it's a very like kind of semi-arbitrary.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

There is no like actual science around reward distribution and kind of reward provocation.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

Well, it does that.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

It's also like, and so it's very prone to like errors because especially like there was like all this fun stories of, you know, your model figuring out that actually it can look in the file where the answers are if you give it like file system tools.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

or search or anything, it actually finds out how to get the answers.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

And this is way cheaper and better than actually thinking about stuff.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

So this is why we kind of need a better kind of training mechanisms.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

And that's why, again, from a research perspective, I look at fixed size model.

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

Can we make them better?

The Neuron: AI Explained
Illia Polosukhin: Fixing the Broken System He Helped Create

Because that effectively shows we have a better training procedure.