Menu
Sign In Search Podcasts Charts People & Topics Add Podcast API Blog Pricing

Bowen Baker

👤 Speaker
414 total appearances

Appearances Over Time

Podcast Appearances

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

So that's probably like how they would be implemented in practice.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

And I can, do you want to, if you have any questions on that, we can talk about that and then I can talk about obfuscation after.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

Oh, yeah.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

So we don't, I think there's good reason to not reveal the chain of thought to users.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

I mean, there's two cited reasons, which, so the first is, if we showed users chain of thought, and this is to your question of like, should the user be monitoring this themselves?

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

If we showed the user the chain of thought, for the same reason, we have to kind of like conform the outputs to look nice.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

So that, you know, like we don't kind of like say something offensive to a user or tell them how to do something illicit or anything like that.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

Whatever the product policy things are for, you know, in any labs product.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

We would have to then do that to the chain of thought because now we're giving them this new piece of this new thing.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

And, you know, if the chain of thought says something offensive or like actually reveals the how to do, you know, how to make the bomb or whatever the illicit thing is.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

but decides not to tell the user that, well, we've just told the user that if we give them a chain of thought.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

And so then that would require us to put these style pressures on the chain of thought.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

And that would then maybe lead to obfuscation, which we can talk about in a sec.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

And then the other one is, I think, just like distillation risks.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

So, you know, these chains of thought kind of give a lot of the reasoning of a model.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

And I think you I guess generally we don't try to just like give our model to other people to use.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

So.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

So, yeah.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

So this is like when you kind of in the intro, you were saying you mentioned that chain of thought could be fragile.

The Neuron: AI Explained
OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

And I think that is a big worry in the community right now.