Bowen Baker

👤 Speaker

414 total appearances

Podcast Appearances

The Neuron: AI Explained

OpenAI Researcher Explains How AI Hides Its Thinking (w/ OpenAI’s Bowen Baker)

And this goes back to our conversation about whether the model is doing something instinctively or is actually thinking through it actively in the chain of thought.

1498.9

There are certain pressures, or things that could happen during training, that could probably shift that balance.

1511.018

And so one is this: if you train the model to never say or think something bad,

1518.115

We showed in a paper earlier last year that it can, in some cases, still do a bad thing but not think the bad thoughts anymore.

1526.155

We think it probably can only do this in cases where the chain of thought isn't actually necessary to do the bad thing, going back to this question of necessity.

1536.795

of the chain of thought. But it still shows that there's actually some propensity for models to reason out loud even when they don't need to, and you can kind of destroy that propensity with style pressures like that. And we call that obfuscation: the model is obfuscating its actual thinking.

1545.19

And then there are a lot of other reasons we think chains of thought could become less information-dense or otherwise less monitorable.

1567.229

So, for instance, people are worried even about just large-scale RL, or reinforcement learning, which is the way we train these models, done in just the normal way we're doing it.

1579.35

You could imagine that.

1589.547

as you increase the compute there, models could eventually learn their own languages or change the way that they're using the English language or whatever language they're thinking in.

1591.31

At the current scale, and this is something we found in our recent paper,

1603.029

It doesn't seem that worrying, but in the reinforcement learning literature there are examples: people were working on emergent language and emergent communication back in, I don't know, 2016 to 2018 or 2019 or so.

1608.117

And they showed, in very constrained settings, that if you put in models that have no language prior, that are just totally randomly initialized, they can learn some language to talk to each other with.

1621.595

And...

1634.472

Exactly.

1639.757

Yeah, yeah, yeah.

1640.397

And so this might happen, where the model would learn its own internal language of how to think that's maybe a bit more efficient or compressed than the English language.

1641.018

And then it's harder for us to understand it ourselves.

1651.448

Or for a different model to understand, one that hasn't learned that same specific language.

1654.731

Yeah.

1681.338