Rob Wiblin

Speaker
3881 total appearances

Podcast Appearances

80,000 Hours Podcast
What it's really like to run AGI safety at Google DeepMind (and where I disagree with 'doomers') | Rohin Shah

So just to check that I've understood, you're saying because we are using GPUs and TPUs structurally, that forces the thought of the model to be, I guess, very wide.

It can like have many things in its mind at one time, but there can't be very many steps because you have to be going through all of these things in parallel.

But if it was very deep, then you wouldn't be able to do them all simultaneously.

You would have to wait until the earlier steps were done to do the later steps.

And this is just something that is going to remain the case for years to come.

So I don't fully understand this idea of continuous chain of thought, but isn't there this notion that basically at the end of thinking for a little bit, currently we force the models to output a word, a token, and then we feed that back into the start of the model again.

But why, rather than compress all of its thoughts down into a single token or word or whatever, why don't we just keep the full distribution of all of the thoughts and then feed them back into the beginning again?

Wouldn't that allow it to kind of preserve more information rather than basically throwing a bunch of it out?
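
The difference between the two feedback schemes in the question can be sketched in a few lines of Python. This is only a toy illustration, not how any particular model implements continuous chain of thought: the vocabulary, the 2-d embeddings, and the idea of feeding back a probability-weighted mix of embeddings are all assumptions made up for the example.

```python
# Hypothetical toy vocabulary and 2-d embeddings (not taken from any real model).
VOCAB = ["yes", "no", "maybe"]
EMBED = {"yes": [1.0, 0.0], "no": [0.0, 1.0], "maybe": [0.5, 0.5]}


def hard_feedback(probs):
    """Discrete chain of thought: commit to one token, feed back only its embedding."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    token = VOCAB[best]
    return token, EMBED[token]


def soft_feedback(probs):
    """Continuous variant (assumed form): feed back a probability-weighted
    mix of all token embeddings instead of a single token's embedding."""
    dim = len(EMBED[VOCAB[0]])
    return [
        sum(p * EMBED[tok][d] for p, tok in zip(probs, VOCAB))
        for d in range(dim)
    ]


probs = [0.5, 0.3, 0.2]            # the model's distribution over VOCAB at one step
token, hard_vec = hard_feedback(probs)
soft_vec = soft_feedback(probs)

print(token, hard_vec)             # the 0.3 and 0.2 mass is discarded here
print(soft_vec)                    # the vector still reflects "no" and "maybe"
```

In the hard case the information about the less likely tokens is gone after the step, while the soft vector still carries it forward, but there is no longer a single readable token at that step.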

And if that was a much more effective way of thinking and reasoning, and then you have no stage where you're actually outputting a token that a human being can read, then wouldn't that be a force that would potentially make them much more opaque to us?

So you're saying, even if you are using continuous chain of thought, you can still go back and say, well, what if we had forced output tokens at all of these intermediate stages?

What would that probably have looked like?

Let's just take the most likely word or the most likely token at each step and then read that.
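
The "take the most likely token at each step and read that" idea can be sketched as a greedy readout over per-step distributions. The vocabulary and the probabilities below are invented for illustration, and the sketch assumes each latent step can be turned into a distribution over tokens in the first place.

```python
# Hypothetical vocabulary; the distributions below are made-up numbers.
VOCAB = ["the", "answer", "is", "four", "five"]


def greedy_readout(step_distributions, vocab=VOCAB):
    """Return the single most likely token at each latent step."""
    tokens = []
    for probs in step_distributions:
        best = max(range(len(probs)), key=lambda i: probs[i])
        tokens.append(vocab[best])
    return tokens


# One probability distribution over VOCAB per intermediate step.
steps = [
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.10, 0.60, 0.10, 0.10, 0.10],
    [0.05, 0.05, 0.80, 0.05, 0.05],
    [0.05, 0.05, 0.10, 0.60, 0.20],
]

print(" ".join(greedy_readout(steps)))  # the answer is four
```

The readout shows only the top token per step, so whatever sits in the rest of each distribution (for example the 0.20 on "five" in the last step) is not visible in the text a human would read.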

I guess there might be a concern that it could hide a second kind of track of thought in the tail of the probability distribution, things that you wouldn't be likely to read.

But you're saying it seems like it kind of only has actually one train of thought here.

There's not like a second hidden chain of thought that you wouldn't be able to inspect.

Well, I guess even setting aside whether it performs as well according to some benchmark, you could at least see... You're saying keep just the top word or the top few tokens, the few most probable ones, throw out everything else.

You could see whether it does a different thing.

Does it lead to a different recommendation or a different outcome?
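
The check described here, keeping only the top few tokens and seeing whether the outcome changes, might look roughly like the sketch below. Everything in it is hypothetical: `decide` is a stand-in for "the rest of the model", and `WEIGHTS` and the probabilities are arbitrary numbers chosen so that the tail matters in one case.

```python
# Hypothetical weights standing in for how later computation uses each token.
WEIGHTS = [1.0, 1.0, -2.0]


def keep_top_k(probs, k):
    """Zero out everything except the k most probable entries, then renormalize."""
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return [probs[i] / total if i in top else 0.0 for i in range(len(probs))]


def decide(probs):
    """Toy downstream decision: a weighted score turned into a recommendation."""
    score = sum(p * w for p, w in zip(probs, WEIGHTS))
    return "approve" if score > 0 else "reject"


def outcome_changes(probs, k):
    """True if discarding the tail of the distribution changes the decision."""
    return decide(probs) != decide(keep_top_k(probs, k))


probs = [0.5, 0.1, 0.4]
print(outcome_changes(probs, 1))  # True: with only the top token, "reject" flips to "approve"
print(outcome_changes(probs, 2))  # False: the top two tokens preserve the decision
```

If truncation never changed the outcome across many inputs, that would be some evidence that nothing important is being carried in the tail. If it did change, as in the `k=1` case here, the discarded part of the distribution was doing real work.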