Rob Wiblin

Speaker
3881 total appearances

Podcast Appearances

80,000 Hours Podcast
I Know How to Build Safe Superintelligence | Yoshua Bengio, the most-cited AI researcher

That's right.

So I think the gloss that I got, when I heard about this idea nine or 12 months ago, was that the core thing is that the scientist AI is not an agent, that it is indifferent about states of the world.

Like a weather forecasting model doesn't care what the weather is.

It just tries to predict what the weather is going to be.

And this kind of model, it would spit out probabilities of things being true or false, but it wouldn't care what state of the world it is in.

And it wouldn't be able to take actions by design.

Is that kind of a core part in your mind?

As I understand it, you think this is actually maybe more consistent with agency than people have appreciated.

I think at one point a criticism of the plan was that it would be too easy to convert this kind of oracle into an agent, because you would just be able to ask the oracle: would we accomplish this goal if we took this action?

And it would give you the probability and you could just try to increase that probability and choose that action.
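
The criticism described here, wrapping a pure predictor in a loop that picks whichever action it rates most likely to achieve a goal, can be sketched in a few lines. This is only an illustration of that argument, not anything from the episode: the `Oracle` type, `choose_action`, `toy_oracle`, and the probabilities in it are all hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical oracle type: maps (goal, action) to an estimate of
# P(goal achieved | action taken). Not an API from any real system.
Oracle = Callable[[str, str], float]


def choose_action(oracle: Oracle, goal: str, actions: Sequence[str]) -> str:
    """Return the candidate action the oracle rates most likely to achieve the goal."""
    return max(actions, key=lambda action: oracle(goal, action))


def toy_oracle(goal: str, action: str) -> float:
    """Stand-in predictor with made-up probabilities, for illustration only."""
    made_up = {
        ("stay dry", "carry umbrella"): 0.9,
        ("stay dry", "carry nothing"): 0.4,
    }
    return made_up.get((goal, action), 0.0)


best = choose_action(toy_oracle, "stay dry", ["carry nothing", "carry umbrella"])
print(best)
```

The predictor itself never acts or prefers an outcome; the goal-directed behavior comes entirely from the outer `max` over candidate actions, which is the point of the criticism.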

Is the idea that you would do something like that basically, but you would be able to preserve some of the safety characteristics of the original model?

So the concern is that the predictor might be able to figure out basically gaps in the guardrail or weaknesses in the guardrail and exploit those if it-

Not the predictor, a policy that is trained separately and in a way that is adversarial could do that in principle.

And you were saying the guardrail, if it can assess its own uncertainty, then it will usually be able to identify cases where it might be at risk of getting tricked in this case.

But perhaps it's an adversarial example?

Is that basically...?

Yes, exactly.

And you feel like you've proven this mathematically?

Yeah, so it might be more trusted, but if it's much less intelligent in general, just because less compute went into it, less data went into it, then I guess it could just still be outfoxed by the original model, right?

That's right.

You said that over the last six months, I guess you were keen on this idea a year ago, but you've become a lot more optimistic about it over the last six months.