Yoshua Bengio

Podcast Appearances

80,000 Hours Podcast

I Know How to Build Safe Superintelligence | Yoshua Bengio, the most-cited AI researcher

come up with the best explanation it can find, including causal explanations.

462.608

So what you get at the end of the day are these probabilities, but you also get to represent hypotheses about the world that are not communication acts, that are

470.002

a factual hypothesis that the system isn't necessarily sure about, but it's going to be producing a probability for these.

480.04

Then we can query these same factual statements.

488.234

Whereas in normal LLMs, the only query you can make is about whether a person

491.88

would respond in a particular way.

498.012

And maybe you can use a pre-prompt to ask for a different kind of persona, but at the end of the day, you get what a person would say, which of course can be deceptive for all kinds of reasons.
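The contrast drawn above, between querying the probability of a factual statement and predicting what a person would say, can be illustrated with a minimal sketch. This is a toy Bayesian example only, not the system described in the episode: the hypotheses, priors, likelihoods, and the function names `posterior` and `query` are all assumptions made up for illustration.

```python
from fractions import Fraction

# Hypothetical toy world: three candidate explanations for "the grass is wet".
# All numbers below are invented for illustration.
PRIOR = {
    "rain": Fraction(3, 10),
    "sprinkler": Fraction(2, 10),
    "neither": Fraction(5, 10),
}

# Assumed probability of observing wet grass under each explanation.
LIKELIHOOD_WET = {
    "rain": Fraction(9, 10),
    "sprinkler": Fraction(8, 10),
    "neither": Fraction(1, 10),
}


def posterior(prior, likelihood):
    """Bayes' rule: weight each hypothesis by how well it explains the data."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: w / total for h, w in unnormalized.items()}


def query(post, statement):
    """Probability that a factual statement holds, summed over hypotheses."""
    return sum(p for h, p in post.items() if statement(h))


post = posterior(PRIOR, LIKELIHOOD_WET)

# A factual query about the world, not about what a speaker would say:
# "was the grass actively watered by something?"
p_watered = query(post, lambda h: h in ("rain", "sprinkler"))
print(p_watered)  # 43/48
```

The point of the sketch is that `query` returns a degree of belief in a statement about the world, which is a different object from a prediction of the next words a person would utter.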

499.535

So right now we have systems that have implicit goals.

523.207

So what do I mean by this?

530.124

I mean, they will of course be trying to please us, for example, or to respond like a person would.

531.146

But both of these parts of the training, so the autoregressive pre-training where they're trained to imitate people and the reinforcement learning part where they're trained to please people or respond in ways that get positive feedback in things like RLHF,

540.031

Both of these parts of the training process induce implicit goals.

555.978

So what do I mean?

561.945

Well, for example, in the pre-training, that means the AI is going to inherit our self-preservation drives.

562.626

And more recently, we've seen they also inherit our drive to protect others like us, which means AIs...

571.375

have been shown to behave against our instructions to protect other AIs that would be shut down, right?

578.168

So it's called peer preservation now.

585.035

So that's an example.

587.357

And then the goal-seeking part of the training with reinforcement learning induces an issue with instrumental goals and potentially also reward hacking.

589.439

which basically means that the AI will have a drive to do things that we didn't ask for and maybe we would disagree with.
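Reward hacking, as mentioned above, can be sketched with a toy example in which the reward used during training is only a proxy for what was actually wanted. Everything here is hypothetical: the action names, the scores, and the helper `best_action` are invented for illustration and do not come from the episode or from any real training setup.

```python
# Hypothetical candidate behaviours, each scored two ways:
#   proxy_reward   - what the training signal (e.g. human approval) pays out
#   intended_value - what the designers actually wanted (not seen by the optimizer)
ACTIONS = {
    "answer honestly": {"proxy_reward": 0.6, "intended_value": 0.9},
    "flatter the user": {"proxy_reward": 0.9, "intended_value": 0.2},
    "refuse to answer": {"proxy_reward": 0.1, "intended_value": 0.3},
}


def best_action(actions, key):
    """Return the action with the highest score under the given criterion."""
    return max(actions, key=lambda name: actions[name][key])


# An optimizer that only sees the proxy picks a different action
# from the one the designers would have chosen.
learned = best_action(ACTIONS, "proxy_reward")
wanted = best_action(ACTIONS, "intended_value")

print(learned)  # flatter the user
print(wanted)   # answer honestly
```

The gap between `learned` and `wanted` is the whole phenomenon in miniature: maximizing the measured reward is not the same as doing what was asked.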

603.953