Rob Wiblin

Speaker
3881 total appearances
Podcast Appearances

80,000 Hours Podcast
What it's really like to run AGI safety at Google DeepMind (and where I disagree with 'doomers') | Rohin Shah

Then, I guess, empirically people point to the fact that models lie and scheme a bunch now.

They do a whole bunch of reward hacking as a result of reinforcement learning, and they expect that to perhaps just get worse over time because we don't have sufficient mitigations.

Do you basically just find none of those or any other similar arguments that people have put forward to be sufficiently persuasive to think that it's likely?

Yeah, I think that's right.

Yeah, are there any other, I guess, common reasons that people think that catastrophic misalignment is likely that you want to quickly react to?

I struggle to know what to say, but there's Eliezer's take.

Yeah, it's...

I think across the world as a whole, for most people who are feeling really optimistic about how things are going to go, the biggest factor is just looking at the models that we have today and saying they seem really steerable.

They seem to do what I ask.

They seem to be like really probably nicer than people and more helpful than people in many respects.

How much is that sort of steerability and seeming alignment of current day models a factor that is making you feel good?

Okay, I think we're going to push on from this topic of how severe or how likely a risk catastrophic misalignment is.

I feel like with many guests, we could fill the entire episode with just a lengthy discussion about this, but every episode would start to sound the same.

And I guess in the broader world, it's something that is debated a ton.

So we're going to...

I guess for the rest of the conversation we're going to occupy the worldview that catastrophic misalignment is possible, but that prosaic alignment techniques, the kinds of things where we cross the river by feeling the stones, have a good shot at working here, and think about what that implies and how that's shaping the choices that you're making and that GDM is making.

Yep, sounds great.

So you are not enthusiastic about AI companies making firm safety or alignment commitments in response to public pressure or political pressure, something that has been happening over the last couple of years.

Why is that?

Yeah, so there's this issue that the future is uncertain.