Menu
Sign In Search Podcasts Charts People & Topics Add Podcast API Blog Pricing

Ryan Kidd

๐Ÿ‘ค Speaker
958 total appearances

Appearances Over Time

Podcast Appearances

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

And people are like, we're never going to put it on the internet.

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

Who would do that?

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

It's crazy.

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

And now they're on the internet.

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

And notably, the world hasn't ended yet.

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

That's not to say it will stay that way.

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

You know, certainly a thing you don't want to do with the superintelligence is let it out of the box.

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

But...

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

Yeah, it does seem like we're in a better scenario than many imagined.

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

Now, there, of course, like we could be in the calm before the storm, right?

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

It might well be that there's what they call a sharp left turn or just a radical change in the way AIs internally kind of process information.

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

and they might acquire these kind of coherent long-run objectives.

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

I could point to like Matt's mentor Alex Turner's conception of shard theory as an example for how this might happen, right?

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

So like instead of AI systems being this, like you know, containing like a single MISA optimizer that is kind of coherently forming under training, right?

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

If you remember the old Evan Hubinger paradigm,

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

Your outer optimizer loop, which is training your AI system, causes it to develop an internal optimizer architecture, which then can have its own goals that differ quite a lot from the training objective.

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

And presumably, there's some counting arguments, such as there are arbitrarily many ways

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

to have this MISA optimizer form to produce the right outputs because this thing is clever.

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

And if its main goal is to produce paperclips or some other thing, then it's going to realize it's in a training process and it's going to give you the output you want no matter what its goal is.

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

We still could be in store for that kind of thing.