
Ryan Kidd

👤 Speaker
958 total appearances

Appearances Over Time

Podcast Appearances

Future of Life Institute Podcast
Can AI Do Our Alignment Homework? (with Ryan Kidd)

Yeah, I think the first MATS cohorts were a little bit more directionless than the later cohorts.

Definitely, I think safety research really kicked into gear after we had ChatGPT.

Not to say that was the only cause, but there were a lot of things happening around that time.

And I think that...

Definitely larger, more capable models have enabled certain types of essential safety research you could not do with smaller models.

We're talking like interpretability on models that actually have coherent concepts embedded in them.

Though we'll say there's probably plenty of work to still be done on GPT-2 small.

But linear probes and whatnot at a high level can target some of our frontier models.

You know, Qwen, these Chinese models are particularly good for that.

Certain types of debate, like we had the first interesting empirical debate paper only after models were good enough to debate.

And there's many, many other such examples.

Like all the control literature I think just could not have happened as well.

Sorry if that's too much.

I mean, yeah, for plenty of interpretability research, people aren't using the frontier models.

You don't have access to them.

I mean, sure, people in the labs are, but at MATS, there's tons of really excellent papers that keep getting produced, and from many other sources, right?

EleutherAI, FAR AI, et cetera.

They're doing world-class interpretability research on sub-frontier models.

Because today's sub-frontier model, today's Qwen or DeepSeek or Llama or whatever, it's like yesterday's frontier model in terms of capabilities.

We're at that point where these models are all above the waterline for doing really excellent research.