Ryan Kidd
Yeah, I think the first MATS cohorts were a little bit more directionless than the later cohorts.
Definitely, I think safety research really kicked into gear after we had ChatGPT.
Not to say that was the only cause, but there were a lot of things happening around that time.
And I think that...
Definitely larger, more capable models have enabled certain types of essential safety research you could not do with smaller models.
We're talking like interpretability on models that actually have coherent concepts embedded in them.
Though I'll say there's probably plenty of work still to be done on GPT-2 Small.
But linear probes and whatnot at a high level can target some of our frontier models.
You know, Qwen, these Chinese models are particularly good for that.
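A linear probe of the kind mentioned here is just a supervised linear classifier trained on a model's hidden activations to test whether a concept is linearly encoded. A minimal sketch, using synthetic activations as a stand-in for real residual-stream states from an open-weights model (loading an actual Qwen or Llama checkpoint is out of scope here):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for hidden activations: in real probing work these
# would be residual-stream states extracted from an open-weights model.
# Here we fabricate data where a binary concept is linearly encoded
# along one direction, plus Gaussian noise.
d_model = 64
n = 2000
concept_dir = rng.normal(size=d_model)
concept_dir /= np.linalg.norm(concept_dir)
labels = rng.integers(0, 2, size=n)
acts = rng.normal(size=(n, d_model)) \
    + np.outer(2.0 * labels - 1.0, concept_dir) * 2.0

# The probe itself: logistic regression on activations,
# trained with plain batch gradient descent.
w = np.zeros(d_model)
b = 0.0
lr = 0.1
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(acts @ w + b)))  # predicted P(label=1)
    grad = p - labels                           # dLoss/dlogit
    w -= lr * (acts.T @ grad) / n
    b -= lr * grad.mean()

# High accuracy suggests the concept is linearly readable
# from these activations.
acc = ((acts @ w + b > 0) == labels.astype(bool)).mean()
print(f"probe accuracy: {acc:.2f}")
```

The same recipe applies to real activations: cache hidden states for labeled prompts, then fit the probe on those vectors instead of synthetic ones.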
Certain types of debate, like we had the first interesting empirical debate paper only after models were good enough to debate.
And there's many, many other such examples.
Like all the control literature I think just could not have happened as well.
Sorry if that's too much.
I mean, yeah, for plenty of interpretability research, people aren't using the frontier models.
You don't have access to them.
I mean, sure, people in the labs are, but at MATS there are tons of really excellent papers that keep getting produced, and from many other sources, right?
EleutherAI, FAR AI, et cetera.
They're doing world-class interpretability research on sub-frontier models.
Because today's sub-frontier model, today's Qwen or DeepSeek or Llama or whatever, is like yesterday's frontier model in terms of capabilities.
We're at that point where these models are all above the waterline for doing really excellent research.