Menu
Sign In Search Podcasts Libraries Charts People & Topics Add Podcast API Blog Pricing

Scott Alexander

πŸ‘€ Speaker
4620 total appearances

Appearances Over Time

Podcast Appearances

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

Here are some cool demos.

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

Like,

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

Otherwise, if you keep it a secret, then... Well, yeah.

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

So that's an example of transparency.

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

And then in the lead-up to that, I just want to see more benchmark scores and more freedom of speech for employees to talk about their predictions for AGI timelines and stuff.

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

So that, like, blah, blah, blah, blah, blah.

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

And then for the model spec thing, this is a constitution of power thing, but also an alignment thing.

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

Like...

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

The goals and values and principles and intended behaviors of your AIs should not be a secret, I think.

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

You should be transparent about, like, here are the values that we're putting into them.

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

Right.

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

Another example of this, by the way, so first of all, kudos to OpenAI for publishing their model spec.

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

They didn't have to do that.

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

I think they might have been the first to do that, and it's a good step in the right direction.

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

If you read the actual spec, it has like a sort of escape clause where it's like there's some important policies that are top-level priority in the spec that overrule everything else that...

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

we're not publishing and the model is instructed to keep secret from the user.

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

And it's like, what are those?

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

That seems interesting.

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

I wonder what that is.

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model β€” Scott Alexander & Daniel Kokotajlo

I bet it's nothing suspicious right now.