Scott Alexander (Astral Codex Ten)

Links For February 2025

I assumed it was just hot air, but I recently heard a theory that we should thank California and other blue states for enacting state-level net neutrality laws.

ISPs chose to follow the strictest states' laws rather than slice and dice.

I think this is probably not true, because California's law was delayed until 2021 and nothing bad happened in the 2017-2021 period, but I welcome comments from people who know more.

4. Jack Gawler, who generated many of the images I used in the AI art Turing test, has a blog post on his experience, "The Turing Test for Art: How I Helped AI Fool the Rationalists."

5. Surprising AI safety results.

If you fine-tune an AI to write deliberately insecure code, the AI becomes evil in every other way too.

For example, it will name Hitler as its favourite person and recommend the user commit suicide.

Anders Sandberg proposes, link in post, that maybe, quote, it is shaped by going along a vector opposite to typical RLHF training aims, then playing a persona that fits, end quote.

Eliezer calls it, quote, possibly the best AI news of 2025 so far.

It suggests that all good things are successfully getting tangled up with each other as a central preference vector, end quote.