Steven Byrnes
I meant, like AI in the colloquial sense, AI that qualifies as a mind, like LLMs.
I'm mainly talking about human minds and LLM minds, that is, all the minds we've ever seen in the real world rather than in sci-fi.
And hey, what a coincidence: approximately 100% of those minds are not ruthless sociopaths.
Me.
As it happens, the threat model I'm working on is not LLMs, but rather brain-like Artificial General Intelligence (AGI), which, from a safety perspective, is more or less a type of actor-critic model-based reinforcement learning (RL) agent.
LLMs are profoundly different from what I'm working on.
Saying that LLMs will be similar to RL-agent AGI because both are AI is like saying that LLMs will be similar to the A* search algorithm because both are AI, or that a frogfish will be similar to a human because both are animals.
They can still be wildly different in every way that matters.
Are people worried about LLMs causing doom?
Optimist.
Okay, but lots of other doomers talk about LLMs causing doom.
Me.
Well, kinda.
I think we need to tease apart two groups of people.
Both are sometimes called doomers, but one is much more pessimistic than the other.
This is very caricatured, but the comparatively less pessimistic group (say, a probability of doom, i.e. probability of human extinction from AI assuming progress continues, in the 5% to 50% range) is the bigger group, and I vaguely associate them with the centre of gravity of the effective altruism movement and with Anthropic employees.
They definitely do not expect ruthless sociopath ASI as the default path we're on, absent a technical breakthrough, like I'm arguing for here.
At most, they'll entertain the idea of ruthless sociopath ASI as an odd hypothetical or as a result of a competitive race to the bottom or from egregiously careless programmers or bad actors, etc.
They're probably equally or more concerned about lots of other potential AI problems: AI-assisted bioterrorism, dictatorships, etc.