The Neuron: AI Explained
“Artificial General Intelligence Is Coming”, Ex-OpenAI Leopold Aschenbrenner, Situational Awareness
So if you suddenly had a million PhD level AI models that could just do more AI research, doesn't that pretty much guarantee that you will have ASI very shortly afterwards?
Okay, let's take a breather for a second.
We were just talking about GPT-2 and how it was no smarter than a preschooler just a few moments ago, and now we're talking about Skynet?
Like, geez, that's a bit of a jump, isn't it?
The specific argument behind it goes roughly like this.
If you want to say, let's take the same improvement from GPT-2 to GPT-4 and see what happens if we continue,
then you first have to know how much actual improvement there was from GPT-2 to GPT-4, right?
Leopold does this by measuring how much compute went into training these models.
Again, remember that GPT-2 to GPT-4 was simply a matter of more training with the same general model architecture.
Leopold breaks it down into three buckets.
Bucket number one is just how big the computers are that you use to train.
Bigger, more efficient computers are more powerful, and therefore you can put more training into these models.
Big machine is better, basically.
The second bucket is how you train those models, the algorithms that you run, the methods that you use.
For example, you could train a model in a different way such that the whole thing runs faster and is more efficient.
Then the third bucket is sort of a miscellaneous bucket of rather small changes that actually translate into very big wins.
For example, people discovered that asking ChatGPT to think step by step through a problem before actually solving it improved its accuracy.
Just a small little change, but that actually resulted in a big win in its reasoning capability.
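The "step by step" trick is just a change to the prompt text, not the model. Here's a toy sketch of what that looks like; `ask_model` would be whatever chat-model API you're calling, and both function names here are hypothetical illustrations, not a real library:

```python
# Toy illustration of chain-of-thought ("think step by step") prompting.
# The only difference between the two prompts is the appended instruction;
# that small textual change is what improved reasoning accuracy in practice.

def make_prompts(question: str) -> dict:
    """Build a direct prompt and a chain-of-thought prompt for the same question."""
    return {
        "direct": question,
        "chain_of_thought": f"{question}\n\nLet's think step by step.",
    }

prompts = make_prompts(
    "A bat and a ball cost $1.10 total. The bat costs $1.00 "
    "more than the ball. How much does the ball cost?"
)
# You would then send prompts["chain_of_thought"] to the model, e.g.:
#   answer = ask_model(prompts["chain_of_thought"])  # hypothetical API call
```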
Leopold's estimate is that from GPT-2 to GPT-4, the models got somewhere like six to eight orders of magnitude more training.
And each order of magnitude is a factor of 10.
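To make that concrete, here's the arithmetic behind the orders-of-magnitude framing (the six-to-eight figure is the speaker's estimate, not something computed here):

```python
# An order of magnitude (OOM) is a factor of 10, so n orders of
# magnitude multiply a baseline quantity by 10**n.
def scale_up(baseline: float, orders_of_magnitude: int) -> float:
    return baseline * 10 ** orders_of_magnitude

# Six OOMs turn 1 unit of training into a million units;
# eight OOMs turn it into a hundred million.
print(scale_up(1, 6))  # 1000000
print(scale_up(1, 8))  # 100000000
```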