Dario Amodei
Podcast Appearances
But, you know, if I look at kind of like large and small fluctuations that lead to electrical noise, they have this decaying 1 over X distribution. And so now I think of, like, patterns in the physical world, right? Or in language. If I think about the patterns in language, there are some really simple patterns. Some words are much more common than others, like "the".
Then there's basic noun-verb structure. Then there's the fact that nouns and verbs have to agree, they have to coordinate. And there's the higher level sentence structure. Then there's the thematic structure of paragraphs.
And so the fact that there's this regressing structure, you can imagine that as you make the networks larger, first they capture the really simple correlations, the really simple patterns, and there's this long tail of other patterns. And if that long tail of other patterns is really smooth, like it is with the 1/f noise in physical processes like resistors, then you can imagine as you make the network larger, it's kind of capturing more and more of that distribution.
And so that smoothness gets reflected in how good the models are at predicting and how well they perform. Language is an evolved process, right? We've developed language. We have common words and less common words. We have common expressions and less common expressions. We have ideas, cliches that are expressed frequently, and we have novel ideas.
And that process has developed, has evolved with humans over millions of years. And so the guess, and this is pure speculation, would be that there's some kind of long-tail distribution of these ideas.
If you have a small network, you only get the common stuff, right? If I take a tiny neural network, it's very good at understanding that, you know, a sentence has to have, you know, verb, adjective, noun, right? But it's terrible at deciding what those verb, adjective, and noun should be and whether they should make sense. If I make it just a little bigger, it gets good at that.
Then suddenly it's good at the sentences, but it's not good at the paragraphs. And so these rarer and more complex patterns get picked up as I add more capacity to the network.
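[Editor's illustration, not part of the conversation.] The long-tail picture described above can be sketched with a toy calculation. This is only a sketch under loud assumptions: pattern frequencies are taken to follow a Zipf-like 1/rank law, and a "network" of a given capacity is assumed to capture exactly the most frequent patterns up to that capacity. The names `zipf_weights` and `missed_mass` are hypothetical and were not mentioned by the speaker.

```python
def zipf_weights(n_patterns: int, exponent: float = 1.0) -> list[float]:
    """Normalized frequencies proportional to 1 / rank**exponent (assumed long tail)."""
    raw = [1.0 / (rank ** exponent) for rank in range(1, n_patterns + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def missed_mass(weights: list[float], capacity: int) -> float:
    """Probability mass of the patterns a model of this capacity does NOT capture.

    Assumption: the model captures the `capacity` most frequent patterns first.
    """
    return sum(weights[capacity:])


if __name__ == "__main__":
    weights = zipf_weights(100_000)
    # Each tenfold increase in capacity removes a further slice of the tail.
    for capacity in (10, 100, 1_000, 10_000, 100_000):
        print(f"capacity={capacity:>7}  missed mass={missed_mass(weights, capacity):.4f}")
```

Under these assumptions the missed mass shrinks steadily as capacity grows, rather than dropping off a cliff, which is the toy analogue of rarer patterns being picked up gradually as capacity is added.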
I don't think any of us knows the answer to that question. My strong instinct would be that there's no ceiling below the level of humans, right? We humans are able to understand these various patterns. And so that makes me think that if we continue to scale up these models, to kind of develop new methods for training them and scaling them up, that will at least get to the level that we've gotten to with humans.
There's then a question of, you know, how much more is it possible to understand than humans do? How much is it possible to be smarter and more perceptive than humans? I would guess the answer has got to be domain dependent.
If I look at an area like biology, and I wrote this essay, Machines of Loving Grace, it seems to me that humans are struggling to understand the complexity of biology, right? If you go to Stanford or to Harvard or to Berkeley, you have whole departments of, you know, folks trying to study, you know, like the immune system or metabolic pathways, and each person understands only a tiny bit, part of it, specializes.
And they're struggling to combine their knowledge with that of other humans. And so I have an instinct that there's a lot of room at the top for AI to get smarter.
If I think of something like materials in the physical world or, you know, like addressing, you know, conflicts between humans or something like that, I mean, you know, it may be that some of these problems are not intractable, but much harder. And it may be that there's only so well you can do with some of these things, right?
Just like with speech recognition, there's only so clear I can hear your speech. So I think in some areas, there may be ceilings that are very close to what humans have done. In other areas, those ceilings may be very far away. And I think we'll only find out when we build these systems. It's very hard to know in advance. We can speculate, but we can't be sure.
Yeah. I think in many cases, you know, in theory, technology could change very fast. For example, all the things that we might invent with respect to biology, but remember, there's a clinical trial system that we have to go through to actually administer these things to humans.
I think that's a mixture of things that are unnecessary and bureaucratic and things that kind of protect the integrity of society. And the whole challenge is that it's hard to tell. It's hard to tell what's going on. It's hard to tell which is which, right? My view is definitely... I think in terms of drug development, my view is that we're too slow and we're too conservative.
But certainly, if you get these things wrong, it's possible to risk people's lives by being too reckless. And so at least some of these human institutions are, in fact, protecting people. So it's all about finding the balance. I strongly suspect that balance is kind of more on the side of pushing to make things happen faster, but there is a balance. If we do hit a limit...
Idea limited? So a few things. Now we're talking about hitting the limit before we get to the level of humans and the skill of humans. So I think one that's popular today and I think could be a limit that we run into, like most of the limits, I would bet against it, but it's definitely possible, is we simply run out of data. There's only so much data on the internet.
And there's issues with the quality of the data, right? You can get hundreds of trillions of words on the internet, but a lot of it is repetitive or it's search engine optimization drivel, or maybe in the future, it'll even be text generated by AIs themselves. And so I think there are limits to what can be produced in this way.
That said, we, and I would guess other companies, are working on ways to make data synthetic, where you can use the model to generate more data of the type that you have already or even generate data from scratch.
If you think about what was done with DeepMind's AlphaGo Zero, they managed to get a bot all the way from no ability to play Go whatsoever to above human level just by playing against itself. There was no example data from humans required in the AlphaGo Zero version of it.