Gwern Branwen
Podcast Appearances
So along the way, DanNet and AlexNet came out, and when those came out, I was like, wow, this seems like a very impressive success story for the connectionist view.
But is it just an isolated success story?
Or, you know, is this what Kurzweil and Moravec and Shane Legg had been predicting, that we would get GPUs and then better algorithms would just kind of show up?
So I started thinking to myself that, you know, this is something, it's a trend to keep an eye on.
And maybe it's not quite as stupid an idea as I originally thought.
And I just kept reading the deep learning literature, noticing again and again that the dataset sizes just kept getting bigger.
The models seemed to keep getting bigger.
The GPU count slowly crept up from one GPU, you know, the cheapest consumer GPUs, to two, and then eventually they were training on eight.
And you could just see that neural networks kept expanding from these incredibly niche individual use cases, which did next to nothing.
The uses just kept getting broader and broader and broader.
I'd say to myself, wow, is there anything that CNNs can't do?
As I just saw people applying CNNs to something else, you know, every single day on arXiv.
This gradual trickle of drops kind of just kept hitting me in the background as I was going on with my life.
You know, every few days, like another one would drop and I'd go like, huh?
You know, maybe intelligence really is just like a lot of compute applied to a lot of data, applied to a lot of parameters.
Maybe Moravec and Legg and Kurzweil were right.
And I'd just note that and kind of continue on, thinking to myself, like, huh, if that were true, it would have a lot of implications.
So I think there wasn't really like a eureka moment there.
It was just continuously watching this trend that no one else seemed to see, except possibly a handful of people like Ilya Sutskever or Schmidhuber.