Andrej Karpathy

👤 Speaker
3433 total appearances

Podcast Appearances

Lex Fridman Podcast
#333 – Andrej Karpathy: Tesla AI, Self-Driving, Optimus, Aliens, and AGI

Say you have a 100-layer-deep transformer; typically they would be much shorter, say 20.

So you have 20 lines of code and you can do something in them.

And so during the optimization, basically what it looks like is first you optimize the first line of code, and then the second line of code can kick in, and the third line of code can kick in.

And I feel like because of the residual pathway and the dynamics of the optimization, you can learn a very short algorithm that gets the approximate answer, but then the other layers can kick in and start to create a contribution.

And at the end of it, you're optimizing over an algorithm that is 20 lines of code.

Except these lines of code are very complex, because each one is an entire block of a transformer.

You can do a lot in there.
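
The "lines of code" picture above can be sketched in a few lines of Python. This is only an illustrative toy, not anything from the episode: `make_block`, `run_residual_stack`, and the tanh-of-a-random-linear-map "block" are hypothetical stand-ins for a real transformer block (attention plus MLP). The point it shows is that each block adds its output to a shared residual stream, so a block that contributes nothing leaves the stream untouched and the remaining blocks still form a shorter working program.

```python
import math
import random


def make_block(dim, scale, rng):
    """Build a toy block: a random linear map followed by tanh.

    Assumption: this stands in for a full transformer block, which it is not.
    """
    w = [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]

    def block(x):
        return [math.tanh(sum(w_ij * x_j for w_ij, x_j in zip(row, x))) for row in w]

    return block


def silent_block(x):
    """A block that has not 'kicked in' yet: it contributes nothing."""
    return [0.0] * len(x)


def run_residual_stack(x, blocks):
    """Apply x <- x + block(x) for each block in order.

    Returns the residual stream after every step, starting with the input.
    """
    history = [list(x)]
    for block in blocks:
        update = block(x)
        x = [xi + ui for xi, ui in zip(x, update)]
        history.append(list(x))
    return history


rng = random.Random(0)
dim, n_layers = 4, 20
x0 = [1.0, -0.5, 0.25, 2.0]

# Twenty active blocks: a 20-line "program" over the residual stream.
active = [make_block(dim, 0.1, rng) for _ in range(n_layers)]
history = run_residual_stack(x0, active)

# Only the first block active, the rest silent: the stream after block 1
# passes through the other 19 blocks unchanged.
mostly_silent = [active[0]] + [silent_block] * (n_layers - 1)
short_history = run_residual_stack(x0, mostly_silent)
```

Here `short_history[1]` and `short_history[-1]` are identical, which is the sense in which a very short algorithm can give an approximate answer first while the other layers are free to start contributing later.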

What's really interesting is that this transformer architecture actually has been remarkably resilient.

Basically, the transformer that came out in 2016 [sic, 2017] is the transformer you would use today, except you reshuffle some of the layer norms.

The layer normalizations have been reshuffled to a pre-norm formulation.
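
For readers who have not seen the two arrangements side by side, here is a minimal sketch of what reshuffling the normalization means. It assumes the usual definitions: post-norm computes LayerNorm(x + Sublayer(x)), as in the original transformer paper, and pre-norm computes x + Sublayer(LayerNorm(x)). The function names are hypothetical, and the layer norm here omits the learnable gain and bias for brevity.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned gain/bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def post_norm_step(x, sublayer):
    """Post-norm: normalize after the residual addition."""
    summed = [a + b for a, b in zip(x, sublayer(x))]
    return layer_norm(summed)


def pre_norm_step(x, sublayer):
    """Pre-norm: normalize only the sublayer input; the residual path is untouched."""
    update = sublayer(layer_norm(x))
    return [a + b for a, b in zip(x, update)]


def zero_sublayer(v):
    """A sublayer that contributes nothing, to expose what happens to the skip path."""
    return [0.0] * len(v)


x = [1.0, 2.0, 4.0]
pre_out = pre_norm_step(x, zero_sublayer)    # equals x: the skip path is a clean identity
post_out = post_norm_step(x, zero_sublayer)  # x has been renormalized on the skip path
```

With a sublayer that outputs zeros, the pre-norm step returns its input unchanged, while the post-norm step still rescales it. That difference is one way to see why the pre-norm form keeps the residual pathway clean.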

And so it's been remarkably stable, but there's a lot of bells and whistles that people have attached to it and tried to improve it.

I do think that basically it's a big step in simultaneously optimizing for lots of properties of a desirable neural network architecture.

And I think people have been trying to change it, but it's proven remarkably resilient.

But I do think that there should be even better architectures potentially.

Currently, it definitely looks like the transformer is taking over AI, and you can feed basically arbitrary problems into it.

And it's a general, differentiable computer, and it's extremely powerful.

And this convergence in AI has been really interesting to watch for me personally.

Definitely the zeitgeist today is just pushing.

Basically, right now, the zeitgeist is do not touch the transformer.

Touch everything else.