Andrej Karpathy

Speaker
3419 total appearances

Podcast Appearances

Lex Fridman Podcast
#333 – Andrej Karpathy: Tesla AI, Self-Driving, Optimus, Aliens, and AGI

And so I think when you get to the details of it, I think it's a very expressive function.

So it can express lots of different types of algorithms in forward pass.

Not only that, but the way it's designed with the residual connections, layer normalizations, the softmax attention and everything, it's also optimizable.
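The softmax attention mentioned here can be sketched in a few lines of plain Python. This is the standard scaled dot-product formulation, not code from the episode; the function names `softmax` and `attention` and the list-of-lists layout are assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # The output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs
```

Every step is built from sums, products, and exponentials, so the whole function is differentiable, which is part of what makes it amenable to gradient-based training.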

This is a really big deal because there's lots of computers that are powerful that you can't optimize or that are not easy to optimize using the techniques that we have, which is backpropagation and gradient descent.

These are first-order methods, very simple optimizers, really.
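To make "first-order" concrete, here is a minimal sketch of gradient descent in Python. It uses only the gradient of the objective (no second derivatives). The toy objective, the learning rate, and the name `gradient_descent` are illustrative assumptions, not anything taken from the episode.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Minimize a function using only its gradient (first-order information)."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)  # take a small step against the gradient
    return x

# Toy objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

In a neural network, backpropagation is what supplies `grad`; the update rule itself stays this simple.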

And so...

You also need it to be optimizable.

And then lastly, you want it to run efficiently in our hardware.

Our hardware is a massive throughput machine like GPUs.

They prefer lots of parallelism.

So you don't want to do lots of sequential operations.

You want to do a lot of operations in parallel.

And the Transformer is designed with that in mind as well.

And so it's designed for our hardware and it's designed to both be very expressive in a forward pass, but also very optimizable in the backward pass.

Right.

Think of it as, so basically a transformer is a series of blocks, right?

And these blocks have attention and a little multi-layer perceptron.

And so you go off into a block and you come back to this residual pathway.

And then you go off and you come back.

And then you have a number of layers arranged sequentially.
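The "go off and come back" structure described above can be sketched as follows. This is a simplified illustration in plain Python: `attn` and `mlp` are hypothetical stand-ins for the real sublayers (any function mapping a vector to a vector of the same length), and layer normalization is left out to keep the residual pathway visible.

```python
def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [ai + bi for ai, bi in zip(a, b)]

def block(x, attn, mlp):
    """One block: each sublayer's output is added back onto the residual pathway."""
    x = add(x, attn(x))  # go off into attention, come back
    x = add(x, mlp(x))   # go off into the MLP, come back
    return x

def transformer(x, blocks):
    """Apply a series of blocks sequentially; `blocks` is a list of (attn, mlp) pairs."""
    for attn, mlp in blocks:
        x = block(x, attn, mlp)
    return x
```

One consequence of this design is that if every sublayer outputs zeros, the stack is the identity function, so the input (and, in the backward pass, the gradient) has a direct path through all the layers.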