Andrej Karpathy

👤 Speaker
3433 total appearances

Podcast Appearances

Lex Fridman Podcast
#333 – Andrej Karpathy: Tesla AI, Self-Driving, Optimus, Aliens, and AGI

You also need it to be optimisable.

And then lastly, you want it to run efficiently on our hardware.

Our hardware is a massive-throughput machine, like a GPU.

They prefer lots of parallelism.

So you don't want to do lots of sequential operations.

You want to do a lot of operations in parallel.

And the Transformer is designed with that in mind as well.

And so it's designed for our hardware and it's designed to both be very expressive in a forward pass, but also very optimisable in the backward pass.

Right.

Think of it as, so basically a transformer is a series of blocks, right?

And these blocks have attention and a little multi-layer perceptron.

And so you go off into a block and you come back to this residual pathway.

And then you go off and you come back.

And then you have a number of layers arranged sequentially.
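
The block structure described above can be sketched as a plain Python function. This is only a rough illustration under stated assumptions: `transformer_forward`, `toy_attention`, and `toy_mlp` are hypothetical names, and the two toy sub-layers are simple stand-ins rather than real attention or a trained multi-layer perceptron. What the sketch shows is the control flow: each sub-layer reads the residual pathway and adds its output back onto it.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
SubLayer = Callable[[Vector], Vector]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise addition: the only operation on the residual pathway."""
    return [x + y for x, y in zip(a, b)]


def transformer_forward(
    x: Vector, blocks: Sequence[Tuple[SubLayer, SubLayer]]
) -> Vector:
    """Run the blocks in order, adding each sub-layer's output back onto x."""
    for attention, mlp in blocks:
        x = add(x, attention(x))  # go off into attention, come back
        x = add(x, mlp(x))        # go off into the MLP, come back
    return x


# Hypothetical stand-ins, NOT real attention or a trained MLP.
def toy_attention(x: Vector) -> Vector:
    mean = sum(x) / len(x)
    return [mean - v for v in x]


def toy_mlp(x: Vector) -> Vector:
    return [0.5 * max(0.0, v) for v in x]


blocks = [(toy_attention, toy_mlp)] * 2  # two blocks arranged sequentially
out = transformer_forward([1.0, -2.0, 3.0], blocks)
print(out)
```

If every sub-layer returns zeros, `transformer_forward` hands back its input unchanged, which is the sense in which the residual pathway runs straight through the stack.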

And so the way to look at it, I think, is because of the residual pathway in the backward pass, the gradients sort of flow along it uninterrupted because addition distributes the gradient equally to all of its branches.

So the gradient from the supervision at the top just flows directly to the first layer.

And all the residual connections are arranged so that in the beginning during initialization, they contribute nothing to the residual pathway.
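
The gradient argument can be checked on a toy scalar model. The sketch below is an illustration, not the actual architecture: each layer is assumed to be `x + s * tanh(x)`, where the scale `s` is a hypothetical knob standing in for how much a block contributes, and `s = 0.0` models the "contributes nothing at initialization" case. The function names are made up for this example. A stack without the skip path is included for contrast.

```python
import math
from typing import List


def residual_input_grad(x: float, scales: List[float]) -> float:
    """d(output)/d(input) for a stack of layers x <- x + s * tanh(x)."""
    grad = 1.0
    for s in scales:
        # Addition passes the upstream gradient to both branches unchanged,
        # so the local derivative is 1 (skip path) + s * tanh'(x) (branch).
        grad *= 1.0 + s * (1.0 - math.tanh(x) ** 2)
        x = x + s * math.tanh(x)
    return grad


def plain_input_grad(x: float, scales: List[float]) -> float:
    """Same stack without the skip path: x <- s * tanh(x)."""
    grad = 1.0
    for s in scales:
        grad *= s * (1.0 - math.tanh(x) ** 2)
        x = s * math.tanh(x)
    return grad


depth = 12
at_init = [0.0] * depth  # assumption: every branch is scaled to zero at init

print(residual_input_grad(0.7, at_init))     # 1.0: gradient reaches the input intact
print(plain_input_grad(0.7, [0.5] * depth))  # a tiny number: shrinks at every layer
```

With the skip path, the first layer sees the full gradient no matter how deep the stack is. Without it, each layer multiplies the gradient by a factor of at most 0.5 in this toy setup.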

Mm-hmm.

So what it kind of looks like is, imagine the transformer is kind of like a Python function, like a def.

And you get to do various kinds of lines of code.