Reiner Pope

Reiner Pope – The math behind how LLMs are trained and served

And the one you call out is a pretty strong difference.

The way it shows up, what makes neural nets... If you just randomly initialize a neural network, actually, maybe it's a reasonable cryptographic cipher as well, because the random initialization is going to jumble stuff in a complicated way.

Dwarkesh Podcast

It may even do what you want.

Who knows?

The thing that makes it interpretable is the gradient descent.

So you can differentiate a neural network and get a meaningful derivative.
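
As a minimal sketch of what a "meaningful derivative" looks like, the toy network below (one `tanh` unit with two weights, all names and values hypothetical) has a gradient that follows from the chain rule and agrees with a finite-difference check. It is an illustration under those assumptions, not anything taken from the episode.

```python
import math

def net(x, w1, w2):
    """Tiny network: scalar input -> one tanh hidden unit -> scalar output."""
    return w2 * math.tanh(w1 * x)

def grad_w1_analytic(x, w1, w2):
    """Chain rule: d/dw1 [w2 * tanh(w1 * x)] = w2 * (1 - tanh(w1 * x)^2) * x."""
    t = math.tanh(w1 * x)
    return w2 * (1.0 - t * t) * x

def grad_w1_numeric(x, w1, w2, h=1e-6):
    """Central finite difference, used only as an independent check."""
    return (net(x, w1 + h, w2) - net(x, w1 - h, w2)) / (2.0 * h)

# Hypothetical input and weights, chosen only for illustration.
x, w1, w2 = 0.7, 0.3, -1.2
analytic = grad_w1_analytic(x, w1, w2)
numeric = grad_w1_numeric(x, w1, w2)
print(analytic, numeric)
```

The two numbers should agree closely. The gradient points in a direction that predictably changes the output, which is what gradient descent relies on.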

And we do a lot of work to not overcomplicate the derivative: the residual connection keeps it contained and simple, and so does the layer norm stuff that we do.
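
One way to read the residual-connection remark, sketched with a toy scalar model (the layer `a * tanh(x)`, the depth, and the scale `a` are all assumptions for illustration): without a skip path the end-to-end derivative is a product of small slopes and collapses toward zero, while with `x + a * tanh(x)` every factor is `1 + (something small)`, so the product stays in a modest range.

```python
import math

def plain_stack_derivative(x, a, depth):
    """d(output)/d(input) for `depth` layers of x -> a * tanh(x)."""
    deriv = 1.0
    for _ in range(depth):
        t = math.tanh(x)
        deriv *= a * (1.0 - t * t)        # chain rule: multiply local slopes
        x = a * t
    return deriv

def residual_stack_derivative(x, a, depth):
    """Same layers, but each one computes x -> x + a * tanh(x)."""
    deriv = 1.0
    for _ in range(depth):
        t = math.tanh(x)
        deriv *= 1.0 + a * (1.0 - t * t)  # the identity path contributes the 1
        x = x + a * t
    return deriv

plain = plain_stack_derivative(0.5, a=0.1, depth=20)
residual = residual_stack_derivative(0.5, a=0.1, depth=20)
print(plain, residual)
```

In this sketch the plain stack's derivative is vanishingly small, while the residual stack's derivative stays between 1 and `1.1 ** 20`.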

One of the biggest attacks against cryptographic ciphers is also to differentiate the cipher.

Ciphers run in a different number field.

They run in the field of two elements, so just binary, whereas neural nets run in theory in the field of real numbers.

And so you have to differentiate with respect to binary numbers.

But you can absolutely differentiate a cipher, and this is called differential cryptanalysis.
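
Over the field of two elements, a "difference" is an XOR. The basic bookkeeping tool of differential cryptanalysis is a difference distribution table: for each input difference `dx`, count how often each output difference `dy` occurs. The sketch below builds that table for a 4-bit S-box used here purely as an illustrative permutation; it is not a cipher discussed in the episode.

```python
# Hypothetical 4-bit S-box (a permutation of 0..15), for illustration only.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def difference_distribution_table(sbox):
    """ddt[dx][dy] = number of inputs x with sbox[x] ^ sbox[x ^ dx] == dy."""
    n = len(sbox)
    ddt = [[0] * n for _ in range(n)]
    for x in range(n):
        for dx in range(n):
            dy = sbox[x] ^ sbox[x ^ dx]   # XOR plays the role of subtraction
            ddt[dx][dy] += 1
    return ddt

DDT = difference_distribution_table(SBOX)

# Largest count over nonzero input differences: the most predictable
# input-difference -> output-difference pair an attacker could exploit.
worst = max(DDT[dx][dy] for dx in range(1, 16) for dy in range(16))
print(worst)
```

A designer wants `worst` to be small relative to 16, so that no input difference leads to a predictable output difference.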

And basically what it says is that if you take a small difference of the input, it's quite difficult to make the difference of the output be small.

The whole job of a well-designed cipher is to make the difference of the output very large.
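
A quick way to see a large output difference from a one-bit input difference is sketched below. It uses SHA-256 from Python's standard library as a stand-in; SHA-256 is a hash function rather than a cipher, so this only illustrates the same diffusion goal. The message is an arbitrary example.

```python
import hashlib

def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of `data` with exactly one bit flipped."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)

def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of bit positions where two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

message = b"attack at dawn!!"      # arbitrary example input
tweaked = flip_bit(message, 0)     # differs from `message` in a single bit

input_diff = hamming_distance(message, tweaked)
output_diff = hamming_distance(hashlib.sha256(message).digest(),
                               hashlib.sha256(tweaked).digest())
print(input_diff, output_diff)
```

With a well-mixing function, roughly half of the 256 output bits are expected to change even though only one input bit did.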

So I guess the distinction is that the optimization goals at that point are about complexifying.

They don't have the same residual connections or layer norms.

Yeah.

So, yeah, I mean, in fact, this is actually a place where you get exactly the sort of avalanche property that ciphers have as well.

Adversarial attacks, typically on image classification models, ask: can I find a very, very small perturbation of the image that totally changes the classification, totally changes the output?
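
The toy example below sketches that idea on a linear classifier rather than a real image model; the weights, the "image", and the step size `EPS` are all made up for illustration. Each input coordinate moves by only 0.002, in the direction given by the sign of the gradient of the score (which for a linear model is just the weight vector), yet the sign of the score flips.

```python
N = 1000  # number of "pixels" in a hypothetical flattened image

# Hypothetical linear classifier: the predicted class is the sign of score().
w = [1.0 if i % 2 == 0 else -1.0 for i in range(N)]
x = [0.5 + 0.001 * wi for wi in w]    # input that scores slightly positive

def score(weights, inputs):
    return sum(wi * xi for wi, xi in zip(weights, inputs))

def sign(v):
    return (v > 0) - (v < 0)

EPS = 0.002  # per-pixel perturbation budget (an assumption)

# Step every pixel against the gradient of the score with respect to the input.
x_adv = [xi - EPS * sign(wi) for wi, xi in zip(w, x)]

clean_score = score(w, x)
adv_score = score(w, x_adv)
max_change = max(abs(a - b) for a, b in zip(x, x_adv))
print(clean_score, adv_score, max_change)
```

The per-pixel change is tiny, but it adds up across all `N` coordinates, which is why high-dimensional inputs are easy to push across a decision boundary.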
