Reiner Pope

Speaker
1157 total appearances

Podcast Appearances

Dwarkesh Podcast
Reiner Pope – The math behind how LLMs are trained and served

And the one you call out is a pretty strong difference.

The way it shows up, what makes neural nets... If you just randomly initialize a neural network, actually, maybe it's a reasonable cryptographic cipher as well, because the random initialization is going to jumble stuff in a complicated way.

It may even do what you want.

Who knows?

The thing that makes it interpretable is the gradient descent.

So you can differentiate a neural network and get a meaningful derivative.

And we do a lot of work to not overcomplicate the derivative, so the residual connection keeps it contained and simple, and likewise the layer norm stuff that we do.
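
One way to picture the point about residual connections is a toy scalar network. The sketch below is an illustration only, not something from the conversation: the layer `tanh(w * x)`, the depth of 30, and the weight 0.5 are all assumed values. It accumulates the derivative with the chain rule and compares a plain stack of layers with a residual stack.

```python
import math


def layer(x, w):
    """Toy scalar layer f(x) = tanh(w * x), returned with its derivative f'(x)."""
    y = math.tanh(w * x)
    return y, w * (1.0 - y * y)


def forward_with_grad(x, depth, w, residual):
    """Run `depth` toy layers and return (output, d output / d input).

    The derivative is accumulated with the chain rule: each layer multiplies
    it by f'(x) in a plain stack, or by 1 + f'(x) in a residual stack.
    """
    grad = 1.0
    for _ in range(depth):
        y, dy = layer(x, w)
        if residual:
            x, grad = x + y, grad * (1.0 + dy)
        else:
            x, grad = y, grad * dy
    return x, grad


# Assumed toy settings: input 0.3, 30 layers, shared weight 0.5.
_, grad_plain = forward_with_grad(0.3, depth=30, w=0.5, residual=False)
_, grad_res = forward_with_grad(0.3, depth=30, w=0.5, residual=True)

print(f"plain stack derivative:    {grad_plain:.3e}")
print(f"residual stack derivative: {grad_res:.3e}")
```

In this toy setting the plain stack's derivative shrinks toward zero, because every factor is at most 0.5, while every factor in the residual stack is at least 1, so the identity path keeps a usable signal. Real networks are vector-valued and also rely on normalization, which this sketch leaves out.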

One of the biggest attacks against cryptographic ciphers is also to differentiate the cipher.

Ciphers run in a different number field.

They run in the field of two elements, so just binary, whereas neural nets run in theory in the field of real numbers.

And so you have to differentiate with respect to binary numbers.

But you can absolutely differentiate a cipher, and this is called differential cryptanalysis.
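
To make "differentiating with respect to binary numbers" concrete: over the field of two elements the difference of two values is their XOR, so the analogue of a derivative of a function S in direction a is S(x) XOR S(x XOR a). The sketch below tabulates that quantity for a toy 4-bit substitution box. The S-box values and the helper name `difference_table` are assumptions chosen for illustration; a real attack chains such differences through many rounds of a full cipher.

```python
# Toy 4-bit S-box (a permutation of 0..15), chosen for illustration only.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def difference_table(sbox):
    """Count how often input difference a leads to output difference b.

    table[a][b] is the number of inputs x with
    sbox[x] XOR sbox[x XOR a] == b.
    """
    n = len(sbox)
    table = [[0] * n for _ in range(n)]
    for a in range(n):
        for x in range(n):
            b = sbox[x] ^ sbox[x ^ a]
            table[a][b] += 1
    return table


ddt = difference_table(SBOX)

# Skip the trivial a = 0 row. The largest remaining count is the most
# predictable input-to-output difference through this one S-box.
worst = max(max(row) for row in ddt[1:])
print("largest count for a nonzero input difference:", worst, "out of", len(SBOX))
```

The flatter this table is, the less an attacker learns from watching how input differences propagate, which is the design goal for the cipher.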

And basically what it says is that if you take a small difference of the input, it's quite difficult to make the difference of the output be small.

The whole job of a well-designed cipher is to make the difference of the output very large.
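
A rough way to see a large output difference in practice is to flip one input bit and count how many output bits change. The sketch below uses SHA-256 from Python's standard library as a stand-in, which is an assumption made for convenience: it is a hash function rather than a cipher, but it is designed to show the same behaviour. The message and the helper names are made up for the example.

```python
import hashlib


def flip_bit(data: bytes, i: int) -> bytes:
    """Return a copy of `data` with bit number i flipped."""
    out = bytearray(data)
    out[i // 8] ^= 1 << (i % 8)
    return bytes(out)


def hamming(a: bytes, b: bytes) -> int:
    """Number of bit positions in which a and b differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def scramble(data: bytes) -> bytes:
    """Stand-in for a well-designed cipher: SHA-256 of the input."""
    return hashlib.sha256(data).digest()


message = b"sixteen byte msg"  # 16 bytes = 128 input bits
base = scramble(message)

# Flip each input bit in turn and measure how far the output moves.
distances = [hamming(base, scramble(flip_bit(message, i)))
             for i in range(len(message) * 8)]
average = sum(distances) / len(distances)

print(f"average output bits changed: {average:.1f} of {len(base) * 8}")
```

An average near half of the 256 output bits means a one-bit input difference tells you essentially nothing about the output difference.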

So I guess the distinction is that the optimization goals at that point are about complexifying.

They don't have the same residual connections or layer norms.

Yeah.

So, yeah, I mean, in fact, this is actually a place where you get exactly the sort of avalanche property that ciphers have as well.

Adversarial attacks on, typically, image classification models are: can I find a very, very small perturbation of the image that totally changes the classification, totally changes the output?
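
A minimal sketch of such an attack, in the style of the fast gradient sign method, on a made-up linear classifier. Everything here is assumed for illustration: the 784-pixel "image", the alternating weights, the bias, and the step size `eps` are not from the conversation, and a real attack would get the gradient from a trained network by backpropagation.

```python
N_PIXELS = 784  # hypothetical flattened 28x28 image

# Hypothetical linear classifier: score = w . x + bias, class 1 if score > 0.
weights = [1.0 if i % 2 == 0 else -1.0 for i in range(N_PIXELS)]
bias = 1.0


def score(x):
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict(x):
    return 1 if score(x) > 0 else 0


def sign(v):
    return (v > 0) - (v < 0)


image = [0.5] * N_PIXELS  # uniform grey image, classified as class 1
eps = 0.01                # maximum change allowed per pixel

# For a linear model the gradient of the score with respect to each pixel
# is that pixel's weight, so step every pixel slightly against its sign.
adversarial = [xi - eps * sign(w) for xi, w in zip(image, weights)]

print("original class:   ", predict(image))
print("adversarial class:", predict(adversarial))
print("largest pixel change:", max(abs(a - b) for a, b in zip(adversarial, image)))
```

Each pixel moves by only 0.01, but the shifts add up across all 784 pixels, which is why a perturbation too small to notice can still move the score across the decision boundary.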