Max Tegmark
Podcast Appearances
It's called mechanistic interpretability in geek speak.
You have this machine that does something smart.
You try to reverse engineer it and see how it does it.
I think of it also as artificial neuroscience.
That's exactly what neuroscientists do with actual brains.
But here you have the advantage that you don't have to worry about measurement errors.
You can see what every neuron is doing all the time.
And a thing we see again and again, and there have been a number of beautiful papers quite recently by a lot of researchers, some of them here, even in this area, is that when they figure out how something is done,
you can say, oh man, that's such a dumb way of doing it.
And you immediately see how it can be improved.
Like, for example, there was a beautiful paper recently where they figured out how a large language model stores certain facts, like that the Eiffel Tower is in Paris.
And they figured out exactly how it's stored.
And the proof that they understood it was they could edit it.
They changed some of the synapses in it.
And then they asked it, where is the Eiffel Tower?
And it said, it's in Rome.
And then they asked it, how do you get there?
Oh, how do you get there from Germany?
Oh, you take this train to Roma Termini train station and this and that.
And what might you see if you're in front of it?