Christian Hubicki
Podcast Appearances
it forgot in one of these very simple tasks, how quickly it broke down, and that it specifically seemed to just drive a wedge between its ability to reason, quote unquote, because it's not really reasoning.
And I think that's something that you kind of understand when you understand the underlying mechanism.
I mean,
And it's so funny, Steve, you talk about the attention in this way.
The attention head, the multi-attention head, is one of the many key things that allowed this stuff to work in the first place with ChatGPT.
The original paper is called "Attention Is All You Need."
And it's rather ironic that in this case, it really drives at what you psychologically know as attention and shows it's different.
And that just shows the limitations of analogies to these mathematical structures.
Yeah, yeah, Steve, just real quick before you move on here, I'm of two minds about this, because my first reaction is exactly the same as yours.
I love this, and it highlights how different it is from a person.
And my other mind on this is that it's almost like a "no duh" if you know anything about the mathematical underpinnings of these models.
It's just like, yeah, duh, actually.
Which most people don't.
Exactly.
So it's good for the general public, but it should not be terribly shocking to machine learning researchers that this is the case.
So it's almost like, yes, it's a surprise it doesn't think like us, because almost only by analogy is it like us in any way.
So I'm of two minds about that, but I am glad it is something you can clearly point to and say, look, this is not only not like us, it's so much worse than us.
It really is.
And I think my general take on LLMs, these large language models, is that it's incredible what they got this one structure, this one complex structure, to do.