Eve Bodnia
Podcast Appearances
People tried to apply energy-based techniques and modeling to existing LLMs or image recognition, but actually designing the reasoning part itself is where things are really new, and we just got lucky that we made it.
Well, Sudoku is just one of the things. We just thought: what's the simplest way to illustrate that there are different tasks which are not based on any language, which people know and love and can get immediately, and which can be tested with the LLMs, because the LLMs we compare it to are advertised as LLM reasoning models, so they're meant to be extrapolating knowledge.
It learns from some games and then it's supposed to sort of extrapolate the rules for other games, and here we're not even close to this.
And being able to extrapolate knowledge is one of the most crucial abilities for natural intelligence, right?
So this is what an LLM doesn't have.
So if you take an LLM and you teach it to do some math and win the IMO and all of these fancy Olympiads, we have a natural assumption: oh, this model is so smart.
Let me give it some code, or let me give it some other problems in math, and it's going to solve it.
And the reality is, it's not, right?
It's not.
It's just really good at the one thing it was trained for.
But if you take a child and you force them to learn some mathematics, they're probably going to be good at mathematical modeling.
They can try theoretical physics, or they can go even further.
They can even study the law, right?
Which is like logic in language, which I can't do personally, but I have a lot of friends from the physics department who went to law school at Harvard.
That's funny.
Yeah, because, naturally, people are good at extrapolating knowledge across different domains, and that's where the creativity comes from, right?
You sometimes get an idea from some other area which you never dreamed of, and all of a sudden, it works.
Yeah, LLMs, unfortunately, can't do it at this stage, and I don't think they will ever be able to do this.