Grant Harvey
Speaker
Voice Profile Active
This person's voice can be automatically recognized across podcast episodes using AI voice matching.
Podcast Appearances
Should we be thinking about them the same way?
How do we compare them to a traditional language model?
It's kind of funny.
Basically, what you're talking about is instead of scaling the amount of GPUs widely, it's like, let's scale the amount of tokens that we're actually dealing with at a certain time.
And then you can do less on a GPU.
That's awesome.
So why wouldn't all the big labs like immediately switch to this?
Like if it's that much more efficient?
Because I guess they have to train it.
They don't have the same skills that you do.
What's your thought?
That's cool.
Yeah, I was wondering, like, I've always wanted to ask a researcher this.
Have you seen anything over the past year or like six months even that has lit you up in terms of like, ooh, this could be a good alternative for that memory context problem?
Or are you like, we still haven't seen anything that's even close?
Although I will say with that equation, being able to work in parallel, I think makes you maybe perhaps make those trade-offs or you can make some of those calculations more efficiently.
Would you agree with that?
If that makes sense?
How does this work in an agentic context, then?
How would it be different, let's say, in something like Claude Code, where it's going out, it's doing a bunch of tool calls, it's reasoning?