Andy Halliday
That would need to have, you know, no latency, zero latency.
You have to be right on top of the conversation.
Yeah, I think what Greg brings into the conversation here is the idea of really low-latency, vastly accelerated inference being on board an embodied AI in a robot.
So you want that to happen.
You don't want the robot to be there frozen for a while while it's trying to think about how you're talking.
Let me tie this back to the question that we raised.
I think, Beth, you raised it earlier.
Does it really matter to us that these frontier model developments are getting to the point where they're maxing out all the benchmarks we can throw at them?
And I think it does because when you imbue a deep neural network with that level of reasoning capability, you can then distill it.
So the people who are investing hugely in the training runs that develop these reasoning capabilities are ultimately conferring to us much smaller, lighter-weight, faster, low-latency models that we can use inexpensively.
So one of the things, by the way, that Google 3.0 DeepThink does is reduce the computation cost per problem by 80% compared to its prior efforts.
So it's not only more capable, it's more efficient also.
Energy-wise.
And now you distill that model: take DeepThink and create a distillation of it into a much smaller model.
And it retains a good percentage of the capabilities of the larger model, but with even better efficiency and lower latency because it's so much smaller.
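The distillation Andy is describing is, at its core, training a small student model to match a large teacher's output distribution. A minimal sketch of the standard (Hinton-style) distillation loss, in pure Python; the logit values and temperature here are hypothetical illustrations, not anything from the episode:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 softens the distribution, exposing the teacher's
    # relative preferences among non-top classes ("dark knowledge").
    z = [v / temperature for v in logits]
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 as in the standard distillation formulation.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))
    return temperature ** 2 * kl
```

The student is trained to minimize this loss (usually blended with the ordinary hard-label loss), which is why it can retain much of the teacher's capability at a fraction of the size and latency.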
And that, I think, is the path that we're on now, leading up to a time when there will be edge devices and embodied AIs working in and around us with that level of reasoning capability: not superhuman yet, but near-human, and in real time.
You can shame the models into better performances.