Logan Kilpatrick
And I imagine there are cases where it works really well, and there are probably different food genres or categories, different examples, where it does not work well.
Um, yeah.
So this is where making it easy for people to build evals and benchmarks is really important.
I'd love to see your personal eval of different food understanding use cases.
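As a rough illustration of what a "personal eval" for food understanding could look like, here is a minimal Python sketch. The CASES data, the scoring rule, and the describe_meal helper are all hypothetical placeholders rather than a real benchmark.

```python
# Toy sketch of a personal eval for food-photo understanding.
# `describe_meal` is a hypothetical callable that sends a photo to a
# vision-capable model and returns its text description.

CASES = [
    # (image path, food items we expect the model to mention) -- illustrative only
    ("photos/breakfast.jpg", ["eggs", "toast", "coffee"]),
    ("photos/ramen.jpg", ["noodles", "broth", "egg"]),
    ("photos/salad.jpg", ["lettuce", "tomato", "avocado"]),
]


def score(description: str, expected: list[str]) -> float:
    """Fraction of expected items the model actually mentioned."""
    text = description.lower()
    hits = sum(item in text for item in expected)
    return hits / len(expected)


def run_eval(describe_meal) -> None:
    """Run every case through the model and print a simple hit rate."""
    for path, expected in CASES:
        result = score(describe_meal(path), expected)
        print(f"{path}: {result:.0%} of expected items found")
```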
But yeah, I've thought about this for a while. If you've ever tried to use one of those apps to track what you're eating, it's a huge pain in the ass. I envy people who go on that journey and that mission, because it's such a pain to have to keep track of everything. Being able to just snap a picture and have the model deeply understand the image, use things like bounding boxes, and get the nuanced details like a proportional understanding of size has just never been possible without language models that have a deep vision capability.
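As a rough sketch of that "snap a picture" flow: the example below assumes the google-genai Python SDK, and the model name and prompt are illustrative. The [ymin, xmin, ymax, xmax] boxes on a 0-1000 scale follow the convention Gemini models are commonly prompted with; treat the specifics as assumptions rather than a product recipe.

```python
# Minimal sketch: ask a vision model to identify food items, portion sizes,
# and bounding boxes in a single photo. Assumes the google-genai SDK;
# model name and prompt are illustrative.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

with open("meal.jpg", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
        "List every food item in this photo with an estimated portion size, "
        "and give each item a bounding box as [ymin, xmin, ymax, xmax] "
        "normalized to a 0-1000 scale.",
    ],
)
print(response.text)
```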
And I think that's something that actually works right now. It's a good example of how the model capability has, in some cases, outpaced our ability to build products around the model, and people's ability to wrap their heads around what the model can actually do.
Like if you were to go and ask, you know, a hundred million random Americans: does a product like that exist? Can AI models do this, et cetera? I would assume the vast majority would say it's not possible. And the reality is it works today. You can go download an app that does that, or even better, build an app yourself that does that.
No, I think it could do that.
I mean, in a cup without the packaging, it wouldn't be able to do that.
In containers, though, the models can read the text extremely well. They have a superhuman ability to read that text.