Jyunmi
Podcast Appearances
I don't think the benchmarks are terribly useful for people who just want to see what they can do with it.
I think as long as it's going up and to the right on a chart, that's what we want to see.
The closer the percentages get to 100 is kind of the direction we want to go in.
And knowing enough about which models are really good at any particular feature or subset, and weaving those together, might be the end result.
All until we can run our own version on our own hardware for our own purposes.
And then just send out queries whenever we absolutely need to.
Is there anything else that we want to cover specifically about Claude?
Do you have anything more I should ask Andy?
I do have one anecdotal thing about Gemini 3, which has personally been the most impressive: the token count for analyzing one of our episodes has decreased by over 50%.
So we would consistently go over the 1 million token mark with each one of our episodes, but now we're hitting about 450,000 to 500,000 tokens, which is fantastic when you need to analyze the entire episode, build out a transcript, all of that fun stuff.
So that's just anecdotal evidence of how good Gemini 3 can be for our own personal use cases.
Okay, so before we move on to our next segment, if you want these stories in your feed more often, tap follow and subscribe so you do not have to chase the algorithm.
Okay, AI and science is up next.
All right, so most of us never think about magnets.