Nilay Patel
My experience with every LLM is they're pretty bad at math.
Are you using LLMs when you say AI here or are you using a different kind of AI?
But when you're calculating model drift, that's an LLM doing it?
Or what kind of technology is doing it?
It just reads it.
We wrote an entire story about how ChatGPT can't tell time.
Yeah.
Sometimes one number is bigger than the other.
It's actually quite difficult for these models.
Or, like, increments are actually quite difficult for these models.
You think that's trustworthy?
I'm asking you very directly because...
The problems of hallucination here compound, right?
They get exponentially worse as you add more and more AI tools to the system.
The problems of reflecting biases in the data get exponentially worse as you add scale, as we've talked about.
How are you making sure the AI systems aren't either hallucinating or reflecting an underlying bias that you can't see?
Wait, yeah, that was my next question.
Have you run into this situation yet where the data scientists have said, we can't use this tool yet?
That makes sense to me.
What's been the biggest gap between a capability you want an AI system to have and the one that you tested?