Olivia Moore
Podcast Appearances
Like, we were using models from a year and a half ago to call drivers and ask if they're going to make it on time.
You don't need PhD level intelligence for that.
I guess the point is, as we make models faster, we realize how important the conversation handling and the flow of the conversation are.
If you think about it, the faster the models get, the more you're going to interrupt.
And the harder it's going to be to have a normal conversation.
And actually, if you think about it, the bigger problem in the coming years for voice AI is really knowing when to talk and when not to talk.
And sometimes you need to speak fast.
Sometimes you need to wait because the other person is not done talking.
Sometimes you might need to stop and think.
And that's something the models are not very good at today: really stopping, knowing when a question is hard, and knowing when they probably need to trigger a reasoning thread that is more async and just think about it.
They would say something and really be thinking, not as something you put in the prompt because it's cool, but literally have them think, no?
So it's all about understanding the conversation.
When is it my time to talk and what should I say, no?
So we invest a lot in end-of-turn detection, interruption handling, filler detection, and background noise handling.
Like, if my mom is speaking in the back of the car, the bot doesn't need to know or interrupt.
So it's about understanding all these nuances in the work more than making the latency lower, which, of course, we can improve, or making the voices more realistic, which, again, I don't think is a limiting factor today.
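
The turn-taking decision described above (wait, stop and think, or speak) can be illustrated with a minimal sketch. This is not the speaker's system. Every name here (`TurnState`, `decide`, `FILLERS`, the 700 ms threshold, and the `question_is_hard` flag) is a hypothetical stand-in, and a real product would likely rely on learned detectors rather than fixed rules.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical filler list; assumed for illustration only.
FILLERS = {"um", "uh", "hmm", "er"}


@dataclass
class TurnState:
    last_words: List[str]         # most recent words from the primary speaker
    silence_ms: int               # time since the primary speaker last spoke
    primary_speaker_active: bool  # False if only background voices are heard
    question_is_hard: bool        # assumed output of an upstream difficulty check


def decide(state: TurnState, end_of_turn_ms: int = 700) -> str:
    """Return 'wait', 'think', or 'speak' for the current moment."""
    # Never talk over the person we are actually speaking with.
    if state.primary_speaker_active:
        return "wait"
    # A trailing filler suggests the speaker is not done yet.
    if state.last_words and state.last_words[-1].lower().strip(",.") in FILLERS:
        return "wait"
    # Too little silence to treat the turn as finished.
    if state.silence_ms < end_of_turn_ms:
        return "wait"
    # The turn is over; hard questions get a slower reasoning pass first.
    if state.question_is_hard:
        return "think"
    return "speak"
```

In this sketch, background speech (the mom in the back of the car) simply never sets `primary_speaker_active`, so it neither blocks nor triggers a response. The point of the ordering is that the checks for "is it my turn" come before any decision about what to say.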