Damien Tanner

Speaker
791 total appearances

Podcast Appearances

The Changelog: Software Development, Open Source
The era of the Small Giant (Interview)

Yeah, and everything's streaming.

And so it's a very interesting problem to solve, because the whole system has to be real-time.

So the whole thing, we call it a pipeline.

I don't know if that's a great name for it, because it's not like an ETL loading pipeline or something.

But we call it a pipeline.

But the real-time agent system, our back end, when you start a new session, runs on Cloudflare Workers.

So it's running right near the user who clicked to chat with your agent with voice.

And then from that point on, everything is streaming.

So the microphone input from the user's browser is streaming in, and that is then getting streamed to the transcription model in real time.

The transcription model is spitting out partial transcripts.

We send that partial transcript back to you so you can show the user what they're saying, if you want to show them that.
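
The streaming step described above can be sketched roughly as a chain of generators. This is only an illustration: `mic_stream` and `partial_transcripts` are hypothetical names, and each "chunk" is a stand-in word rather than real audio, where a real system would feed audio frames to a streaming transcription model.

```python
from typing import Iterable, Iterator, List


def mic_stream() -> Iterator[str]:
    """Hypothetical microphone source; each item stands in for an audio chunk."""
    for chunk in ["book", "a", "table", "for", "two"]:
        yield chunk


def partial_transcripts(audio_chunks: Iterable[str]) -> Iterator[str]:
    """Yield a growing partial transcript as each chunk arrives.

    Assumption: one chunk maps to one word. A real transcription model
    may also revise earlier words as more audio comes in.
    """
    words: List[str] = []
    for chunk in audio_chunks:
        words.append(chunk)
        # Emit right away so the caller can show the text while the user speaks.
        yield " ".join(words)


partials = list(partial_transcripts(mic_stream()))
for text in partials:
    print(text)
```

Because every stage is a generator, nothing waits for the whole recording: each partial transcript is available as soon as its chunk has been consumed.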

And then the hardest bit in this whole thing is working out when the user is finished speaking.

It's so difficult, because we pause, we make sounds, we pause and then we start again. And conversation is such a dynamic kind of thing. It's like a game almost, right?

So we have to do some clever things, use some other AI models to help us detect when the user has finished speaking.

And when we have enough confidence, like there's no certainty here, but we have enough confidence, we think the user's finished their thought.

Then we finalize that transcript, you know, finish transcribing that last word, and ship you that whole user utterance, whether it's a word, a sentence, or a paragraph the user has spoken.
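
One way to picture the "enough confidence" decision is a score compared against a threshold. The sketch below is an assumption, not the system described in the interview: `TurnSignal`, the equal weighting, and the `0.7` threshold are all made up for illustration, and `end_probability` stands in for whatever an end-of-turn model might output.

```python
from dataclasses import dataclass


@dataclass
class TurnSignal:
    silence_ms: int         # trailing silence since the last detected speech
    end_probability: float  # hypothetical model score in [0, 1]


def end_of_turn_confidence(signal: TurnSignal, max_silence_ms: int = 1500) -> float:
    """Blend silence length and the model score into one confidence value."""
    silence_score = min(signal.silence_ms / max_silence_ms, 1.0)
    # Equal weights are an arbitrary illustrative choice.
    return 0.5 * silence_score + 0.5 * signal.end_probability


def user_finished(signal: TurnSignal, threshold: float = 0.7) -> bool:
    """No certainty here: only a threshold on the blended confidence."""
    return end_of_turn_confidence(signal) >= threshold


mid_thought = TurnSignal(silence_ms=300, end_probability=0.2)
done_talking = TurnSignal(silence_ms=1200, end_probability=0.9)
print(user_finished(mid_thought), user_finished(done_talking))
```

A short pause with a low model score stays below the threshold, while a longer pause combined with a high score crosses it and would trigger finalizing the transcript.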

The reason we can't stream at that point, right, the reason we have to bundle up this user utterance and choose an end, is because LLMs don't take a streaming input.
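
The bundling step can be sketched as a buffer that is flushed only when a turn ends, so the model receives one complete utterance per call. Everything here is hypothetical: `utterances` and `fake_llm` are illustrative names, and the `turn_ended` flag stands in for the end-of-turn decision.

```python
from typing import Iterable, Iterator, List, Tuple


def utterances(events: Iterable[Tuple[str, bool]]) -> Iterator[str]:
    """Turn a stream of (word, turn_ended) events into whole utterances."""
    buffer: List[str] = []
    for word, turn_ended in events:
        buffer.append(word)
        if turn_ended:
            # Only now is the text handed on, as a single complete input.
            yield " ".join(buffer)
            buffer = []


def fake_llm(prompt: str) -> str:
    """Stand-in for a model call that takes the full prompt at once."""
    return f"(reply to: {prompt})"


events = [("what's", False), ("the", False), ("weather", True), ("thanks", True)]
replies = [fake_llm(text) for text in utterances(events)]
print(replies)
```

The input side stays streaming right up to the buffer; only the hand-off to the model is batched per utterance.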