Damien Tanner
Most of the LLMs we use right now, the ones in coding agents, are optimized for intelligence, not really speed.
And when the LLM labs do optimize for speed, they tend to optimize for raw token throughput.
Very few people optimize for time to first token.
And that's all that matters in voice: I give you the user utterance,
how long is the user gonna have to wait before I can start playing back an agent response to them?
That's time to first token:
how long before I get the first word or two that I can turn into voice, so they can start hearing something?
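The latency the speaker is describing can be sketched in a few lines. This is a minimal illustration, not any particular vendor's API: `fake_model` is a stand-in for a streaming LLM endpoint, and `stream_with_ttft` just times how long the first `next()` call blocks.

```python
import time

def stream_with_ttft(token_iter):
    """Pull the first token, recording time-to-first-token (TTFT),
    then yield that token followed by the rest of the stream."""
    start = time.monotonic()
    first = next(token_iter)          # blocks until the model emits anything
    ttft = time.monotonic() - start   # this is the delay the user feels

    def tokens():
        yield first
        yield from token_iter

    return ttft, tokens()

# Hypothetical model: waits 50 ms before its first token, then streams fast.
def fake_model():
    time.sleep(0.05)
    for tok in ["Hello", " there", "!"]:
        yield tok

ttft, tokens = stream_with_ttft(fake_model())
# ttft is ~0.05 s here; total generation time after that barely matters for voice.
```

The point of the sketch: a model with huge throughput but a slow first token still feels sluggish in voice, because playback can't start until that first chunk arrives.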
The only major LLM lab that actually optimizes for this, that maintains a low TTFT, is Google with Gemini Flash.
Most voice agents doing it this way are using either GPT-4o or Gemini Flash,
but the OpenAI endpoints have some annoying inconsistencies in latency.
And that's kind of the killer in voice, right?
It's a bad user experience when the first few turns of the conversation are fast and then suddenly the next turn the agent takes three seconds to respond, and you're like...
Is the agent wrong?
Is the agent broken?
But once you get that first token back, you're good, because then you can start streaming text to us, and we can start turning it into full sentences.
And then again, we get to this batching problem.
The voice models that do text-to-voice, again, they don't stream in the input.
They require a full sentence of input before they can start generating any output.
Because again, how you speak, how things are pronounced depends on what comes later.
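The sentence-batching step described above can be sketched as a small buffer that accumulates streamed tokens and releases text only at sentence boundaries. This is an illustrative sketch assuming plain-text tokens and simple punctuation-based boundaries; real systems use more robust segmentation.

```python
import re

# A sentence ends at . ! or ? followed by whitespace (a deliberately
# simple rule for illustration; abbreviations etc. would break it).
SENTENCE_END = re.compile(r'([.!?])\s')

def sentences_from_tokens(token_iter):
    """Accumulate streamed LLM tokens and yield only complete sentences,
    since the TTS model needs a full sentence before it can start
    generating audio (pronunciation depends on what comes later)."""
    buf = ""
    for tok in token_iter:
        buf += tok
        while True:
            m = SENTENCE_END.search(buf)
            if not m:
                break
            yield buf[:m.end(1)]      # emit up to and including the punctuation
            buf = buf[m.end():]       # keep the remainder for the next sentence
    if buf.strip():
        yield buf.strip()             # flush whatever remains at end of stream

# Tokens as an LLM might stream them:
toks = ["Hi", " there", ". How", " are", " you?", " Good."]
result = list(sentences_from_tokens(toks))
# result == ["Hi there.", "How are you?", "Good."]
```

Each yielded sentence is what would be handed to the voice model, so audio for "Hi there." can start playing while the LLM is still generating the rest of the turn.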