Dwarkesh Patel

Speaker
15267 total appearances

Podcast Appearances

Dwarkesh Podcast
David Reich – Why the Bronze Age was an inflection point in human evolution

I think last time we were discussing on the order of 10,000 people.

So basically, everybody in the world, or almost everybody in the world, or the variance we see between different humans today, was latent in this group, which sort of seems... And I get your point that, well, if you just stack up different things across the genome, then stacking them up really has a big effect.

But it's interesting that we have so many different groups in the world today, and all that diversity comes from a very small population size.

Crusoe has an amazing ML infra team that keeps finding clever ways to squeeze more performance out of their hardware.

For example, tokenization has become a real bottleneck for agentic workloads.

Agentic prompts are often extremely long.

They tend to have high KV cache hit rates, which shrinks the GPU's prefill work.

This means that the tokenization step, which is traditionally sequential, is a much larger fraction of time-to-first-token.
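To make the arithmetic behind that point concrete, here is a rough sketch. It assumes, as a simplification, that prefill time shrinks in proportion to the KV cache hit rate while tokenization time stays fixed. The function name and the millisecond figures are hypothetical and are not measurements from the episode.

```python
def tokenization_share(tokenize_ms, prefill_ms, cache_hit_rate):
    """Fraction of time-to-first-token spent tokenizing.

    Simplifying assumption: only the uncached part of the prompt
    costs prefill time, and that cost scales linearly.
    """
    prefill_after_cache = prefill_ms * (1.0 - cache_hit_rate)
    return tokenize_ms / (tokenize_ms + prefill_after_cache)


# Hypothetical numbers: 20 ms to tokenize, 400 ms to prefill with a cold cache.
cold = tokenization_share(20.0, 400.0, cache_hit_rate=0.0)
warm = tokenization_share(20.0, 400.0, cache_hit_rate=0.9)
print(f"cold cache: {cold:.0%} of TTFT, 90% cache hits: {warm:.0%} of TTFT")
```

Under these made-up numbers the tokenizer goes from roughly a twentieth of the wait to roughly a third, without the tokenizer itself getting any slower.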

To solve this, Crusoe built FastTokens, an open-source Rust-based tokenizer which parallelizes things in order to take advantage of all the cores on modern CPUs.

Crusoe had to get creative here because the naive approach doesn't work.

For example, for pre-tokenization, you can't just split your text into chunks and run regex, because you'd end up with issues whenever a word straddled the split.
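A small sketch of that failure mode follows. The regex is a toy pre-tokenizer pattern chosen for illustration, not the pattern any real tokenizer uses, and `naive_chunked` is a hypothetical helper name.

```python
import re

# Toy pre-tokenizer (an assumption): runs of word characters,
# runs of punctuation, or runs of whitespace.
PATTERN = re.compile(r"\w+|[^\w\s]+|\s+")


def pretokenize(text):
    """Sequential reference: one regex scan over the whole text."""
    return [m.group() for m in PATTERN.finditer(text)]


def naive_chunked(text, chunk_size):
    """Cut the text at fixed offsets and scan each chunk in isolation."""
    pieces = []
    for i in range(0, len(text), chunk_size):
        pieces.extend(pretokenize(text[i:i + chunk_size]))
    return pieces


text = "agentic prompts"
print(pretokenize(text))       # ['agentic', ' ', 'prompts']
print(naive_chunked(text, 4))  # ['agen', 'tic', ' ', 'prom', 'pts']
```

Both words get cut in two because each chunk is scanned with no view of its neighbors.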

Crusoe solved this by giving each thread an authority zone, plus the ability to read one kilobyte past its own edges.

This one-kilobyte buffer guarantees that you won't misprocess a token, and the authority zone guarantees that you won't end up with duplicates.

No cross-thread coordination required.
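The idea as described can be sketched roughly like this. It is an illustration under stated assumptions, not Crusoe's actual code: the regex is a toy pattern, the margin is counted in characters rather than bytes, no single piece is assumed to be longer than the margin, and all function names are hypothetical. Python threads also won't give a real speedup here because of the GIL; the point is only that each worker's result depends on nothing but the shared input text.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Toy pre-tokenizer (an assumption): runs of word characters,
# runs of punctuation, or runs of whitespace.
PATTERN = re.compile(r"\w+|[^\w\s]+|\s+")
MARGIN = 1024  # how far a worker may read past each edge of its zone


def pretokenize(text):
    """Sequential reference: one regex scan over the whole text."""
    return [m.group() for m in PATTERN.finditer(text)]


def pretokenize_zone(text, start, end, margin=MARGIN):
    """Emit only the pieces that START inside the authority zone [start, end)."""
    lo = max(0, start - margin)         # read back to resync on a real boundary
    hi = min(len(text), end + margin)   # read ahead to finish a straddling piece
    pieces = []
    for m in PATTERN.finditer(text[lo:hi]):
        pos = lo + m.start()
        if pos >= end:
            break                       # the next zone owns this piece
        if pos >= start:
            pieces.append(m.group())    # ours: it starts in our zone
    return pieces


def pretokenize_parallel(text, workers=4, margin=MARGIN):
    """Split the text into equal zones and scan each one independently."""
    size = -(-len(text) // workers) or 1  # ceiling division, at least 1
    zones = [(s, min(s + size, len(text))) for s in range(0, len(text), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda z: pretokenize_zone(text, z[0], z[1], margin), zones)
    return [piece for part in parts for piece in part]


text = "hello world, this is a straddling-word example " * 50
assert pretokenize_parallel(text, workers=7, margin=16) == pretokenize(text)
```

Because a piece belongs to exactly one zone (the one containing its first character), concatenating the per-zone outputs in order reproduces the sequential result with no duplicates and no merging step.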

Crusoe combined this optimization with a handful of other smart tweaks in order to get up to 40% faster time-to-first-token on real production workloads.