Reiner Pope

👤 Speaker

1157 total appearances

Podcast Appearances

Dwarkesh Podcast

Reiner Pope – The math behind how LLMs are trained and served

That gives you exactly this number.

6062.013

Or you could have, like, fewer kv-heads but more layers.

6063.716
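
To make the kv-heads versus layers trade-off concrete, here is a minimal sketch that assumes the usual dense-attention accounting, where each layer stores one key vector and one value vector per KV head for every token. The function name and both configurations are hypothetical, chosen only for illustration; they are not numbers from the episode.

```python
def kv_bytes_per_token(num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """KV cache footprint of one token under dense attention (assumed model)."""
    # Factor of 2: one key vector and one value vector per KV head per layer.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value

# Hypothetical configurations: halving the KV heads while doubling the layers
# leaves the per-token KV cache size unchanged.
wide = kv_bytes_per_token(num_layers=32, num_kv_heads=8, head_dim=128)
deep = kv_bytes_per_token(num_layers=64, num_kv_heads=4, head_dim=128)
print(wide, deep)
```

Under this accounting only the product of layers, KV heads, and head dimension matters, which is why different shapes can land on the same number.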

So this is one way to get there via dense attention.

6068.888

There's also a way to get there via sparse attention where you increase all of these numbers, but then you have like a line of a sparsity term.

6071.311

So yeah, I mean, I think this number is plausible if maybe a little bit small.

6081.125

I mean, you are incentivized to price close to your costs because otherwise someone could script you.

6087.895

Yeah.

6097.288

I don't remember.

6107.923

What I've seen in the past is like three or five times more expensive.

6109.185

That makes more sense.

6111.711

We can think of decode as being a pass with one token, and pre-fill as being a pass with many.

6139.07

I think maybe sort of let's draw actually how pre-fill shows up here.

6156.588

If I may clarify, so we do a bit of decode like this.

6160.612

We may actually come back and do more pre-fill.

6167.178

Like if you think this is a chat session, the user says something, the AI generates response, and then the user says something else when we pre-fill this.

6169.781

So like maybe this is the more common, like this is the general case rather than this.

6177.188

Read a file or just like the AI is responding to a user input or a tool call or anything that's not AI generated.

6183.554

Yeah, exactly.

6189.201

Yeah, there's actually no adjustment at all to the memory time.

6230.881

So, yeah, there is the time for one pass, but actually the amount of tokens is that much larger.

6258.331
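
The point about memory time staying fixed while the token count grows can be sketched with a simplified roofline-style model. This is an illustration under stated assumptions, not the speaker's exact derivation: it counts only weight reads and matmul FLOPs (roughly 2 FLOPs per parameter per token), ignores KV cache traffic and attention cost, and uses hypothetical model and hardware numbers.

```python
def pass_times(tokens_in_pass, param_count=70e9, bytes_per_param=2,
               flops_per_sec=1e15, mem_bytes_per_sec=3e12):
    """Return (compute_time, memory_time) in seconds for one forward pass.

    All default values are hypothetical placeholders, not measured figures.
    """
    # Compute scales with the number of tokens processed in the pass.
    compute_time = 2 * param_count * tokens_in_pass / flops_per_sec
    # Weights are read once per pass, regardless of how many tokens it holds.
    memory_time = param_count * bytes_per_param / mem_bytes_per_sec
    return compute_time, memory_time

# Decode: a pass with one token. Pre-fill: a pass with many tokens.
decode_compute, decode_memory = pass_times(tokens_in_pass=1)
prefill_compute, prefill_memory = pass_times(tokens_in_pass=1024)

# A pass takes as long as the slower of the two resources.
decode_pass = max(decode_compute, decode_memory)
prefill_pass = max(prefill_compute, prefill_memory)
prefill_per_token = prefill_pass / 1024
print(decode_pass, prefill_pass, prefill_per_token)
```

In this model the memory term is identical for both passes, so the pre-fill pass spreads its cost over many more tokens and comes out cheaper per token than a single-token decode pass.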
