Dwarkesh Patel

Dwarkesh Podcast

Reiner Pope – The math behind how LLMs are trained and served

When was GPT-4 released again?

2695.019

It was 2022 or 2023?

2696.461

2023.

2697.563

And it was rumored to be over 1 trillion parameters.

2699.426

And it seems like only now, within the last six months, have models been released that have significantly more parameters than a model released three years ago.

2702.872

Yeah.

2710.585

When supposedly there should have been this scaling in the meantime.

2711.286

Is the reason that we were just waiting for racks

2715.974

with enough memory to hold a five-trillion-parameter model along with its KV cache for enough users, for a lot of sequences? Or for RL, if you're doing RL, it's a similar consideration: actually holding the KV cache for the whole batch of problems you're trying to solve. So if you look at Hopper, you had eight Hoppers, and I think the

2719.64

That's 640 gigabytes as of 2022.
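
The 640-gigabyte figure can be checked with some rough arithmetic. The sketch below assumes 80 GB of HBM per Hopper GPU (which is what eight GPUs totalling 640 GB implies) and, purely for illustration, 1 or 2 bytes per parameter; the function names are hypothetical, and KV cache, activations, and other overheads are ignored, so real requirements would be higher.

```python
import math

GB = 1e9  # decimal gigabytes, matching how HBM capacity is usually quoted


def node_memory_gb(num_gpus: int, hbm_per_gpu_gb: float) -> float:
    """Total HBM across one node, in GB."""
    return num_gpus * hbm_per_gpu_gb


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone (no KV cache), in GB."""
    return num_params * bytes_per_param / GB


def nodes_for_weights(num_params: float, bytes_per_param: float, node_gb: float) -> int:
    """Minimum number of nodes whose combined HBM holds just the weights."""
    return math.ceil(weight_memory_gb(num_params, bytes_per_param) / node_gb)


# Assumption: 8 GPUs with 80 GB each, as implied by the 640 GB figure.
hopper_node_gb = node_memory_gb(8, 80)

# Assumption: 1 byte per parameter (8-bit weights) or 2 bytes (16-bit weights).
one_trillion_8bit_gb = weight_memory_gb(1e12, 1)
five_trillion_8bit_gb = weight_memory_gb(5e12, 1)

print(f"One 8-GPU Hopper node: {hopper_node_gb:.0f} GB")
print(f"1T params at 1 byte each: {one_trillion_8bit_gb:.0f} GB")
print(f"5T params at 1 byte each: {five_trillion_8bit_gb:.0f} GB")
print("Nodes for 5T weights at 1 byte:", nodes_for_weights(5e12, 1, hopper_node_gb))
print("Nodes for 1T weights at 2 bytes:", nodes_for_weights(1e12, 2, hopper_node_gb))
```

Under these assumptions, even the weights of a one-trillion-parameter model do not fit in a single eight-GPU Hopper node, and that is before reserving any room for the KV cache of the sequences being served.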

2744.598

Yeah.

2747.706

With Blackwell finally, which was deployed, what, 2020?