
China Tech Talk

Episode 5: AI infrastructure

29 Jul 2025

Description

This week, I invited my friend Simon on the show to talk about AI infrastructure. This is a very interesting topic to think about. We talked about the reasons for Nvidia's market dominance and the problems Intel and AMD have faced in developing their own position in the market. We also talked about the number of chips involved and the challenges in building large data centers.

Here is a chart of Hopper deliveries for 2024. Keep in mind that Google also has its own TPUs for inference.

Here is a chart of Chinese hyperscalers' purchases of H20 in 2024. But keep in mind that they also rent compute from public/state-built data centers that use smuggled-in H100/H200 as well as Ascend chips.

Here is a chart of H100/H800 deliveries in 2023. So if we just consider Hopper deliveries from 2023 and 2024, the American hyperscalers have quite the compute advantage over Chinese hyperscalers, although the gap is much smaller if we factor in all the smuggled-in chips. As I've discussed before, China had them everywhere, to the point where in many cases they were sitting idle.

I also asked Simon about the number of chips needed for inference: how was Tencent able to serve 8 million concurrent DeepSeek R1 requests with much less compute than Google has? He sent me this chart, where reasoning results improve roughly logarithmically with compute. If you look at the green line, going from 4 generations to 16 generations improved accuracy by about 15%, and going from 16 generations to 64 generations improved it by another 5%. If we look at Google search, it appears to be giving progressively better AI results on top; that is likely from generating more tokens with their reasoning models. So you can choose to serve 8 million prompts at the same time with vastly less compute, but you will also get inferior results, although you will reach diminishing returns fairly soon either way.

Here is a chart of rental costs for various Nvidia chips in China. Despite the increased demand for inference post-DeepSeek, rental costs for all Nvidia chips continue to drop, so they are likely not facing a compute crunch yet.

This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit tphuang.substack.com
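To make the diminishing-returns point concrete, here is a minimal sketch in Python. It is not from the episode: it assumes a simplified best-of-N setup in which each generation is an independent attempt that is correct with probability p and an oracle verifier picks a correct answer whenever one exists, so accuracy(N) = 1 - (1 - p)^N. The success rate p below is purely hypothetical.

```python
# Minimal sketch (not from the episode) of why more generations per prompt
# give diminishing returns. Assumption: each generation is an independent
# attempt that is correct with probability p, and an oracle verifier picks
# a correct answer whenever one exists, so accuracy(N) = 1 - (1 - p)^N.

def best_of_n_accuracy(p: float, n: int) -> float:
    """Probability that at least one of n independent samples is correct."""
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    p = 0.30  # hypothetical per-generation success rate, not a real benchmark number
    prev = None
    for n in (1, 4, 16, 64):
        acc = best_of_n_accuracy(p, n)
        gain = "" if prev is None else f"  (+{acc - prev:.3f} vs. previous)"
        print(f"N={n:>2}: accuracy ≈ {acc:.3f}{gain}")
        prev = acc
    # Each 4x increase in generations buys a smaller accuracy gain, which is
    # the same diminishing-returns pattern the green line in the chart shows.
```

Real reasoning models rely on majority voting or learned verifiers rather than an oracle, so the actual curve flattens at a lower accuracy, but the qualitative trade-off is the same: quadrupling the generations per prompt costs roughly four times the inference compute while buying a progressively smaller accuracy gain.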
