Odd Lots

Why Cerebras CEO Andrew Feldman Built The World's Largest Computer Chip

21 May 2026

Transcription

Chapter 1: What makes Cerebras chips different from traditional chips?

0.031 - 16.099 Andrew Feldman

Odd Lots is brought to you by VanEck. For years, investors basically forgot about real assets, energy, gold, and infrastructure. But look what's driving markets now. Central banks loading up on gold, massive CapEx cycles, currencies doing weird things. These assets are at the center of it.

16.079 - 33.087 Andrew Feldman

RAAX, the VanEck Real Asset ETF, is an actively managed one-stop shop for real assets spanning gold, commodities, natural resource equities, and more. Go to VanEck.com slash R-A-A-X pod to learn more. Fun disclosures later in this episode.

33.107 - 47.374 Joe Wiesenthal

Hello, I'm Stephen Carroll. I'm in Brussels, where many of Europe's biggest decisions get made. And I'm Caroline Hepke in London. We're the hosts of the Bloomberg Daybreak Europe podcast. We're up early every weekday, keeping an eye on what's happening across Europe and around the world.

47.794 - 59.549 Joe Wiesenthal

We do it early so the news is fresh, not recycled, and so you know what actually matters as the day gets going. From Brussels, I'm following the politics, policy and the people shaping the European Union right now.

60.11 - 65.036 Unknown

And from London, I'm looking at what all that means for markets, money and the wider economy.

65.557 - 70.183 Joe Wiesenthal

We've got reporters across Europe and around the globe feeding in as stories break.

70.231 - 74.957 Unknown

So whether it's geopolitics, energy, tech or markets, you're hearing it while it happens.

75.397 - 77.419 Joe Wiesenthal

It's smart, calm and to the point.

77.86 - 79.001 Unknown

And it fits into your morning.

Chapter 2: How does the size of Cerebras chips enhance AI performance?

137.647 - 159.823 Andrew Feldman

And many of them are sort of like swallowing up the other thoughts that I have in my head, whether it's questions about which model is best and why and what are the economics of inference and how much training is pre-training versus post-training for each model. Like, it's just sort of like this blob that's growing that's taking up more and more of my thoughts.

159.888 - 177.906 Andrew Feldman

What is your definition of AI psychosis? Because one would argue that maybe thinking about AI literally all the time would be a form of psychosis. Well, let's just say like, I'm not the type who thinks that like, I don't like think that the AI is a friend for one thing. I'm not in love with the AI models.

178.286 - 194.333 Andrew Feldman

I don't think that in collaboration with ChatGPT that I'm stumbling on a unified theory of physics and things like that. So, like... But you do spend a lot of time inputting instructions, pressing the button and seeing what comes out. And seeing what comes out.

194.453 - 207.303 Andrew Feldman

I'm just saying I think I'm aware that I'm talking to a machine and that we're not establishing any great breakthroughs of which we are collaborators and partners and friends. Recognizing you have a problem is the first step towards healing, Joe.

207.744 - 228.64 Andrew Feldman

Seriously, though, there's a good reason to think about AI more and more, which is that a huge chunk of not just the market, but the real economy is now revolving around AI, right? Totally. So anyway, again, within the AI conversation, there are a lot of subcategories. One of the subcategories happens to be another Odd Lots favorite topic, which is chips.

228.62 - 249.786 Andrew Feldman

Of course, chips are used in multiple different ways. Chips are used in different parts of the AI supply chain, different types of chips have different roles. And so we have to learn more. We have to learn more. And I have to say, I'm particularly interested in the company we're about to speak to, partly because the two things I know about them are, number one, they just had a huge IPO, right?

249.906 - 275.314 Andrew Feldman

Raising something like $5.5 billion at a kind of insane multiple. I can't even do a price-to-earnings multiple because... They're not profitable yet, but I think just on a sales basis, it was like 67 times forward earnings, which is pretty juicy, pretty hot. And the second thing I know about the company is they make giant wafers, which is just a fun image to have in your head. That's right.

275.414 - 290.725 Andrew Feldman

So if you were thinking it's like, OK, there is a hot entrant in this space. What is their differentiator? Well, one fact about them is their chips are just enormous, about the size of a dinner plate. One might think you're reading an Onion article, but in fact, it's real.

Chapter 3: What challenges did Cerebras face in developing their technology?

290.865 - 308.498 Andrew Feldman

And apparently it actually has some real technical advantages. So. And it's different to what everyone else is doing. So everyone else is, I guess, doing this sort of like modular networking thing where you get together a bunch of chips and you connect them together. And that's how you get more compute, more memory, more power, basically.

308.959 - 326.384 Andrew Feldman

But this company has done something different in the form of the giant wafers. The giant wafer. And if you figure that to get maximum performance, you sort of want to lessen the distance between things, then put it all on one wafer. Anyway, we're going to learn a lot more. I'm very excited to say about giant wafers and more.

326.404 - 347.175 Andrew Feldman

I'm very excited to say we do have the founder and CEO of Cerebras on the podcast, Andrew Feldman, truly the perfect guest. So, Andrew, thank you so much for coming on the podcast on the week of your IPO. Well, thank you so much for having me. What a pleasure. Absolutely. Why don't you just start us off? The big giant chip, they're apparently real. They're as big as a dinner plate.

347.516 - 377.342 Andrew Feldman

What is the technical reason why this actually makes sense as a superior form of architecture for at least some aspect of AI? I think larger chips process more information in less time. Okay. And that produces faster results. And everybody had gone to bigger chips. NVIDIA had moved from 400 square millimeters to 800 square millimeters over the course of five or six years for this exact reason.

377.643 - 395.301 Andrew Feldman

And in the compute industry, wafer scale, which is building a chip this big. By the way, for those who are just listening, Andrew is now holding up the chip. And yes, it actually looks bigger than a dinner plate, to be honest. But that is a big chip. That's a big chip. That's a big chip.

Chapter 4: How does Cerebras' partnership with TSMC influence their production?

395.361 - 420.051 Andrew Feldman

It's beautiful. It's 58 times larger than any other chip that had ever been. Wow. And what it did... was it allowed us to use a different type of memory. Okay. A type of memory that, at the beginning there are two types of memory. There's memory that can store a lot, but it's really slow. Okay. And there's memory that can't store very much per square millimeter, but it's blisteringly fast. Okay.

420.392 - 442.268 Andrew Feldman

And historically, all graphics processing units use this memory that could store a lot, but was really slow. And that's the reason they do inference so slowly. So if you're using Claude right now or you're using anything but ChatGPT, what you'll frequently feel is you'll enter your prompt and you'll wait for an answer, right?

443.109 - 468.788 Andrew Feldman

And that's because the memory is slow and they have to move a ton of information from memory to compute. Now, by going to wafer scale, we could use this fast memory. Now, we couldn't make that memory store more information per square millimeter, but we could add square millimeters. And so by building this big chip, we were able to stuff it to the gills with this fast memory.
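
[Editor's sketch] The point about moving information from memory to compute can be illustrated with a back-of-the-envelope model. It assumes, as a simplification not stated in the conversation, that producing each token requires reading every model weight from memory once, so per-token time is at least weight bytes divided by memory bandwidth. The model size and both bandwidth figures are hypothetical placeholders, not Cerebras or GPU specifications.

```python
def min_seconds_per_token(weight_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Lower bound on per-token latency if every weight is read once per token."""
    return weight_bytes / bandwidth_bytes_per_s


# Hypothetical numbers, chosen only to show the shape of the tradeoff.
WEIGHT_BYTES = 140e9        # assumed: 70B parameters at 2 bytes each
SLOW_LARGE_MEMORY = 3e12    # assumed bandwidth of large, slower memory (bytes/s)
FAST_SMALL_MEMORY = 300e12  # assumed aggregate bandwidth of fast memory (bytes/s)

for name, bandwidth in [("slow, large memory", SLOW_LARGE_MEMORY),
                        ("fast, small memory", FAST_SMALL_MEMORY)]:
    seconds = min_seconds_per_token(WEIGHT_BYTES, bandwidth)
    print(f"{name}: at most {1 / seconds:,.0f} tokens/s for a single stream")
```

Under this simplified model the only way to raise the ceiling is more bandwidth, which is the argument made above for filling a large chip with the fast memory type.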

468.768 - 493.031 Andrew Feldman

And that's why we're 15 times faster than the fastest GPU. That's why on some problems, we're 50, 100, even 1,000 times faster than graphics processing units. Wait, can you explain how you actually managed to do this? Because I know there have been previous attempts to do wafer scale. And I seem to remember there was even like an early attempt in the 1980s or something to do it.

493.051 - 514.213 Andrew Feldman

How were you able to pull this off? Yeah, it was an ambitious undertaking. That's for sure. Every previous effort in the 75-year history of our industry had failed, including Gene Amdahl, who's sort of on the Mount Rushmore of compute in our industry. He failed sort of spectacularly in the mid-'80s at a company called Trilogy.

514.993 - 529.191 Andrew Feldman

Not only that, but after we succeeded, people who had visited us, who'd been in our labs, tried to copy us, and they also failed. And so what we were able to do is solve a set of really fundamental problems.

Chapter 5: What is the significance of the IPO for Cerebras?

529.251 - 554.796 Andrew Feldman

And those problems cut across a wide swath of technology. They cut across lithography. So we had to collaborate closely with TSMC and they turned out to be a great partner. We had to make inventions in material and packaging. That's how you put a processor, how you put a piece of silicon on a motherboard, deliver power and IO to it. We had to make inventions in power delivery.

555.538 - 577.431 Andrew Feldman

When you build a giant chip, you're going to deliver way more power to it than if you do a chip the size of a postage stamp. We had to invent ways to cool it. We had to write new types of software that ran on it. All of these had never been done before. And it was a decade-long process. It took us five years and about $500 million to deliver the first one.

578.253 - 603.366 Andrew Feldman

And it's been an extraordinary run since. In December, we signed a deal with OpenAI north of $20 billion, one of the largest contracts ever signed in Silicon Valley. And then in March, we signed a deal with AWS, where they would deploy our systems in their data centers, in their AWS data centers. And so it's just been an extraordinary run, but it took a long time.

603.386 - 622.459 Andrew Feldman

It took extraordinary engineering. And there were certainly long periods of time when it wasn't clear we were going to make this work. Obviously, you've hit this remarkable milestone. You have, in fact, IPO'd and so forth. And right now, the market's valuing your company at $64 billion in the early days of the IPO.

622.479 - 644.704 Andrew Feldman

Just for the listener to understand, the chips, are they solely for inference as opposed to training? When we think about AI, we think about, OK, there's training, training the model, and then answer giving. That's the inference. Are the chips just for inference? So a couple of things. I think you framed it exactly right. Training is how we make AI. And inference is how we use AI.

645.705 - 654.719 Andrew Feldman

And so what happened was that in sort of 2025, in the first part of 2025, the models we made were smart enough to be useful.

Chapter 6: How do Cerebras chips handle inference compared to training?

655.58 - 682.898 Andrew Feldman

And there was an explosion of use. And we use AI by doing inference. So there was this sort of tidal wave of demand on inference. And that has continued in 2026, and we think it will continue for years and years to come. And so that's what had happened. In 2015, when we began thinking about the company, we knew that AI was on the horizon and it would eat a huge amount of compute.

682.878 - 707.451 Andrew Feldman

Right. And we made sort of two fundamental bets. We bet that it would need dedicated silicon. Graphics had needed dedicated silicon. That's how you got the graphics processing unit. Mobile compute had needed dedicated compute. That's where you got ARM processors. We made that bet. And we made a bet that modifying the GPU architecture wouldn't be right.

707.492 - 721.103 Andrew Feldman

You needed to start with a clean sheet of paper. And so what we started with was a new vision. And that vision could do training. And it could do inference. And it was orders of magnitude faster at both.

722.23 - 739.757 Andrew Feldman

But right now, what we're seeing is such an explosion in demand for inference that a lot of the business this minute is inference, even though we're the same amount faster than GPUs on training. That's interesting. Maybe we'll get more to the theoretical training market a little later.

740.138 - 763.51 Andrew Feldman

Just real quick on inference, Ben Thompson, who writes a newsletter about tech, he wrote a piece in which he distinguishes between answer inference and agentic inference. So answer inference is like, you know, format my resume or whatever, or write me an essay on X or Y or answer some questions. And then agentic inference is like, okay, here's this thing that's going to go around

763.49 - 778.404 Andrew Feldman

and do services for you, not producing visual answers. Do you distinguish between those two? Is that a real divide in your view? And can your chips do both? Our chips can do both. I think it is a divide. Okay.

Chapter 7: What impact do giant wafers have on AI economics?

778.424 - 802.39 Andrew Feldman

I think speed matters equally in both. Okay. I think if you are engaged with the AI, if you're writing code, which is agentic, if you're writing code or you're doing work, nobody wants to wait. I mean, we could just turn the question around and say, well, how big is the market for slow search? Zero. How big is the market for dial-up internet? Zero. Why is that?

802.651 - 829.741 Andrew Feldman

Because nobody wants to wait, right? So if you're engaged with the AI, speed is of the essence. But if the AI is doing agentic work and your competitor gets three times, five times, 10 times as much work done in 20 minutes as you do, you're going to get smoked. And so this notion somehow that Ben proposed that speed isn't very important in agentic flows is dead wrong.

830.622 - 859.762 Andrew Feldman

That speed is important in all aspects of productive work and that your ability to get more done in less time is a fundamental advantage that accrues over time, right? If while your competitor is doing one unit of work, you can do three. And in the next time they do one unit of work, you do six. This adds up over time and you beat them in any line of work.

860.323 - 871.585 Andrew Feldman

And so speed, which is sort of our specialty, is important across the board. What do giant wafers and speed in general actually mean for, I guess, the economics of tokens?

871.665 - 894.574 Andrew Feldman

Because one way I think about it, I have this sort of vision in my head like, okay, if I'm out shopping for toothpaste, I know I need toothpaste every once in a while and I go into like a CVS, a store, I get one thing of toothpaste and then maybe a week later I get some more toothpaste. Or I could go to Costco and buy a giant thing of toothpaste and take it home probably at a cheaper cost.

894.594 - 923.657 Andrew Feldman

And that's sort of how I think of the giant wafers. Maybe it's a bad analogy. But what does speed actually mean for the cost of tokens? Well, I think there are a couple observations. I think people have chosen so far to price speed a little higher. For example, Anthropic offered a premium service in which they offered tokens twice as fast and charged six times as much. And they sold it out.

924.378 - 936.158 Andrew Feldman

And they couldn't meet the demand. Now, just to give you an idea, we're 15 times faster, and they're twice as fast. And so people value speed because it allows them to do more work.

Chapter 8: How does Cerebras view the competition between closed and open-source AI models?

937.049 - 964.457 Andrew Feldman

And they value their time. And when you can do more work in less time, you are making people more productive. That's why people have chosen to price them at a premium. They don't cost more to make. In fact, the GPU architecture is an extremely good architecture and extremely efficient at building very slow tokens. And if you don't mind slow, the cost per token on a GPU is extremely low.

965.247 - 991.643 Andrew Feldman

But the GPU has a characteristic that as you try and go faster, the cost and the power used per token increase. Sort of like as you go faster in your car, your miles per gallon decrease, right? So what happens is as you try and get fast enough to be useful, fast enough to be interesting, fast enough to keep users' intelligence focused,

991.775 - 1014.503 Andrew Feldman

on this product, they become extremely expensive and extremely power hungry. And so the question is not just what people are paying for a token, what people are choosing to price them at, but what they actually cost to make. And GPUs make very slow tokens very cheaply, and they're unbelievably expensive at fast tokens.
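
[Editor's sketch] One common way to picture the claim that cost per token rises with speed is a toy batching model. It is an assumption for illustration, not something described in the conversation: each decoding step pays a fixed time to read the weights plus a small per-user time, emits one token per user, and hardware cost is proportional to the time the device is busy. All constants are hypothetical.

```python
def step_seconds(batch: int, weight_read_s: float = 0.04, per_user_s: float = 0.001) -> float:
    """Time for one decoding step serving `batch` users (toy model, assumed constants)."""
    return weight_read_s + batch * per_user_s


def per_user_tokens_per_s(batch: int) -> float:
    """Speed each user sees: one token per step."""
    return 1.0 / step_seconds(batch)


def relative_cost_per_token(batch: int) -> float:
    """Device-seconds spent per token produced, used as a stand-in for cost."""
    return step_seconds(batch) / batch


for batch in (1, 8, 64):
    print(f"batch={batch:>2}  speed per user={per_user_tokens_per_s(batch):6.1f} tok/s  "
          f"relative cost per token={relative_cost_per_token(batch):.5f}")
```

In this toy model, shrinking the batch makes each user's tokens arrive faster but spreads the fixed weight-read time over fewer tokens, so cost per token climbs, which is the same direction as the miles-per-gallon analogy above.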

1014.663 - 1046.68 Andrew Feldman

We make fast tokens vastly less expensive than GPUs, and we use a tiny fraction of the power. Data centers need electricity, AI needs copper, reshoring needs steel, and gold's run may tell you something about how the world is repricing money and debt. All of those point back to real assets.

1047.141 - 1066.668 Andrew Feldman

The RAAX ETF is an actively managed, one-stop real asset shop from gold to commodities to natural resource equities, adjusting as conditions change. Visit VanEck.com slash RAAX pod to learn more. An investor should consider the investment objective, risks, charges, and expenses of the fund before investing.

1066.908 - 1080.743 Andrew Feldman

To obtain a prospectus and summary prospectus, which contains this and other information, visit VanEck.com. Please read the prospectus and summary prospectus carefully before investing. RAAX is distributed by VanEck Securities Corporation, Distributor.

1081.128 - 1111 Unknown

on Apple Podcasts, Spotify, or anywhere you listen.

1111.604 - 1132.802 Andrew Feldman

Let's say we stipulate that this is all true and everyone wants the fastest and everyone's like, you know what? This is the solution, the Cerebras technology, one big chip. This is really where it's at. How much of, like, your market share for the inference market when you look out next year, the year after, etc.?

1132.782 - 1158.726 Andrew Feldman

How much is your market share going to be dictated by your ability to get capacity at TSMC fabs? How much is that a gating mechanism for growth? You know, TSMC is a huge part of the supply chain. Yeah. But we have some real advantages. There are three areas right now that are limiting vendors in building AI compute. Number one is HBM memory.
