Mark Graham

👤 Person
297 total appearances

Podcast Appearances

Decoder with Nilay Patel
How the Wayback Machine is fighting linkrot

We call it the Wayback Machine as if it's like a computer that's sitting on somebody's desk. It's actually a whole network of literally hundreds of nodes, part of the Internet Archive's overall infrastructure of thousands of nodes: more than 100 petabytes of material, growing at the rate of more than 60 terabytes a day.

Decoder with Nilay Patel
How the Wayback Machine is fighting linkrot

It's a combination of applications that do what's referred to as crawling, which is a process of looking at a URL, looking at a webpage, and then looking at all of the other links, all of the other URLs on that page, and then going to them and then looking at them and then going on and on and on, crawling the web like a spider, metaphorically.
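The crawling loop described here can be sketched in a few lines. This is only a toy illustration, not the Internet Archive's crawler: the `TOY_WEB` dictionary, the `crawl` function, and the `example.org` URLs are all hypothetical, and a real crawler would fetch pages over the network, respect politeness rules, and archive what it finds.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
TOY_WEB = {
    "https://example.org/": ["https://example.org/a", "https://example.org/b"],
    "https://example.org/a": ["https://example.org/b", "https://example.org/c"],
    "https://example.org/b": ["https://example.org/"],
    "https://example.org/c": [],
}


def crawl(seed, get_links, max_pages=100):
    """Breadth-first crawl: look at a URL, collect its links, then visit those."""
    seen = {seed}          # URLs already queued, so no page is visited twice
    queue = deque([seed])  # URLs waiting to be looked at
    visited = []           # order in which pages were actually visited
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Stand-in for fetching a page and extracting its links.
order = crawl("https://example.org/", lambda url: TOY_WEB.get(url, []))
print(order)
```

The `seen` set is what keeps the "on and on and on" from looping forever when pages link back to each other.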

Decoder with Nilay Patel
How the Wayback Machine is fighting linkrot

So it's a combination of this crawling and archiving process, as well as the aggregation of all of those archived resources with indexes that make those discoverable. And then they can be recompiled into web pages. And then patrons, millions of patrons a day, come to our sites and they request resources that we have.

Decoder with Nilay Patel
How the Wayback Machine is fighting linkrot

Maybe it's a digitized version of a book from archive.org, or maybe it's an archived web page from the Wayback Machine. And then we will present that to them in their browser.

Decoder with Nilay Patel
How the Wayback Machine is fighting linkrot

More than that, yeah. Actually, it's something like more than a billion URLs every single day, and that can get pretty quick. It could be like 20,000 URLs a second can be coming into our server. So think of a database that you're writing to 20,000 times a second and you're reading from 5,000 times a second. That's one view into what the Wayback Machine is.
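As a rough sanity check on these figures, the quoted daily volume can be converted to an average per-second rate and compared with the quoted burst rate. The numbers below are the rounded ones from the quote; the variable names are just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

urls_per_day = 1_000_000_000  # "more than a billion URLs every single day"
peak_writes_per_sec = 20_000  # quoted burst write rate
reads_per_sec = 5_000         # quoted read rate

# Average write rate implied by a billion URLs spread evenly over a day.
avg_writes_per_sec = urls_per_day / SECONDS_PER_DAY

# How far the quoted burst rate sits above that average.
peak_to_avg = peak_writes_per_sec / avg_writes_per_sec

print(round(avg_writes_per_sec))  # 11574
print(round(peak_to_avg, 2))      # 1.73
```

So a billion URLs a day averages out to roughly 11,600 per second, and the 20,000-per-second figure is consistent with bursts a little under twice that average.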

Decoder with Nilay Patel
How the Wayback Machine is fighting linkrot

Yes, the heading purchase is always with Seagate and others. We buy a lot of hard drives.

Decoder with Nilay Patel
How the Wayback Machine is fighting linkrot

The primary storage medium is spinning disk. I think today we're using 20 terabyte drives. When we started, they were much smaller, of course. Actually, the very, very, very first version of the Wayback Machine, going back almost like 24, 25 years ago, I think we used a tape machine for a little while. But very quickly, our founder, Brewster Kahle, decided that he really wanted

Decoder with Nilay Patel
How the Wayback Machine is fighting linkrot

the material that we have to be as accessible as possible to people so that when people wanted something that wasn't like, oh, we have to go back to the stacks and then find it and then get it. He wanted things to be as immediately available as possible. So spinning disks has been the primary format.
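To put the storage figures quoted in these excerpts in perspective (more than 100 petabytes, more than 60 terabytes a day, 20 terabyte drives), here is a back-of-the-envelope calculation. It assumes decimal units (1 PB = 1,000 TB) and counts raw capacity only; redundancy, replication, and filesystem overhead are not mentioned in the excerpts and are ignored here, so real drive counts would be higher.

```python
TB_PER_PB = 1000  # assumption: decimal units

archive_pb = 100        # "more than 100 petabytes of material"
growth_tb_per_day = 60  # "more than 60 terabytes a day"
drive_tb = 20           # "I think today we're using 20 terabyte drives"

# Raw drive capacity consumed by one day's growth.
drives_per_day = growth_tb_per_day / drive_tb

# Raw drive count needed to hold the whole archive once.
drives_for_archive = archive_pb * TB_PER_PB / drive_tb

# Yearly growth at the quoted daily rate.
growth_pb_per_year = growth_tb_per_day * 365 / TB_PER_PB

print(drives_per_day)      # 3.0
print(drives_for_archive)  # 5000.0
print(growth_pb_per_year)  # 21.9
```

Even under these simplified assumptions, the archive fills the equivalent of about three new drives every day.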