Jack
In 2024, Anthropic wanted to differentiate their chatbot, make it smarter, make it book smarter.
So they launched Project Panama, a secret goal to buy a million physical books, scan the pages of those books, and feed the digital text into their large language model.
Yes, this sounds like the plot of like a nerdy James Bond, but it is real and here is how it went down.
First, they did a really smart thing.
They hired the guy who like 20 years ago did the Google Books project of scanning millions of books into Google.
Yeah, remember that like free service we had in college?
Well, that guy started working with Anthropic and bought a million used books and destroyed them by slicing off the spines.
He split Tolstoy in two.
Which begs the question, why destroy books?
Are you anti-book, Anthropic?
The reason they bought up the books and destroyed them, Jack, was because it's hard to scan books in a photocopier.
Like, we've all been there.
It's easier to scan a book if it's just a stack of pages that are not bound to each other.
I mean, honestly, we'd love to talk to the guy Anthropic hired to spend six months destroying and copying books.
Yeah, really weird gap year that that guy did destroying those books and scanning them.
Don't tell anyone.
It's all being done in a top-secret warehouse.
Right.
So the reason we know about this is that authors and publishers have sued Anthropic, arguing that feeding these books into the LLM is an unlawful use of their books.
So this court case, it even revealed an actual photo of a giant warehouse with millions of de-spined books.