Benedict Evans
But there's a reset both of what the product is and how you sell it and your org structure around selling it.
And do you have the right politics and the right org structure to build that, and the right incentives, and can you manage the internal conflicts?
And then the consumer behavior kind of gets reset as well.
So I think it's actually the opposite, which is that everyone's kind of using the same data: you need such an enormous amount of generalized text that the amount Google has, or that Meta has, is not actually enough to be a fundamental difference in what you can train with.
Push back, yeah.
So it depends.
So the models that we're training now, we're training on text.
So that's not really being trained on YouTube.
We saw this lawsuit around book copyright with Meta, where they downloaded a torrent of pirated books.
Because guess what?
They don't have enough text and it's not the right kind of text.
They don't have lots of prose; they've got lots of short snippets of text.
So I think the generality of LLMs is you just need such an enormous amount of data that everyone kind of needs all the text that there is.
And all the text that there is is kind of equally available to anyone.
Yes, because you need so much more.
And it's also not necessarily the kind of data that you have.
So obviously Google has, you know, an enormous repository of scraped data because they read the web all the time.
But anyone else with a billion dollars can go and do that.