Martin Casado
...and that therefore, it's very important that you choose a physical region in the world where all your data is going to live, in a specific region of a specific cloud, and then you build all of your data-consuming services in that same location.
Or you can partition it.
Well, this is a version.
The term data gravity does get used to mean multiple things.
This is one particular incarnation of the idea of data gravity.
And this is the one that I am saying is fake, that egress charges are so important.
And if you want to see evidence against this, come look at the networking dashboard of Fivetran's various AWS and GCP accounts.
You will be astonished, despite replicating huge data sets for thousands of companies.
We have 7,000 customers of size and thousands more little ones.
The amount of data being moved at any given time is tiny.
And the reason is that we're doing change data capture.
You can have a huge data set, but if you just replicate the changes, the changes are always much smaller than people think.
And I think that a lot of this idea of data gravity came from dumb data pipelines that people wrote where they would copy their entire company's data sets out of their database every day, once a day at midnight.
And so they just had this crazy read amplification.
You know, they were just repeatedly copying the same data over and over, and it gave them the impression that they had so much data, but they really don't.
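To put rough numbers on the read-amplification point, here is a minimal Python sketch comparing a nightly full copy with replicating only the changes. The data set size, the daily change volume, and the function names are all hypothetical, chosen only to illustrate the argument; they are not figures from the conversation.

```python
# Illustrative arithmetic only: every number below is an assumption.

def full_copy_bytes(total_bytes: int, days: int) -> int:
    """Bytes moved when the whole data set is re-copied once per day."""
    return total_bytes * days


def change_only_bytes(total_bytes: int, daily_change_bytes: int, days: int) -> int:
    """Bytes moved with one initial full sync, then only each day's changes."""
    return total_bytes + daily_change_bytes * (days - 1)


TOTAL = 1_000_000_000_000      # 1 TB data set (assumed)
DAILY_CHANGE = 10_000_000_000  # 10 GB of changed rows per day, i.e. 1% (assumed)
DAYS = 30

full = full_copy_bytes(TOTAL, DAYS)
incremental = change_only_bytes(TOTAL, DAILY_CHANGE, DAYS)

print(f"nightly full copy:  {full / 1e12:.2f} TB")
print(f"changes only:       {incremental / 1e12:.2f} TB")
print(f"read amplification: {full / incremental:.1f}x")
```

Under these assumed numbers the nightly copy moves 30 TB in a month, while the change-only approach moves 1.29 TB, most of which is the one-time initial sync.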
What happens when people roll their own data pipelines is that they fall back on these patterns that are easy to get right.