Martin Kleppmann
Podcast Appearances
Some people focus on the lower-level things and treat the higher-level aspects as their customers.
Interesting.
And I would say the underlying philosophy of the entire book is to give people insight into the essence of how the systems work internally.
So that if, for example, a system starts showing weird performance behavior, you have a bit of intuition for why it's doing that and how you might solve it.
So, for example, the storage engine chapter tells you how B-trees work and how log-structured storage engines such as LSM-trees work.
And the book is not intended for people who are going to actually build their own databases and implement their own storage engines.
If you want to do that, you have to go into much greater depth than this book covers.
But the idea is that as an app developer, if you know just a little bit about how the storage engine works internally, you'll be in a much better place to use it in a way that gives you good performance, for example, and to diagnose any issues.
We've kept that philosophy in the context of cloud services too, where, yes, a cloud service hides some of the operational details that app developers don't need to think about anymore, but they should still know a bit about how those services work internally just so that they can use them effectively.
Exactly.
And, you know, there are huge differences, say, if you're doing analytics, depending on whether you're using row-oriented storage or column-oriented storage.
That's a bit of a technical distinction, and it takes a little bit of background reading to even understand what that means, but it has a massive performance implication in terms of the final behavior of the system.
And so those are the places where I feel like knowing a bit about the internals is actually like a superpower.
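To make the row-oriented versus column-oriented distinction concrete, here is a minimal Python sketch. It is an illustration only, not how any particular database lays out data: the records, the field names (`order_id`, `region`, `amount`), and both helper functions are hypothetical.

```python
# Hypothetical sales records; field names and values are illustrative only.
rows = [
    {"order_id": 1, "region": "EU", "amount": 20.0},
    {"order_id": 2, "region": "US", "amount": 35.5},
    {"order_id": 3, "region": "EU", "amount": 12.5},
]

# Column-oriented layout of the same data: one list per field.
columns = {key: [row[key] for row in rows] for key in rows[0]}


def total_amount_row_oriented(records):
    # Every whole record is visited, even though only "amount" is needed.
    return sum(record["amount"] for record in records)


def total_amount_column_oriented(cols):
    # Only the "amount" column is touched; the other fields are never read.
    return sum(cols["amount"])


print(total_amount_row_oriented(rows))
print(total_amount_column_oriented(columns))
```

Both functions return the same total. The difference is how much data has to be read to get it: an analytic query that aggregates one field over many records only needs that field's column in the second layout, which is roughly where the performance implication comes from.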
The basic idea there seems to be: how much availability risk are you willing to take on versus the overheads, both in terms of the system itself, like the computational overheads, and also the human overheads of actually designing and operating the system, and the cost overhead?
Yeah, exactly.
And so yes, you can have a system that is more able to tolerate various types of faults, but which is more expensive to design and operate, versus a simpler system that might go down a bit more often, but which is cheaper.
And there's no right and wrong with that.
Everyone needs to figure out for themselves where they sit in that trade-off space.
And I would say that multi-region is pushing in the direction of higher availability because it means you could tolerate the outage of an entire region.
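As a rough back-of-the-envelope illustration of that trade-off, the sketch below estimates the chance that at least one region is up. It rests on strong assumptions that are not from the conversation: regions fail independently, failover between them always works, and the 99% per-region figure is made up for the example. The function name `combined_availability` is hypothetical.

```python
def combined_availability(per_region: float, regions: int) -> float:
    """Probability that at least one region is up.

    Assumes regions fail independently, which real deployments may not
    satisfy (shared software bugs, bad config pushes, and so on).
    """
    return 1.0 - (1.0 - per_region) ** regions


# Illustrative per-region availability of 99%; not a measured number.
for n in (1, 2, 3):
    print(n, combined_availability(0.99, n))
```

Under these assumptions each extra region shrinks the expected downtime sharply, but it also adds the design, operational, and cost overheads discussed above, which is why the right number of regions depends on where a given team sits in the trade-off.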