Ryan Worrell
๐ค PersonAppearances Over Time
Podcast Appearances
The brokers own some set of partitions from a leadership perspective. And then there's also replicas of that that are just copying the data. And it's just other brokers that are the replicas for those partitions. So the broker will write that data that it receives from a producer client down to the local disk and replicate it out to the followers. And then
The brokers own some set of partitions from a leadership perspective. And then there's also replicas of that that are just copying the data. And it's just other brokers that are the replicas for those partitions. So the broker will write that data that it receives from a producer client down to the local disk and replicate it out to the followers. And then
a consumer can come along and read either from a replica or the leader the data that producer wrote. But they're all coordinating on essentially one of those brokers owns the partition specifically that I'm interested in and reading. So that's how it works in the open source product and in Warp stream, we've decoupled the idea of ownership of a partition from the broker itself.
a consumer can come along and read either from a replica or the leader the data that producer wrote. But they're all coordinating on essentially one of those brokers owns the partition specifically that I'm interested in and reading. So that's how it works in the open source product and in Warp stream, we've decoupled the idea of ownership of a partition from the broker itself.
We have a metadata store that runs inside our control plane that has a mapping of, here are all the files and object storage. And within those files, the data for this partition for this offset is here. It's in some section of a file in object storage.
We have a metadata store that runs inside our control plane that has a mapping of, here are all the files and object storage. And within those files, the data for this partition for this offset is here. It's in some section of a file in object storage.
So any of our agents, which are like the stateless broker that speaks the Kafka protocol to your clients, any one of those agents can consult the metadata store and ask, I want to read this topic partition at offset X. Where do I have to go in object storage and potentially multiple places in object storage? Where do I have to go in object storage to read that data?
So any of our agents, which are like the stateless broker that speaks the Kafka protocol to your clients, any one of those agents can consult the metadata store and ask, I want to read this topic partition at offset X. Where do I have to go in object storage and potentially multiple places in object storage? Where do I have to go in object storage to read that data?
But because the metadata store inside the control plane is handling the ordering aspect of it, essentially, you get the same guarantees as Kafka in terms of I have this message with this key that's routed to this topic partition, and I want them to stay in the same order because I'm writing them in a specific order. That ordering part is enforced by the metadata store inside the control plane.
But because the metadata store inside the control plane is handling the ordering aspect of it, essentially, you get the same guarantees as Kafka in terms of I have this message with this key that's routed to this topic partition, and I want them to stay in the same order because I'm writing them in a specific order. That ordering part is enforced by the metadata store inside the control plane.
But the data plane part of actually moving all of those messages around is only inside the agents and object storage. So it lets you do that thing that I was saying before, where if you want to scale up and down, it's very easy to do that because you don't have to rebalance those partitions, which take up space on the local disk amongst the brokers in order to facilitate that.
But the data plane part of actually moving all of those messages around is only inside the agents and object storage. So it lets you do that thing that I was saying before, where if you want to scale up and down, it's very easy to do that because you don't have to rebalance those partitions, which take up space on the local disk amongst the brokers in order to facilitate that.
In terms of being faster, it's faster at the fact that there is no rebalancing that happens. Because the data is always just in object storage somewhere. You don't have to do any rebalancing for it. That part of it is faster. There's obviously a trade-off when you do this in that the latency of writing to object storage is higher than writing to the local disk.
In terms of being faster, it's faster at the fact that there is no rebalancing that happens. Because the data is always just in object storage somewhere. You don't have to do any rebalancing for it. That part of it is faster. There's obviously a trade-off when you do this in that the latency of writing to object storage is higher than writing to the local disk.
So if you want your data to be durable, you have to wait for the data to be written to object storage first. So that's the primary trade-off somebody that's using Warpstream would be making is that they're comfortable with around 500 milliseconds at the P99 of latency to write data to the system.
So if you want your data to be durable, you have to wait for the data to be written to object storage first. So that's the primary trade-off somebody that's using Warpstream would be making is that they're comfortable with around 500 milliseconds at the P99 of latency to write data to the system.
And then the end-to-end latency of like a producer sends data and then it's consumed by a consumer is somewhere between one to one and a half seconds again at the P99.
And then the end-to-end latency of like a producer sends data and then it's consumed by a consumer is somewhere between one to one and a half seconds again at the P99.
So it's interesting that you use that word real-time because we've talked to a ton of different Kafka users. And when you ask them, what is your end-to-end latency of your system today? A lot of them don't know the answer. They think that they know the answer. Well, it's real-time. Yeah, they're either not measuring it, where they're measuring it in a weird and incorrect way.
So it's interesting that you use that word real-time because we've talked to a ton of different Kafka users. And when you ask them, what is your end-to-end latency of your system today? A lot of them don't know the answer. They think that they know the answer. Well, it's real-time. Yeah, they're either not measuring it, where they're measuring it in a weird and incorrect way.