Ryan Worrell
Apache Kafka?
Yeah. The project is managed by the Apache Software Foundation and has a variety of contributors across a ton of companies. And I would say it's a fairly healthy example of an open source project in terms of having a big community.
So there are a lot of practical challenges with improving a large open source project with a lot of users and a lot of dependent parties, I should say. Not even necessarily just users, but stakeholders of all kinds. Making large, sweeping changes is essentially impossible. The amount of code churn required to...
take open source Kafka and get it to something resembling the architecture of WarpStream is just not going to happen in any reasonable amount of time. That's the first part. If you just wanted to do this in the abstract, with no financial interests involved, how would you do it? It would be very hard, practically.
The second reason is that WarpStream makes a pretty different set of trade-offs than the open source project does in terms of the environment that we expect users to run in. Now, I think those trade-offs are correct for the world that exists today, but in the abstract, it is different than the open source project. So WarpStream stores data only in object storage. That's step one.
You need an environment that has object storage. And then step two is that we run a control plane for the cluster. In the open source project, the comparison would be kind of like if somebody was running ZooKeeper or KRaft, which is their replacement for ZooKeeper inside of the open source project.
It's kind of as if we're running that for you remotely, and then you're running the agents, as we call them, which are the replacement for the Kafka broker, inside your cloud account. So there's a very specific topology that we're prescribing to our customers as well. That's different.
That probably wouldn't fly in an open source environment, or at least would potentially make it even more challenging to run. I think those are probably the two biggest reasons why we couldn't just improve Kafka. It would be too hard, practically, to make improvements. And then also...
We're making trade-offs around how we see the world existing today and how we think it's going to continue to exist in the future, and a lot of the stakeholders in the open source project may not agree with our assessment there, basically.
Yeah, the way that I like to explain the networking cost side is that when you're renting space in a colo or you have your own data center, you're implicitly paying for what is kind of a fixed-capacity resource. It has a very high capacity, but that capacity stays fixed unless you do a bunch of capital improvements to your data center.