Ryan Worrell
So while we understand that we can't hit every possible application in the market with the shape that WarpStream is today, we're pretty happy with the set of use cases and workloads that we can target, because there are just so many of them out there and they happen to align with the budget-sensitive ones.
So the writes are around 500 milliseconds at the P99. That's tunable. By default, we have the agents buffer the records that your clients are sending in memory for 250 milliseconds before writing them to object storage, so that you just write fewer files to object storage, which is the primary determinant of the cost of the object storage component of the system if you're not retaining the data for very long.
But you can shrink that down all the way to 50 milliseconds, in which case the produce latency would probably be ballpark 300 milliseconds at the P99.
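To make the trade-off concrete, here is a minimal sketch in Go of the kind of time-based buffering being described: records accumulate in memory and get flushed as a single object once per configurable window (250 milliseconds by default, tunable down to 50). This is an illustration of the general technique, not WarpStream's actual internals; the Record type and flushToObjectStorage function are hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"time"
)

// Record is a hypothetical stand-in for a produced Kafka record.
type Record struct {
	Key, Value []byte
}

// Buffer accumulates records in memory and flushes them as one
// object per flush interval, so fewer (larger) files get written
// to object storage.
type Buffer struct {
	flushInterval time.Duration // e.g. 250ms by default, tunable down to 50ms
	incoming      chan Record
}

func NewBuffer(flushInterval time.Duration) *Buffer {
	b := &Buffer{flushInterval: flushInterval, incoming: make(chan Record, 1024)}
	go b.run()
	return b
}

func (b *Buffer) Add(r Record) { b.incoming <- r }

func (b *Buffer) run() {
	ticker := time.NewTicker(b.flushInterval)
	defer ticker.Stop()
	var batch []Record
	for {
		select {
		case r := <-b.incoming:
			batch = append(batch, r)
		case <-ticker.C:
			if len(batch) > 0 {
				flushToObjectStorage(batch) // one PUT per window, not per record
				batch = nil
			}
		}
	}
}

// flushToObjectStorage is a placeholder for the single PUT that
// writes the whole batch as one file.
func flushToObjectStorage(batch []Record) {
	fmt.Printf("flushed %d records as one object\n", len(batch))
}

func main() {
	b := NewBuffer(250 * time.Millisecond)
	for i := 0; i < 10; i++ {
		b.Add(Record{Value: []byte(fmt.Sprintf("msg-%d", i))})
	}
	time.Sleep(time.Second)
}
```

A shorter window means lower produce latency but more, smaller files per second, which is exactly why shrinking the buffer raises the object storage cost.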
I said end-to-end instead of read because that's typically what people talk about in Kafka terms: they want to know, when a producer sends a message, how long does it take until a consumer can consume that message successfully? So that's what I mean by end-to-end, and that is one to one and a half seconds at the P99 for most of our users.
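Since WarpStream speaks the Kafka protocol, you can measure this end-to-end number yourself with any Kafka client. Below is a sketch using the franz-go client (an arbitrary choice, not something the conversation prescribes); the broker address and topic name are placeholders, and it assumes the topic starts empty so the first record fetched is the one just produced.

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/twmb/franz-go/pkg/kgo"
)

func main() {
	ctx := context.Background()

	// One client that both produces and consumes. The broker
	// address and topic name are placeholders.
	client, err := kgo.NewClient(
		kgo.SeedBrokers("localhost:9092"),
		kgo.ConsumeTopics("latency-test"),
	)
	if err != nil {
		panic(err)
	}
	defer client.Close()

	// Stamp the produce time into the record value.
	sent := time.Now()
	record := &kgo.Record{
		Topic: "latency-test",
		Value: []byte(sent.Format(time.RFC3339Nano)),
	}
	if err := client.ProduceSync(ctx, record).FirstErr(); err != nil {
		panic(err)
	}

	// Poll until the record comes back, then report the delta.
	for {
		fetches := client.PollFetches(ctx)
		if errs := fetches.Errors(); len(errs) > 0 {
			panic(errs[0].Err)
		}
		done := false
		fetches.EachRecord(func(r *kgo.Record) {
			produced, _ := time.Parse(time.RFC3339Nano, string(r.Value))
			fmt.Printf("end-to-end latency: %v\n", time.Since(produced))
			done = true
		})
		if done {
			return
		}
	}
}
```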
So there really aren't that many downsides other than the latency, and the latency is basically what enables all of the benefits of WarpStream: the object storage is what enables a lot of the benefits, and the object storage is also where the latency comes from. We have a couple of interesting features that are based on the fact that all of the data is in object storage. One of them we call Agent Groups.
And Agent Groups let you take one logical cluster and split it up physically amongst a bunch of different domains. They could be different VPCs within the same cloud account, different cloud accounts, or the same cloud account but across regions, all by just sharing the IAM role for the object storage bucket between those different accounts.
The alternative to this with open source Kafka is setting up something crazy like VPC peering, which is extremely hard to do. And your security team will probably not be super happy if you ask them to peer a bunch of VPCs together, because it introduces more security risks.
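The conversation describes sharing an IAM role across accounts; a closely related way to express the same access grant is a bucket policy that lets a role in a second account read and write the shared bucket. Here is a sketch with the AWS SDK for Go v2, where the bucket name, account ID, and role name are all hypothetical placeholders.

```go
package main

import (
	"context"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// A bucket policy granting a role in a second account (e.g. the
// analytics account) access to the shared bucket. The account ID,
// role name, and bucket name are hypothetical.
const crossAccountPolicy = `{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::222222222222:role/warpstream-agent"},
    "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket", "s3:DeleteObject"],
    "Resource": [
      "arn:aws:s3:::warpstream-shared-bucket",
      "arn:aws:s3:::warpstream-shared-bucket/*"
    ]
  }]
}`

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		panic(err)
	}
	client := s3.NewFromConfig(cfg)

	// Attach the policy to the bucket owned by the first account.
	_, err = client.PutBucketPolicy(ctx, &s3.PutBucketPolicyInput{
		Bucket: aws.String("warpstream-shared-bucket"),
		Policy: aws.String(crossAccountPolicy),
	})
	if err != nil {
		panic(err)
	}
}
```

Either way, the point is the same: one object-storage permission grant replaces network-level plumbing like VPC peering.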
So we have customers in production using this feature today. The example that we usually give is a games company that splits their production games account, where all the game servers run, from the analytics account, where they run a bunch of Flink jobs to process the data generated from the production account.
And they run agents that just do produce, so just writes, in the production account. And they run agents that just do fetch inside their analytics account. So they've kind of flexed the cluster across those two different environments, and all they had to do to set that up was share the IAM role on the object storage bucket instead of peering the VPCs together.
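In client terms, that split looks roughly like the sketch below: one logical cluster, reached through different agent endpoints in each account. The hostnames and topic are hypothetical, and in practice the two halves would be separate services in separate accounts rather than one program.

```go
package main

import (
	"context"

	"github.com/twmb/franz-go/pkg/kgo"
)

func main() {
	ctx := context.Background()

	// In the production account: a client pointed at the local
	// produce-only agents. The endpoint is a placeholder.
	producer, err := kgo.NewClient(
		kgo.SeedBrokers("warpstream-agents.prod.internal:9092"),
	)
	if err != nil {
		panic(err)
	}
	defer producer.Close()
	producer.ProduceSync(ctx, &kgo.Record{
		Topic: "game-events",
		Value: []byte(`{"event":"match_start"}`),
	})

	// In the analytics account: a separate process pointed at the
	// local fetch-only agents, reading the same logical topic.
	consumer, err := kgo.NewClient(
		kgo.SeedBrokers("warpstream-agents.analytics.internal:9092"),
		kgo.ConsumeTopics("game-events"),
	)
	if err != nil {
		panic(err)
	}
	defer consumer.Close()
	fetches := consumer.PollFetches(ctx)
	fetches.EachRecord(func(r *kgo.Record) { /* hand off to Flink, etc. */ })
}
```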