Confluent Developer ft. Tim Berglund, Adi Polak & Viktor Gamov

Real-Time Stream Processing, Monitoring, and Analytics With Apache Kafka

15 Sep 2022

Description

Processing real-time event streams enables countless use cases, big and small. With a day job designing and building highly available distributed data systems, Simon Aubury (Principal Data Engineer, Thoughtworks) believes stream-processing thinking can be applied to any stream of events. In this episode, Simon shares his Confluent Hackathon ’22 winning project: a wildlife monitoring system that observes population trends over time using a Raspberry Pi, along with Apache Kafka®, Kafka Connect, ksqlDB, TensorFlow Lite, and Kibana. He used the system to count animals in his Australian backyard and perform trend analysis on the results. Simon also shares ideas on how you can use these same technologies to help with other real-world challenges.

Open-source object detection models for TensorFlow, which are appropriately collected into "model zoos," meant that Simon didn't have to build his own object identification as part of the project, which would have made it untenable. Instead, he was able to use the open-source models, which are essentially neural nets pretrained on relevant data sets (in his case, backyard animals).

Simon's system, which consists of around 200 lines of code, employs a Kafka producer running a while loop, which connects to a camera feed using a Python library. For each frame brought down, object masking is applied in order to crop and reduce pixel density, and then the frame is compared against the models mentioned above. A Python dictionary containing probable found objects is sent to a Kafka broker for processing; the images themselves aren't sent. (Note that Simon's system is also capable of alerting if a specific, rare animal is detected.) On the broker, Simon uses ksqlDB and windowing to smooth the data in case the frames were inconsistent for some reason (it may look back over thirty seconds, for example, and find the highest number of animals per type).
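
The producer-side dictionary and the windowed smoothing described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Simon's actual code: the function names, the `bird` and `possum` labels, the 0.5 score cut-off, and the choice of a tumbling window are all hypothetical, and the Kafka send and the ksqlDB query are replaced by in-memory stand-ins.

```python
import json
from collections import defaultdict

WINDOW_SECONDS = 30  # look-back size mentioned in the description
MIN_SCORE = 0.5      # hypothetical confidence cut-off, not from the episode


def frame_to_message(timestamp, detections, min_score=MIN_SCORE):
    """Turn one frame's (label, score) detections into a small JSON payload.

    Only per-label counts are kept; the image itself is never included.
    """
    counts = defaultdict(int)
    for label, score in detections:
        if score >= min_score:
            counts[label] += 1
    return json.dumps({"ts": timestamp, "counts": dict(counts)}, sort_keys=True)


def windowed_max(messages, window_seconds=WINDOW_SECONDS):
    """Emulate a tumbling-window MAX per animal type over JSON messages."""
    result = defaultdict(dict)
    for raw in messages:
        msg = json.loads(raw)
        # Align each message to the start of its window.
        window_start = int(msg["ts"] // window_seconds) * window_seconds
        for label, count in msg["counts"].items():
            previous = result[window_start].get(label, 0)
            result[window_start][label] = max(previous, count)
    return dict(result)


# Three hypothetical frames: (seconds since start, detections).
frames = [
    (0, [("bird", 0.9), ("bird", 0.8)]),
    (10, [("bird", 0.7), ("bird", 0.3)]),  # second bird falls below the cut-off
    (40, [("possum", 0.95)]),
]
messages = [frame_to_message(ts, dets) for ts, dets in frames]
smoothed = windowed_max(messages)
print(smoothed)  # {0: {'bird': 2}, 30: {'possum': 1}}
```

Here the second frame misses one bird, but the 30-second maximum still reports two, which is the kind of smoothing the windowing step is meant to provide.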
Finally, the data is sent to a Kibana dashboard for analysis, through a Kafka Connect sink connector. Simon’s system is an extremely low-cost system that can simulate the behaviors of more expensive, proprietary systems. And the concepts can easily be applied to many other use cases. For example, you could use it to estimate traffic at a shopping mall to gauge optimal opening hours, or you could use it to monitor the queue at a coffee shop, counting both queued patrons and impatient patrons who decide to leave because the queue is too long.

EPISODE LINKS
Real-Time Wildlife Monitoring with Apache Kafka
Wildlife Monitoring GitHub
ksqlDB Fundamentals: How Apache Kafka, SQL, and ksqlDB Work Together
Event-Driven Architecture - Common Mistakes and Valuable Lessons

SEASON 2
Hosted by Tim Berglund, Adi Polak and Viktor Gamov
Produced and Edited by Noelle Gallagher, Peter Furia and Nurie Mohamed
Music by Coastal Kites
Artwork by Phil Vo

🎧 Subscribe to Confluent Developer wherever you listen to podcasts.
▶️ Subscribe on YouTube, and hit the 🔔 to catch new episodes.
👍 If you enjoyed this, please leave us a rating.
🎧 Confluent also has a podcast for tech leaders: "Life Is But A Stream" hosted by our friend, Joseph Morais.
