Menu
Sign In Search Podcasts Charts People & Topics Add Podcast API Pricing
Podcast Image

Confluent Developer ft. Tim Berglund, Adi Polak & Viktor Gamov

Reddit Sentiment Analysis with Apache Kafka-Based Microservices

08 Sep 2022

Description

How do you analyze Reddit sentiment with Apache Kafka® and microservices? Bringing the fresh perspective of someone who is both new to Kafka and the industry, Shufan Liu, nascent Developer Advocate at Confluent, discusses projects he has worked on during his summer internship—a Cluster Linking extension to a conceptual data pipeline project, and a microservice-based Reddit sentiment-analysis project. Shufan demonstrates that it’s possible to quickly get up to speed with the tools in the Kafka ecosystem and to start building something productive early on in your journey.Shufan's Cluster Linking project extends a demo by Danica Fine (Senior Developer Advocate, Confluent) that uses a Kafka-based data pipeline to address the challenge of automatic houseplant watering. He discusses his contribution to the project and shares details in his blog—Data Enrichment in Existing Data Pipelines Using Confluent Cloud.The second project Shufan presents is a sentiment analysis system that gathers data from a given subreddit, then assigns the data a sentiment score. He points out that its results would be hard to duplicate manually by simply reading through a subreddit—you really need the assistance of AI. The project consists of four microservices:A user input service that collects requests in a Kafka topic, which consist of the desired subreddit, along with the dates between which data should be collectedAn API polling service that fetches the requests from the user input service, collects the relevant data from the Reddit API, then appends it to a new topicA sentiment analysis service that analyzes the appended topic from the API polling service using the Python library NLTK; it calculates averages with ksqlDBA results-displaying service that consumes from a topic with the calculationsInteresting subreddits that Shufan has analyzed for sentiment include gaming forums before and after key releases; crypto and stock trading forums at various meaningful points in time; and sports-related forums both before the season and several games into it. EPISODE LINKSData Enrichment in Existing Data Pipelines Using Confluent CloudWatch the video version of this podcastKris Jenkins’ TwitterStreaming Audio Playlist Join the Confluent CommunityLearn more with Kafka tutorials, resources, and guides at Confluent DeveloperLive demo: Intro to Event-Driven Microservices with ConfluentSEASON 2 Hosted by Tim Berglund, Adi Polak and Viktor Gamov Produced and Edited by Noelle Gallagher, Peter Furia and Nurie Mohamed Music by Coastal Kites Artwork by Phil Vo 🎧 Subscribe to Confluent Developer wherever you listen to podcasts. ▶️ Subscribe on YouTube, and hit the 🔔 to catch new episodes. 👍 If you enjoyed this, please leave us a rating. 🎧 Confluent also has a podcast for tech leaders: "Life Is But A Stream" hosted by our friend, Joseph Morais.

Audio
Featured in this Episode

No persons identified in this episode.

Transcription

This episode hasn't been transcribed yet

Help us prioritize this episode for transcription by upvoting it.

0 upvotes
🗳️ Sign in to Upvote

Popular episodes get transcribed faster

Comments

There are no comments yet.

Please log in to write the first comment.