Plumbers of Data Science
Episodes
#126 The Cloud, Java, and the Future of Serverless – with Vadym Kazulkin
06 Aug 2025
Contributed by Lukas
In this podcast episode, I’m talking with Vadym Kazulkin, AWS Serverless Hero and Principal Cloud Architect. He’s part of the AWS community for y...
#125 AI Hype vs. Reality – with Christina Stathopoulos
22 Jul 2025
Contributed by Lukas
In this episode, I’m joined by Christina Stathopoulos, a former Googler who now works independently as a data & AI evangelist, trainer, and advi...
#124 These Skills Get You a Data Consulting Job – with Tom Schamberger
09 Jul 2025
Contributed by Lukas
In this episode, I’m talking with Tom Schamberger from the German consultancy msg. He leads their cloud data platform team and has a super interesti...
#123 Building Fast and Fun Data Projects - with Mehdi Ouazza
19 Jun 2025
Contributed by Lukas
In this episode, I sit down with Mehdi Ouazza - data tinkerer, indie hacker, and content creator - who's always up to something interesting in the...
#122 Why Writing Is Thinking, and What Data Engineers Can Learn from It - with Simon Späti
11 Jun 2025
Contributed by Lukas
In this podcast episode, I’m joined by Simon Späti, long-time BI and data engineering expert turned full-time technical writer and author of the l...
#121 From Application Dev to AWS Hero: A Journey in Tech & Impact - with Johannes Koch
19 May 2025
Contributed by Lukas
In this episode, I’m joined by Johannes Koch, Principal Engineer and AWS DevTools Hero, to talk about the real DevOps mindset, the evolution of de...
#120 Teaching Data Engineering Like It’s Done on the Job - with Deepak Goyal
08 May 2025
Contributed by Lukas
In this episode, I sit down with Deepak Goyal, the founder of AzureLib, to talk all things data engineering, cloud platforms, and how to teach the nex...
#119 Recruiting is harder than I thought
04 Nov 2024
Contributed by Lukas
In this episode of the Plumbers of Data Science podcast, I dive into the challenges of recruiting today, from overwhelming job application volumes to ...
#118 Freelancing as a Data Engineer - Hero Talk with the "Seattle Data Guy" Ben Rogojan
25 Oct 2024
Contributed by Lukas
In this Hero Talk episode, I had the pleasure of chatting with Ben Rogojan, better known as the "Seattle Data Guy." Ben is a data engineer, YouTuber, ...
#117 We Are Starting a Recruiting Service!
18 Oct 2024
Contributed by Lukas
In this episode of the Plumbers of Data Science podcast, I'm sharing some exciting updates about the future of Learn Data Engineering and a big ne...
#116 Data Modeling is F***ing Easy!
23 Sep 2024
Contributed by Lukas
In this episode of the Plumbers of Data Science podcast, I’m sharing my thoughts on why data modeling isn’t as complicated as people make it out t...
#115 His Career Started With a Bootcamp & Now He Helps Others Succeed - Hero Talk w/ Mezue Obi-Eyis
20 Sep 2024
Contributed by Lukas
In this Hero Talk episode, I talk with Mezue, a seasoned Data Engineer with expertise in Azure Databricks Data Engineering. We cover his journey from ...
#114 Dirty Data & Data Cleaning - Hero Talk with "The Classification Guru" Susan Walsh
16 Sep 2024
Contributed by Lukas
In this Hero Talk episode, I chat with Susan Walsh, the “Classification Guru,” known for her expertise in cleaning and classifying messy data. We d...
#113 A Deep Dive Into APIs, IoT, and Data Storage - Hero Talk with Paolo Lulli
09 Sep 2024
Contributed by Lukas
In this Hero Talk episode, I sit down with Paolo Lulli, an experienced Data Engineer, to explore some of the core challenges and decisions in API deve...
#112 Why testing data pipelines can be so challenging - and how to tackle it
06 Sep 2024
Contributed by Lukas
In this episode of the Plumbers of Data Science podcast, I’m diving into why testing can be so challenging for data engineers. The inspiration for t...
#111 Is This the Synthetic Data Revolution?! Hero Talk with Mario Scriminaci from Mostly AI
02 Sep 2024
Contributed by Lukas
In this Hero Talk episode, we dive deep into the fascinating world of synthetic data, a critical tool for development, testing, and training Machine L...
#110 Bootcamps vs Coaching
30 Aug 2024
Contributed by Lukas
In this episode of the Plumbers of Data Science podcast, I’m diving into the debate between bootcamps and coaching programs, especially for those lo...
#109 Why your data and goals matter more than tools!
23 Aug 2024
Contributed by Lukas
In this episode of the Plumbers of Data Science podcast, I’m diving into what truly matters when building data platforms and pipelines. As engineers,...
#108 Why Apache Spark Is Such An Essential Skill - Hero Talk with Philipp Brunenberg
19 Aug 2024
Contributed by Lukas
In this episode, we explore the essentials of learning and mastering Apache Spark. Joining me is Philipp, an experienced Spark developer and educator, ...
#107 The Future of Data Observability - Hero Talk with Ryan Yackel
12 Aug 2024
Contributed by Lukas
In this Hero Talk episode, we explore the crucial topic of data observability, a field that has become essential for Data Engineers dealing with compl...
#106 Should You Move to Germany for a Data Engineering Career?
09 Aug 2024
Contributed by Lukas
In this episode of the Plumbers of Data Science podcast, I’m breaking down the real deal of working as a data engineer in Germany. Does it live up t...
#105 Personal Branding in Data - Hero Talk with Kate Strachnyi
05 Aug 2024
Contributed by Lukas
In this Hero Talk episode, we delve into the fascinating world of personal branding in data with our special guest, Kate Strachnyi, founder of DataCat...
#104 The Secret Why Time Series Databases Are Awesome - Hero Talk with Jeff Tao
02 Aug 2024
Contributed by Lukas
In this Hero Talk episode, we explore the dynamic world of time series data and time series databases with a special guest, Jeff Tao, founder and CEO ...
#103 From India to the U.S.: Becoming a Data Engineer at Toyota - Hero Talk with Ayan Tiwari
29 Jul 2024
Contributed by Lukas
In this Hero Talk episode, we dive into the inspiring journey of Ayan Tiwari, a Data Engineer at Toyota North America. Join us as Ayan shares his remar...
#102 Data Tools & Platforms: Why you should always be skeptical
26 Jul 2024
Contributed by Lukas
In this episode of the Plumbers of Data Science podcast, we explore why you should be skeptical of data platforms and tools. Using a LEGO Grogu set fr...
#101 GenAI from a Data Engineer's perspective - Hero Talk with Vinoth Nageshwaran
22 Jul 2024
Contributed by Lukas
The ultimate place to learn Data Engineering: learndataengineering.com
#100 Why Excel should be a go-to tool for data professionals
19 Jul 2024
Contributed by Lukas
The ultimate place to learn Data Engineering: learndataengineering.com
#99 Real Talk on GenAI & Large Language Models - Hero Talk with Harpreet Sahota
15 Jul 2024
Contributed by Lukas
In this Hero Talk episode we dive into the exciting and evolving world of Generative AI and Large Language Models (LLMs) with a special guest, Harpree...
#98 Are Job Guarantees a Scam?
12 Jul 2024
Contributed by Lukas
The ultimate place to learn Data Engineering: learndataengineering.com
#97 Data Science Career AMA! - Hero Talk with Andrew Jones
05 Jul 2024
Contributed by Lukas
In this Hero Talk episode we dive into the world of Data Science careers with a special Ask Me Anything (AMA) session. Join me as I welcome Andrew Jone...
#96 Can GenAI be trusted?
28 Jun 2024
Contributed by Lukas
The ultimate place to learn Data Engineering: learndataengineering.com
#95 The Perfect CV for Switching Careers
21 Jun 2024
Contributed by Lukas
The ultimate place to learn Data Engineering: learndataengineering.com
#94 - Making less money to set yourself up for success?!
14 Jun 2024
Contributed by Lukas
The ultimate place to learn Data Engineering: learndataengineering.com
#93 Is the highest paying job the best?
07 Jun 2024
Contributed by Lukas
The ultimate place to learn Data Engineering: learndataengineering.com
#92 Is it impossible to get a Data Engineering job as a fresher?
31 May 2024
Contributed by Lukas
The ultimate place to learn Data Engineering: learndataengineering.com
#91 A New Beginning & All Successful Students Have This:
31 May 2024
Contributed by Lukas
Starting up the podcast for another session :)
#90 Taylor McGrath - The Future of the Modern Data Stack
25 Jan 2023
Contributed by Lukas
Super happy to have Taylor with me on this stream. She is the VP of Data Labs at Rivery and therefore has a lot of experience with data platforms. We'...
#89 Piyush Sachdeva - Getting Into Google After Eight Rejections from Amazon!
16 Jan 2023
Contributed by Lukas
In this video I talk to Piyush who's an engineer at Google and has his own YouTube channel: "Tech Tutorials with Piyush". He's a really good guy and I...
#88 - Wouter Trappers - How to Realize a Data Strategy Like a Pro!
12 Apr 2022
Contributed by Lukas
I have seen people doing that wrong a few times. Luckily, Wouter Trappers, who helps companies as a professional, can help. We talked about The step...
#87 - Dhruba Borthakur - From Hadoop to real time analytics
12 Apr 2022
Contributed by Lukas
Dhruba Borthakur is CTO at Rockset and a passionate Data Engineer. Before co-founding Rockset he played a big role in the development of Hadoop HDFS at Ya...
#86 The Ultimate Data Engineering Introduction
14 Jan 2021
Contributed by Lukas
The Podcast is back!!!! I promise I am going to keep it up to date this time ;) In this episode I talk about my newest Data Engineering course. I thin...
#085 Big Data and Data Science Landscape plus trying to read Tweets with Nifi
28 May 2019
Contributed by Lukas
We are looking into the network communication protocol map. I first saw this like 10 years ago and it's awesome. Then we check out the Big Data a...
#084 Behind the scenes: Audio podcast, free transcriptions and GitHub
27 May 2019
Contributed by Lukas
Today's podcast is a bit of a behind the scenes. What it takes to do an audio podcast. How you can get audio to text transcriptions for free...
#083 Data Engineering at OLX Case Study
27 May 2019
Contributed by Lukas
Today, a case study about OLX with a guest. It was super fun! Here are the slides Alexey and I talked about: https://www.slideshare.net/mobile/AlexeyGrig...
#082 Reading Tweets With Apache Nifi & IaaS vs PaaS vs SaaS
27 May 2019
Contributed by Lukas
In this episode we install the Nifi docker container and look into how we can extract the Twitter data. We are also talking about the differences betw...
#081 How to get tweets from the Twitter API
27 May 2019
Contributed by Lukas
In this episode we look into the Twitter API documentation, which I love by the way. How can we get old tweets for certain hashtags and how to get c...
#080 How To Find A Job In Germany & Answering Mails
27 May 2019
Contributed by Lukas
Tips on how you find a job in Germany and two super interesting mails.
#079 Trying to stay true to myself and making the cookbook public on GitHub
27 May 2019
Contributed by Lukas
The cookbook, my YouTube, it will all be free, forever! Check out the data engineering cookbook on GitHub: https://github.com/andkret/Cookbook
#078 Cookbook collaboration and updates
27 May 2019
Contributed by Lukas
Updates of the cookbook and how to collaborate on it
#077 Lambda and Kappa Architecture
27 May 2019
Contributed by Lukas
In this episode we talk about the lambda architecture with stream and batch processing as well as an alternative, the Kappa Architecture, that consists o...
#076 Cloud vs On Premise How To Decide
27 May 2019
Contributed by Lukas
How do you choose between cloud and on-premise? Pros and cons, and what you have to think about, because there are good reasons not to go cloud. Also th...
#075 Creating the Course Structure For My Data Engineering Course
27 May 2019
Contributed by Lukas
In this episode we go over the ideas I have for the data engineering course structure. It was your chance to influence what we put in there.
#074 Starting My Data Engineering Online Course
27 May 2019
Contributed by Lukas
In this video we go over some of the 100+ comments I received on LinkedIn about a data engineering training.
#073 Data Engineering At LinkedIn Case Study
27 May 2019
Contributed by Lukas
Let's check out how LinkedIn is processing data
#072 Data Engineering At Twitter Case Study
27 May 2019
Contributed by Lukas
How is Twitter doing Data Engineering? Oh man, they have a lot of cool things to share these tweets.
#071 Data Engineering At Spotify Case Study
27 May 2019
Contributed by Lukas
In this episode we are looking at the data engineering at Spotify, my favorite music streaming service. How do they process all that data?
#070 The Engineering Culture At Spotify
27 May 2019
Contributed by Lukas
In this podcast we look at the engineering culture at Spotify, my favorite music streaming service. The process behind the development of Spotif...
#069 Data Engineering At Pinterest Case Study
27 May 2019
Contributed by Lukas
A look into how Pinterest is doing data engineering.
#068 A Budget Data Science PC Build
27 May 2019
Contributed by Lukas
Configuring a sub-1000-dollar PC for data engineering and machine learning. Links to the builds: $900 build: https://pcpartpicker.com/list/22ThcY 1...
#067 Data Engineering At NASA Case Study
27 May 2019
Contributed by Lukas
A look into how NASA is doing data engineering.
#066 How To Do Data Science From A Data Engineer's Perspective
27 May 2019
Contributed by Lukas
A simple introduction to doing data science in the context of the Internet of Things.
#065 Data Engineering At CERN Case Study
27 May 2019
Contributed by Lukas
A look into how CERN is doing Data Engineering. They get huge amounts of data from the Large Hadron Collider. Let's check it out.
#064 Data Engineering At Booking.com Case Study
27 May 2019
Contributed by Lukas
A look into how Booking.com is doing data engineering.
#063 Data Engineering At Airbnb Case Study
27 May 2019
Contributed by Lukas
A look into how Airbnb is doing Data Engineering.
#062 Data Engineering At Netflix Case Study
27 May 2019
Contributed by Lukas
How Netflix is doing Data Engineering using their Keystone platform
#061 Reworking My Cookbook For Data Engineering
27 May 2019
Contributed by Lukas
I decided to rework the cookbook focusing more on case studies and less on explaining tools. People keep asking me for a path to become a data enginee...
#060 What Is Hadoop And Is Hadoop Still Relevant In 2019?
27 May 2019
Contributed by Lukas
An introduction to Hadoop HDFS, YARN and MapReduce. Yes, Hadoop is still relevant in 2019 even if you look into serverless tools.
#059 A Look Into The Siemens Mindsphere IoT Platform?
27 May 2019
Contributed by Lukas
The Internet of Things is a huge deal. There are many platforms available. But which one is actually good? Join me on a 50-minute dive into the Sieme...
#058 Guitars And Data Live Stream
27 May 2019
Contributed by Lukas
A stream full of mediocre guitar playing and great Q&A about Hadoop.
#057 Introducing The Plumbers Medium Publication
27 May 2019
Contributed by Lukas
I have created a Medium Publication especially for us Plumbers of Data Science who work in Data Engineering and Big Data. It's called, you guessed it,...
#056 NoSQL Key Value Stores Explained With HBase
27 May 2019
Contributed by Lukas
What is the difference between SQL and NoSQL? In this episode I use HBase as an example to show you how a key/value store works.
#055 Data Warehouse vs Data Lake
27 May 2019
Contributed by Lukas
On this podcast I talk about data warehouses and data lakes. When do people use which? What are the pros and cons of both? Architecture examples for b...
#054 How to Market Yourself in 2019 Student or Professional
27 May 2019
Contributed by Lukas
In this episode I talk about how you can gain a competitive edge on the job market. It's super simple, you can and should start with it TODAY by putti...
#053 The Data Science Depression Is Coming? What You Can Do
27 May 2019
Contributed by Lukas
The Data Science Hype is still strong. Where's the industry going, towards a cliff? Here's what you can do.
#052 Data Engineering Cookbook Live Stream
27 May 2019
Contributed by Lukas
In this episode I show you the first version of my data engineering cookbook.
#051 Five Books To Buy As A Data Engineer & My Book Buying Strategy
27 May 2019
Contributed by Lukas
Getting a book and reading it cover to cover is useless. In this episode I show you my strategy of buying books complementary to your work. And 5 grea...
#050 Data Engineer Scientist or Analyst Which One Is For You?
27 May 2019
Contributed by Lukas
In this podcast we talk about the differences between data scientists, analysts and engineers. These are the three main data science jobs. All three s...
#049 I Found A REAL Use For Blockchain, At Least I thought So
27 May 2019
Contributed by Lukas
After all the BS solutions using Blockchain I thought I finally found one that makes sense. Of all the possibilities it's the EU data protection law G...
#048 From Wannabe Data Scientist To Engineer My Journey
27 May 2019
Contributed by Lukas
In this episode Kate Strachnyi interviews me for her humans of data science podcast. We talk about how I found out that I am more into the engineering...
#047 The Truth About Data Science Salary For Graduates
27 May 2019
Contributed by Lukas
In this episode I show you how much data science graduates are actually paid in Germany. All over the internet you can find that Data Science salary ...
#046 How To Use GitHub for LaTeX Version Control
27 May 2019
Contributed by Lukas
In this podcast I am showing you how I use GitHub to write my Data Engineering Cookbook with LaTeX.
#045 Why I Use LaTeX to Write Professionally And You Should Too
07 Dec 2018
Contributed by Lukas
What is the best editing tool to write a thesis, a dissertation or a paper? NOT Word or Pages! It's LaTeX. In today's video I show you why I decided t...
#044 How to Increase Your Chances for Internships or a Full-time Job
27 Nov 2018
Contributed by Lukas
You have certifications or a university degree, but can't find a job? Sharing your ideas and knowledge will increase your chances! Here's how you can...
#041 Agile Development Is Important But Please Don't Do Scrum
18 Oct 2018
Contributed by Lukas
I love agile development. People keep telling you to do Scrum, like it's the only and best choice to be agile. It's not. Here's my take on scrum and m...
#040 Huge Big Data News! Cloudera and Hortonworks Merge
09 Oct 2018
Contributed by Lukas
So, Cloudera and Hortonworks merge... In today's Plumbers of Data Science Podcast I talk about what these big data vendors do. How they enable compan...
#039 Is ETL Dead For Data Science and Big Data?
03 Oct 2018
Contributed by Lukas
Is ETL dead in Data Science and Big Data? In today's podcast I share with you my views on your questions regarding ETL (extract, transform, load). Da...
#38 Morning advice to beginner Data Scientists and Data Engineers
27 Sep 2018
Contributed by Lukas
What's the difference between Data Scientists & Data Analysts? What to do to find internships or a full time job? Data Scientist and Engineer in large...
#037 How To Boost Teamwork With Version Control
12 Sep 2018
Contributed by Lukas
Without the proper tools and techniques of version control the team's efficiency goes down the drain. In this episode I talk about how tools like Jira...
#036 Why Distributed Processing Is Super Important
10 Sep 2018
Contributed by Lukas
You need to become comfortable with distributed processing. Data Science or the Internet of Things, the amount of data that is getting produced and pr...
#035 Learning By Doing Is The Best Thing Ever!
06 Sep 2018
Contributed by Lukas
For me, school and university were hard. The lectures, sitting down and getting told how things work. Reading books and learning dry stuff was a drag....
#034 Talent Stacks For Data Engineers
04 Sep 2018
Contributed by Lukas
Becoming an expert in a single skill is not the way to go for a data engineer. In this episode I talk about which talents go well together in terms of t...
#033 How APIs Rule The World
03 Sep 2018
Contributed by Lukas
Strong APIs make a good platform. In this episode I talk about why you need APIs and why Twitter is a great example. Especially JSON APIs are my perso...
#032 How to Design Security Zones and Lambda Architecture
30 Aug 2018
Contributed by Lukas
Security is everything! That's why today, I took some time to give you some tips about how to make a good design. The Lambda Architecture with stream ...
#031 IT Networking Infrastructure and Linux
29 Aug 2018
Contributed by Lukas
The understanding of how information is transported over the network is super important. OS wise you will mostly encounter Linux so here are some impo...
#030 Why the hardware and the GPU is super important
28 Aug 2018
Contributed by Lukas
Knowing the hardware is super important for a data engineer. Even if you are using cloud servers. CPU, RAM, GPU, HDD, SSD... Especially the GPU is a ...
#029 A New Mission
27 Aug 2018
Contributed by Lukas
I am bringing the Podcast back! Let's call it season 2. New name, new mission: Helping you become a data engineer. Daily podcast, recorded in my car or...
4 Vs Of Big Data Are Enough!
23 May 2018
Contributed by Lukas
8 V's, 10 V's, 12 V's. The best way to explain Big Data is to use the four V's: Volume, Velocity, Variety and Veracity. In this podcast episode I ...
Why Companies Badly Need Data Scientists And Engineers
18 May 2018
Contributed by Lukas
In this episode I give you my take on why companies badly need data scientists and engineers. Because in this data driven world, you can accomplish a ...
What You Need To Know About Data Engineering
16 May 2018
Contributed by Lukas
This podcast is all about what you as a data engineer really do. From building platforms to collaboration with data scientists and customers. Everyth...
I'm a Big Data Engineer and it's Super Awesome!
15 May 2018
Contributed by Lukas
There is this other data science job called data engineer and it's super important. Because data science does not equal data scientist. In today's pod...