Data Engineering Podcast

Technology Education

Episodes

Showing 1-100 of 494

From Context to Semantics: How Metadata Powers Agentic AI

21 Dec 2025

Contributed by Lukas

Summary In this episode Suresh Srinivas and Sriharsha Chintalapani explore how metadata platforms are evolving from human-centric catalogs into t...

From Data Engineering to AI Engineering: Where the Lines Blur

14 Dec 2025

Contributed by Lukas

Summary In this solo episode of the Data Engineering Podcast, host Tobias Macey reflects on how AI has transformed the practice and pace of data ...

Malloy: Hierarchical Data, Semantic Models, and the Future of Analytics

08 Dec 2025

Contributed by Lukas

Summary In this episode Michael Toy, co-creator of Malloy, talks about rethinking how we work with data beyond SQL. Michael shares the origins of...

Blurring Lines: Data, AI, and the New Playbook for Team Velocity

24 Nov 2025

Contributed by Lukas

Summary In this crossover episode, Max Beauchemin explores how multiplayer, multi‑agent engineering is transforming the way individuals and teams bui...

State, Scale, and Signals: Rethinking Orchestration with Durable Execution

16 Nov 2025

Contributed by Lukas

Summary In this episode Preeti Somal, EVP of Engineering at Temporal, talks about the durable execution model and how it reshapes the way teams b...

The AI Data Paradox: High Trust in Models, Low Trust in Data

09 Nov 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Ariel Pohoryles, head of product marketing for Boomi's data management offerings, talks about a...

Bridging the AI–Data Gap: Collect, Curate, Serve

02 Nov 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Omri Lifshitz (CTO) and Ido Bronstein (CEO) of Upriver talk about the growing gap between AI's ...

Beyond the Perimeter: Practical Patterns for Fine‑Grained Data Access

27 Oct 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Matt Topper, president of UberEther, talks about the complex challenge of identity, credentials...

The True Costs of Legacy Systems: Technical Debt, Risk, and Exit Strategies

18 Oct 2025

Contributed by Lukas

Summary In this episode Kate Shaw, Senior Product Manager for Data and SLIM at SnapLogic, talks about the hidden and compounding costs of maintaining l...

Context Engineering as a Discipline: Building Governed AI Analytics

11 Oct 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Nick Schrock, CTO and founder of Dagster Labs, to discuss Comp...

The Data Model That Captures Your Business: Metric Trees Explained

05 Oct 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Vijay Subramanian, founder and CEO of Trace, talks about metric trees - a new approach to data ...

From GPUs-as-a-Service to Workloads-as-a-Service: Flex AI’s Path to High-Utilization AI Infra

28 Sep 2025

Contributed by Lukas

Summary In this crossover episode of the AI Engineering Podcast, host Tobias Macey interviews Brijesh Tripathi, CEO of Flex AI, about revolutionizing A...

From RAG to Relational: How Agentic Patterns Are Reshaping Data Architecture

18 Sep 2025

Contributed by Lukas

Summary In this episode of the AI Engineering Podcast Mark Brooker, VP and Distinguished Engineer at AWS, talks about how agentic workflows are transfo...

Duck Lake: Simplifying the Lakehouse Ecosystem

10 Sep 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Hannes Mühleisen and Mark Raasveldt, the creators of DuckDB, share their work on Duck Lake, a ...

Aligning Business and Data: The Essential Role of Data Modeling

01 Sep 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Serge Gershkovich, head of product at SQL DBM, talks about the socio-technical aspects of data ...

From Academia to Industry: Bridging Data Engineering Challenges

26 Aug 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Professor Paul Groth, from the University of Amsterdam, talks about his research on knowledge g...

High Performance And Low Overhead Graphs With KuzuDB

18 Aug 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Prashanth Rao, an AI engineer at KuzuDB, talks about their embeddable graph database. Prashanth...

Bridging Data and Decision-Making: AI's Role in Modern Analytics

12 Aug 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Lucas Thelosen and Drew Gilson from Gravity talk about their development of Orion, an autonomou...

From Bits to Tables: The Evolution of S3 Storage

05 Aug 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Andy Warfield talks about the innovative functionalities of S3 Tables and Vectors and their int...

Revolutionizing Python Notebooks with Marimo

28 Jul 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Akshay Agrawal from Marimo discusses the innovative new Python notebook environment, which offe...

Warehouse Native Incremental Data Processing With Dynamic Tables And Delayed View Semantics

21 Jul 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Dan Sotolongo from Snowflake talks about the complexities of incremental data processing in war...

Streamlining Data Pipelines with MCP Servers and Vector Engines

15 Jul 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Kacper Łukawski from Qdrant talks about integrating MCP servers with vector databases to process uns...

Foundational Data Engineering At Two Sigma

06 Jul 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Effie Baram, a leader in foundational data engineering at Two Sigma, talks about the complexiti...

Enabling Agents In The Enterprise With A Platform Approach

29 Jun 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Arun Joseph talks about developing and implementing agent platforms to empower businesses with ...

Dagster's New Era: Modularizing Data Transformation in the Age of AI

18 Jun 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast we welcome back Nick Schrock, CTO and founder of Dagster Labs, to discuss the evolving landscap...

AI and the Lakehouse: How Starburst is Pioneering New Workflows

11 Jun 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Alex Albu, tech lead for AI initiatives at Starburst, talks about integrating AI workloads with...

Amazon S3: The Backbone of Modern Data Systems

03 Jun 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Mai-Lan Tomsen Bukovec, Vice President of Technology at AWS, talks about the evolution of Amazo...

Scaling Data Operations With Platform Engineering

29 May 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Chakravarthy Kotaru talks about scaling data operations through standardized platform offerings...

From Data Discovery to AI: The Evolution of Semantic Layers

21 May 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Shinji Kim to discuss the evolving role of semantic layers in t...

Balancing Off-the-Shelf and Custom Solutions in Data Engineering

13 May 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Tulika Bhatt, a senior software engineer at Netflix, talks about her experiences with large-sca...

StarRocks: Bridging Lakehouse and OLAP for High-Performance Analytics

05 May 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Sida Shen, product manager at CelerData, talks about StarRocks, a high-performance analytical d...

Exploring NATS: A Multi-Paradigm Connectivity Layer for Distributed Applications

28 Apr 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Derek Collison, creator of NATS and CEO of Synadia, talks about the evolution and capabilities ...

Advanced Lakehouse Management With The LakeKeeper Iceberg REST Catalog

21 Apr 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Viktor Kessler, co-founder of Vakmo, talks about the architectural patterns in the lake house e...

Simplifying Data Pipelines with Durable Execution

12 Apr 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Jeremy Edberg, CEO of DBOS, talks about durable execution and its impact on designing and implementin...

Overcoming Redis Limitations: The Dragonfly DB Approach

30 Mar 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Roman Gershman, CTO and founder of Dragonfly DB, explores the development and impact of high-sp...

Bringing AI Into The Inner Loop of Data Engineering With Ascend

24 Mar 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Sean Knapp, CEO of Ascend.io, explores the intersection of AI and data engineering. He discusse...

Astronomer's Role in the Airflow Ecosystem: A Deep Dive with Pete DeJoy

16 Mar 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Pete DeJoy, co-founder and product lead at Astronomer, talks about building and managing Airflo...

Accelerated Computing in Modern Data Centers With Datapelago

08 Mar 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Rajan Goyal, CEO and co-founder of Datapelago, talks about improving efficiencies in data proce...

The Future of Data Engineering: AI, LLMs, and Automation

26 Feb 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Gleb Mezhanskiy, CEO and co-founder of DataFold, talks about the intersection of AI and data en...

Evolving Responsibilities in AI Data Management

16 Feb 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Bartosz Mikulski talks about preparing data for AI applications. Bartosz shares his journey fro...

CSVs Will Never Die And OneSchema Is Counting On It

13 Jan 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Andrew Luo, CEO of OneSchema, talks about handling CSV data in business operations. Andrew shar...

Breaking Down Data Silos: AI and ML in Master Data Management

03 Jan 2025

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Dan Bruckner, co-founder and CTO of Tamr, talks about the application of machine learning (ML) ...

Building a Data Vision Board: A Guide to Strategic Planning

23 Dec 2024

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Lior Barak shares his insights on developing a three-year strategic vision for data management....

How Orchestration Impacts Data Platform Architecture

16 Dec 2024

Contributed by Lukas

Summary The core task of data engineering is managing the flows of data through an organization. In order to ensure those flows are executing on schedu...

An Exploration Of The Impediments To Reusable Data Pipelines

08 Dec 2024

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast the inimitable Max Beauchemin talks about reusability in data pipelines. The conversation explo...

The Art of Database Selection and Evolution

01 Dec 2024

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Sam Kleinman talks about the pivotal role of databases in software engineering. Sam shares his ...

Bridging Code and UI in Data Orchestration with Kestra

26 Nov 2024

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast, Anna Geller talks about the integration of code and UI-driven interfaces for data orchestratio...

Streaming Data Into The Lakehouse With Iceberg And Trino At Going

18 Nov 2024

Contributed by Lukas

In this episode, I had the pleasure of speaking with Ken Pickering, VP of Engineering at Going, about the intricacies of streaming data into a Trino a...

An Opinionated Look At End-to-end Code Only Analytical Workflows With Bruin

11 Nov 2024

Contributed by Lukas

Summary The challenges of integrating all of the tools in the modern data stack have led to a new generation of tools that focus on a fully integrated w...

Feldera: Bridging Batch and Streaming with Incremental Computation

04 Nov 2024

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast, the creators of Feldera talk about their incremental compute engine designed for continuous co...

Accelerate Migration Of Your Data Warehouse with Datafold's AI Powered Migration Agent

27 Oct 2024

Contributed by Lukas

Summary Gleb Mezhanskiy, CEO and co-founder of DataFold, joins Tobias Macey to discuss the challenges and innovations in data migrations. Gleb shares h...

Bring Vector Search And Storage To The Data Lake With Lance

20 Oct 2024

Contributed by Lukas

Summary The rapid growth of generative AI applications has prompted a surge of investment in vector databases. While there are numerous engines availab...

The Role of Python in Shaping the Future of Data Platforms with DLT

13 Oct 2024

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast, Adrian Broderieux and Marcin Rudolph, co-founders of DLT Hub, delve into the principles guidin...

Build Your Data Transformations Faster And Safer With SDF

06 Oct 2024

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast Lukas Schulte, co-founder and CEO of SDF, explores the development and capabilities of this fas...

Scaling Airbyte: Challenges and Milestones on the Road to 1.0

23 Sep 2024

Contributed by Lukas

Summary Airbyte is one of the most prominent platforms for data movement. Over the past 4 years they have invested heavily in solutions for scaling the...

Enhancing Data Accessibility and Governance with Gravitino

01 Sep 2024

Contributed by Lukas

Summary As data architectures become more elaborate and the number of applications of data increases, it becomes increasingly challenging to locate and...

The Evolution of DataOps: Insights from DataKitchen's CEO

04 Aug 2024

Contributed by Lukas

Summary In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Chris Berg, CEO of DataKitchen, to discuss his ongoing mission...

Achieving Data Reliability: The Role of Data Contracts in Modern Data Management

28 Jul 2024

Contributed by Lukas

Summary Data contracts are both an enforcement mechanism for data quality, and a promise to downstream consumers. In this episode Tom Baeyens returns t...

How Generative AI Is Impacting Data Engineering Teams

21 Jul 2024

Contributed by Lukas

Summary Generative AI has rapidly gained adoption for numerous use cases. To support those applications, organizational data platforms need to add new ...

The Role of Product Managers in Data-Centric Organizations

13 Jul 2024

Contributed by Lukas

Summary In this episode Praveen Gujar, Director of Product at LinkedIn, talks about the intricacies of product management for data and analytical platf...

Neon: A Serverless And Developer Friendly Postgres

08 Jul 2024

Contributed by Lukas

Summary Postgres is one of the most widely respected and liked database engines ever. To make it even easier for developers to use, Nikita Shamg...

Improve Data Quality Through Engineering Rigor And Business Engagement With Synq

30 Jun 2024

Contributed by Lukas

Summary This episode features an insightful conversation with Petr Janda, the CEO and founder of Synq. Petr shares his journey from being an engineer t...

Stitching Together Enterprise Analytics With Microsoft Fabric

23 Jun 2024

Contributed by Lukas

Summary Data lakehouse architectures have been gaining significant adoption. To accelerate adoption in the enterprise Microsoft has created the Fab...

Being Data Driven At Stripe With Trino And Iceberg

16 Jun 2024

Contributed by Lukas

Summary Stripe is a company that relies on data to power their products and business. To support that functionality they have invested in Trino and...

X-Ray Vision For Your Flink Stream Processing With Datorios

09 Jun 2024

Contributed by Lukas

Summary Streaming data processing enables new categories of data products and analytics. Unfortunately, reasoning about stream processing engines i...

Practical First Steps In Data Governance For Long Term Success

02 Jun 2024

Contributed by Lukas

Summary Modern businesses aspire to be data driven, and technologists enjoy working through the challenge of building data systems to support that ...

Data Migration Strategies For Large Scale Systems

27 May 2024

Contributed by Lukas

Summary Any software system that survives long enough will require some form of migration or evolution. When that system is responsible for the dat...

Zenlytic Is Building You A Better Coworker With AI Agents

19 May 2024

Contributed by Lukas

Summary The purpose of business intelligence systems is to allow anyone in the business to access and decode data to help them make informed decisi...

Release Management For Data Platform Services And Logic

12 May 2024

Contributed by Lukas

Summary Building a data platform is a substantial engineering endeavor. Once it is running, the next challenge is figuring out how to address rele...

Barking Up The Wrong GPTree: Building Better AI With A Cognitive Approach

05 May 2024

Contributed by Lukas

Summary Artificial intelligence has dominated the headlines for several months due to the successes of large language models. This has prompted numerou...

Build Your Second Brain One Piece At A Time

28 Apr 2024

Contributed by Lukas

Summary Generative AI promises to accelerate the productivity of human collaborators. Currently the primary way of working with these tools is through ...

Making Email Better With AI At Shortwave

21 Apr 2024

Contributed by Lukas

Summary Generative AI has rapidly transformed everything in the technology sector. When Andrew Lee started work on Shortwave he was focused on maki...

Designing A Non-Relational Database Engine

14 Apr 2024

Contributed by Lukas

Summary Databases come in a variety of formats for different use cases. The default association with the term "database" is relational en...

Establish A Single Source Of Truth For Your Data Consumers With A Semantic Layer

07 Apr 2024

Contributed by Lukas

Summary Maintaining a single source of truth for your data is the biggest challenge in data engineering. Different roles and tasks in the business ...

Adding Anomaly Detection And Observability To Your dbt Projects Is Elementary

31 Mar 2024

Contributed by Lukas

Summary Working with data is a complicated process, with numerous chances for something to go wrong. Identifying and accounting for those errors is...

Ship Smarter Not Harder With Declarative And Collaborative Data Orchestration On Dagster+

24 Mar 2024

Contributed by Lukas

Summary A core differentiator of Dagster in the ecosystem of data orchestration is their focus on software defined assets as a means of building de...

Reconciling The Data In Your Databases With Datafold

17 Mar 2024

Contributed by Lukas

Summary A significant portion of data workflows involve storing and processing information in database engines. Validating that the information is ...

Version Your Data Lakehouse Like Your Software With Nessie

10 Mar 2024

Contributed by Lukas

Summary Data lakehouse architectures are gaining popularity due to the flexibility and cost effectiveness that they offer. The link that bridges th...

When And How To Conduct An AI Program

03 Mar 2024

Contributed by Lukas

Summary Artificial intelligence technologies promise to revolutionize business and produce new sources of value. In order to make those promises a ...

Find Out About The Technology Behind The Latest PFAD In Analytical Database Development

25 Feb 2024

Contributed by Lukas

Summary Building a database engine requires a substantial amount of engineering effort and time investment. Over the decades of research and develo...

Using Trino And Iceberg As The Foundation Of Your Data Lakehouse

18 Feb 2024

Contributed by Lukas

Summary A data lakehouse is intended to combine the benefits of data lakes (cost effective, scalable storage and compute) and data warehouses (user...

Data Sharing Across Business And Platform Boundaries

11 Feb 2024

Contributed by Lukas

Summary Sharing data is a simple concept, but complicated to implement well. There are numerous business rules and regulatory concerns that need to...

Tackling Real Time Streaming Data With SQL Using RisingWave

04 Feb 2024

Contributed by Lukas

Summary Stream processing systems have long been built with a code-first design, adding SQL as a layer on top of the existing framework. RisingWave...

Build A Data Lake For Your Security Logs With Scanner

29 Jan 2024

Contributed by Lukas

Summary Monitoring and auditing IT systems for security events requires the ability to quickly analyze massive volumes of unstructured log data. Th...

Modern Customer Data Platform Principles

22 Jan 2024

Contributed by Lukas

Summary Databases and analytics architectures have gone through several generational shifts. A substantial amount of the data that is being managed...

Pushing The Limits Of Scalability And User Experience For Data Processing With Jignesh Patel

07 Jan 2024

Contributed by Lukas

Summary Data processing technologies have dramatically improved in their sophistication and raw throughput. Unfortunately, the volumes of data that...

Designing Data Platforms For Fintech Companies

01 Jan 2024

Contributed by Lukas

Summary Working with financial data requires a high degree of rigor due to the numerous regulations and the risks involved in security breaches. In...

Troubleshooting Kafka In Production

24 Dec 2023

Contributed by Lukas

Summary Kafka has become a ubiquitous technology, offering a simple method for coordinating events and data across different systems. Operating it ...

Adding An Easy Mode For The Modern Data Stack With 5X

18 Dec 2023

Contributed by Lukas

Summary The "modern data stack" promised a scalable, composable data platform that gave everyone the flexibility to use the best tools fo...

Run Your Own Anomaly Detection For Your Critical Business Metrics With Anomstack

11 Dec 2023

Contributed by Lukas

Summary If your business metrics looked weird tomorrow, would you know about it first? Anomaly detection is focused on identifying those outliers f...

Designing Data Transfer Systems That Scale

04 Dec 2023

Contributed by Lukas

Summary The first step of data pipelines is to move the data to a place where you can process and prepare it for its eventual purpose. Data transfe...

Addressing The Challenges Of Component Integration In Data Platform Architectures

27 Nov 2023

Contributed by Lukas

Summary Building a data platform that is enjoyable and accessible for all of its end users is a substantial challenge. One of the core complexities...

Unlocking Your dbt Projects With Practical Advice For Practitioners

20 Nov 2023

Contributed by Lukas

Summary The dbt project has become overwhelmingly popular across analytics and data engineering teams. While it is easy to adopt, there are many po...

Enhancing The Abilities Of Software Engineers With Generative AI At Tabnine

13 Nov 2023

Contributed by Lukas

Summary Software development involves an interesting balance of creativity and repetition of patterns. Generative AI has accelerated the ability of...

Shining Some Light In The Black Box Of PostgreSQL Performance

06 Nov 2023

Contributed by Lukas

Summary Databases are the core of most applications, but they are often treated as inscrutable black boxes. When an application is slow, there is a...

Surveying The Market Of Database Products

30 Oct 2023

Contributed by Lukas

Summary Databases are the core of most applications, whether transactional or analytical. In recent years the selection of database products has ex...

Defining A Strategy For Your Data Products

23 Oct 2023

Contributed by Lukas

Summary The primary application of data has moved beyond analytics. With the broader audience comes the need to present data in a more approachable...

Reducing The Barrier To Entry For Building Stream Processing Applications With Decodable

15 Oct 2023

Contributed by Lukas

Summary Building streaming applications has gotten substantially easier over the past several years. Despite this, it is still operationally challe...

Using Data To Illuminate The Intentionally Opaque Insurance Industry

09 Oct 2023

Contributed by Lukas

Summary The insurance industry is notoriously opaque and hard to navigate. Max Cho found that fact frustrating enough that he decided to build a bu...

Building ETL Pipelines With Generative AI

01 Oct 2023

Contributed by Lukas

Summary Artificial intelligence applications require substantial high quality data, which is provided through ETL pipelines. Now that AI has reache...
