
DataTopics: All Things Data, AI & Tech

#83 Who’s Minding the Metadata? Why Data Quality Matters in GenAI (Quality Time With Paolo)

11 Apr 2025

Description

Welcome to the cozy corner of the tech world where ones and zeros mingle with casual chit-chat. Datatopics Unplugged is your go-to spot for relaxed discussions around tech, news, data, and society.

Dive into conversations that should flow as smoothly as your morning coffee (but don't), where industry insights meet laid-back banter. Whether you're a data aficionado or just someone curious about the digital age, pull up a chair, relax, and let's get into the heart of data, unplugged style!

In this episode, host Murilo is joined by returning guest Paolo, Data Management Team Lead at dataroots, for a deep dive into the often-overlooked but rapidly evolving domain of unstructured data quality. Tune in for a field guide to navigating documents, images, and embeddings without losing your sanity.

What we unpack:

Data management basics: Metadata, ownership, and why Excel isn’t everything.
Structured vs unstructured data: How the wild west of PDFs, images, and audio is redefining quality.
Data quality challenges for LLMs: From apples and pears to rogue chatbots with “legally binding” hallucinations.
Practical checks for document hygiene: Versioning, ownership, embedding similarity, and tagging strategies.
Retrieval-Augmented Generation (RAG): When ChatGPT meets your HR policies and things get weird.
Monitoring and governance: Building systems that flag rot before your chatbot gives out 2017 vacation rules.
Tooling and gaps: Where open source is doing well, and where we’re still duct-taping workflows.
Real-world inspirations: A look at how QuantumBlack (McKinsey) is tackling similar issues with their AI for DQ framework.
