Platform Engineering Playbook Podcast
Episodes
The Hidden Kubernetes Tax Costing Teams $43,800 a Year
30 Mar 2026
Contributed by Lukas
**Is your company bleeding $43,800 annually on hidden Kubernetes costs?** Most platform teams have no idea they're paying this "invisible tax" – but...
Your Kubernetes Stack Is Why AI Isn’t Shipping
27 Mar 2026
Contributed by Lukas
**Why do 87% of AI models never reach production? It's not the AI - it's the infrastructure underneath.** In this deep dive episode of Platform Engine...
AI Agents in Kubernetes Need Standards — Before Everything Breaks
26 Mar 2026
Contributed by Lukas
**What happens when AI agents in your Kubernetes cluster start making their own scaling decisions without proper guardrails?** In this episode of Plat...
AI Agents Are About to Break Kubernetes — Unless We Standardize Now
25 Mar 2026
Contributed by Lukas
What happens when hundreds of AI agents start running in your Kubernetes cluster but can't communicate with each other? By 2026, this isn't a hypothet...
How to Monitor LLMs in Production Before They Drain Your Budget
24 Mar 2026
Contributed by Lukas
**Are you burning through your LLM budget with zero visibility into why?** You're not alone - 73% of production deployments are facing this exact prob...
Helm Security Is Broken. WebAssembly Fixes It.
23 Mar 2026
Contributed by Lukas
**What if 94% of Helm chart vulnerabilities could be prevented with one unexpected technology?** Today's Platform Engineering Playbook dives deep into...
The Kubernetes AI Pattern That Cuts GPU Costs
20 Mar 2026
Contributed by Lukas
**87% of AI workloads are sitting idle on GPUs right now** - yet companies keep buying more hardware. What if the problem isn't capacity, but how we'r...
You’re Monitoring the Wrong Kubernetes Metrics
19 Mar 2026
Contributed by Lukas
**Are 73% of Kubernetes clusters really flying blind?** According to recent industry reports, most K8s deployments are drowning in meaningless metrics...
The AI Security Hole Your Red Team Is Missing
18 Mar 2026
Contributed by Lukas
**87% of enterprise AI deployments have a critical security vulnerability that red teams aren't even testing for.** Are you one of them? In today's Pl...
Your Kubernetes Monitoring Is Blind to AI Attacks
17 Mar 2026
Contributed by Lukas
**Is your Kubernetes cluster blind to AI model poisoning attacks?** 73% of companies running AI workloads can't detect when their models are compromis...
The 6 Types of AI Cloud Infrastructure
16 Mar 2026
Contributed by Lukas
**87% of AI companies are burning cash on the wrong cloud infrastructure - and they have no idea.** In this episode of Platform Engineering Playbook, ...
Why AI Code Is Killing Your Monitoring Budget
13 Mar 2026
Contributed by Lukas
**Is your monitoring bill about to explode? AI-generated code is creating 10x more observability data than human-written code.** In this deep dive epi...
How Karpenter Fixes Kubernetes Autoscaling
12 Mar 2026
Contributed by Lukas
**Are you throwing money away on Kubernetes compute costs?** 87% of clusters waste up to half their resources on idle nodes - but there's a solution t...
AI Is Not the Problem — Your Infrastructure Is
11 Mar 2026
Contributed by Lukas
**Why do 70% of AI projects crash and burn before they ever see production?** Spoiler alert: it's not the AI that's broken. In today's Platform Engine...
Why Kubernetes Doesn’t Scale Without an IDP
10 Mar 2026
Contributed by Lukas
**Why do 97% of companies using Kubernetes never scale beyond their original expert team?** It's not a skills problem - it's an architecture problem t...
The AWS Cost That Doesn’t Show Up in Cost Explorer
09 Mar 2026
Contributed by Lukas
**What if your AWS bill has a hidden line item costing you thousands that doesn't even show up in Cost Explorer?** Today on Platform Engineering Playb...
87% of Ansible Playbooks Are Broken (AI Just Proved It)
06 Mar 2026
Contributed by Lukas
**87% of production Ansible playbooks have critical flaws - but AI just revealed how to fix them.** Today's Platform Engineering Playbook dives deep i...
GrafanaCON 2026: The Agenda That Signals the Future of Observability
05 Mar 2026
Contributed by Lukas
GrafanaCON 2026 just dropped their agenda, and every attendee will build an AI agent from scratch on day one. What does this tell us about the futur...
Can AI Run Your Production Systems?
04 Mar 2026
Contributed by Lukas
What if your observability stack could debug and fix production issues while you sleep? That future might be closer than you think. In today's Platfor...
Claude Went Down. The API Didn’t. Here’s Why.
03 Mar 2026
Contributed by Lukas
What happens when a major AI platform goes dark while secretly pursuing billion-dollar government contracts? Claude's massive outage reveals critical ...
Backstage Is Becoming the Control Plane for Engineering
02 Mar 2026
Contributed by Lukas
**What if Spotify's secret weapon for managing 2,800 microservices could transform your entire platform engineering strategy?** Today's Platform Engin...
The End of ingress-nginx: Kubernetes Migration Guide Before 2026
27 Feb 2026
Contributed by Lukas
**70% of Kubernetes clusters will go dark in March 2026 when ingress-nginx support officially ends. Are you ready?** Today's Platform Engineering Play...
Claude Code Remote Control Changes Developer Workflows
26 Feb 2026
Contributed by Lukas
**What if 87% of developer productivity loss just became a thing of the past?** Anthropic's Claude Computer Use capability is reshaping how platform...
Databricks Lakebase vs Postgres: The AI Database Shift
25 Feb 2026
Contributed by Lukas
**Is PostgreSQL really obsolete for AI workloads?** Databricks just dropped Lakebase and it's shaking up everything we thought we knew about database ...
How to Secure AI Agents with MCP, OPA & Ephemeral Runners
24 Feb 2026
Contributed by Lukas
**Your AI agents have root access to your infrastructure right now - and you don't even know it.** What happens when we give AI agents the keys to our...
Cloudflare Takes Down the Internet Again — With a Config Change
23 Feb 2026
Contributed by Lukas
**What happens when a single configuration change takes down 20% of the internet for six hours?** In this episode of Platform Engineering Playbook, we...
The Next Platform Engineer: AI + Observability + FinOps
20 Feb 2026
Contributed by Lukas
**Is AI about to revolutionize how we build infrastructure? The CNCF CTO says we're not prepared for what's coming.** In this episode of Platform Engi...
Ray + Kubernetes: The Production AI Stack Explained
19 Feb 2026
Contributed by Lukas
**Why do 92% of ML models never reach production?** It's not a code problem—it's a platform engineering problem. In today's episode of Platform Engi...
Replace 5 Databases with 1? SurrealDB for AI Agents Explained
18 Feb 2026
Contributed by Lukas
Your AI agents are using five different databases right now - and you don't even know it. This database sprawl is silently killing your platform's per...
Agoda’s API Agent Turns Any API into MCP — No Code, No Deployments
17 Feb 2026
Contributed by Lukas
**What if API integration nightmares could disappear without writing a single line of code?** Agoda just dropped a game-changing solution that transfo...
LocalStack Kills Community Edition: What Breaks in March
16 Feb 2026
Contributed by Lukas
**LocalStack just killed their open-source edition - but what does this really mean for your platform engineering stack?** In today's episode of Platf...
OpenTofu vs Terraform: What Enterprise Teams Are Actually Doing (2026)
13 Feb 2026
Contributed by Lukas
**Is your infrastructure strategy about to become obsolete?** By 2025, half of all Terraform installations could be running OpenTofu - and the implica...
Why Databases Inside Kubernetes Are Becoming Technical Debt
12 Feb 2026
Contributed by Lukas
**Is running databases in Kubernetes about to become legacy technical debt overnight?** By 2026, the inference cloud revolution is forcing platform en...
47% of CNCF Projects Slowed Down in 2025 — Why That’s Actually Good News
11 Feb 2026
Contributed by Lukas
**Why did 47% of CNCF projects slow down their development velocity in 2025 — and why platform engineers should celebrate this trend?** In today's P...
The Claude Skills That Stop AI From Writing Dangerous Infrastructure as Code
10 Feb 2026
Contributed by Lukas
**Are 87% of DevOps teams unknowingly creating security vulnerabilities with AI-generated infrastructure code?** Today's Platform Engineering Playbook...
Docker vs Nix: Why Your Builds Aren’t Actually Reproducible
09 Feb 2026
Contributed by Lukas
97% of Docker containers can't reproduce the exact same build six months later—what does this mean for platform engineering, and why should you care...
The Data Canary Pattern: How Netflix Prevents Bad Metadata Deploys
07 Feb 2026
Contributed by Lukas
**What happens when 2 billion daily metadata events could crash Netflix's entire platform with one bad transformation?** Today's Platform Engineering ...
Claude Opus 4.6: The First AI That Feels Like a Teammate
06 Feb 2026
Contributed by Lukas
**Claude Opus 4.6 just demolished GPT-4 on every coding benchmark - and it's about to reshape how we think about platform engineering automation.** In...
Autonomous AI in DevOps Is Here — And Most Teams Are Doing It Wrong
05 Feb 2026
Contributed by Lukas
**Will 87% of DevOps teams really be obsolete by 2026?** As AI agents take control of production infrastructure, we're witnessing the biggest transfor...
Kubernetes Is Retiring Ingress NGINX (And 50% of Clusters Aren’t Ready)
04 Feb 2026
Contributed by Lukas
90% of Kubernetes clusters are running Ingress NGINX, abandoned in 16 months with zero maintainers left! What does this mean for your production sys...
OpenAI’s New macOS App: Is Agentic Coding Finally Here?
03 Feb 2026
Contributed by Lukas
**OpenAI just made 73% of coding assistants obsolete overnight - but what does this mean for platform engineers?** Today's episode breaks down OpenAI'...
98% of Container CVEs Are Hiding Where You’re Not Scanning
02 Feb 2026
Contributed by Lukas
**Are your container security scans missing 98% of critical vulnerabilities?** New research from Chainguard reveals a shocking blind spot that could b...
Why Forward-Deployed Engineers Are Making $300K+ (And Why Companies Are Desperate for Them)
31 Jan 2026
Contributed by Lukas
Why are forward-deployed engineers making 40% more than traditional backend developers, and why can't companies hire enough of them? In today's Platfo...
AWS DevOps Agent in Production: What Most Teams Get Wrong
30 Jan 2026
Contributed by Lukas
**Why do 73% of AWS DevOps Agent deployments crash and burn in their first week?** It's not what you think. In this episode of Platform Engineering Pl...
AI Agents Are Rewriting the SRE Playbook (For Better or Worse)
29 Jan 2026
Contributed by Lukas
What if AI agents could flip the script on SRE work, turning 87% of firefighting into 87% prevention? That's exactly what's happening in the "agentic ...
DevOps Is Dead — Platform Engineering Replaced It
28 Jan 2026
Contributed by Lukas
**DevOps is dead - and the companies that created it are the ones pulling the trigger.** But what's replacing it might be the most significant shift i...
47 Countries Went Offline — What Platform Engineers Must Learn From It
27 Jan 2026
Contributed by Lukas
**What happens when 47 countries lose internet access in just 3 months—and it's not cyberattacks?** Today's Platform Engineering Playbook dives deep...
Two Missing Characters Nearly Compromised AWS’s Supply Chain
26 Jan 2026
Contributed by Lukas
**What if two missing characters could compromise every AWS-managed GitHub repository?** That's exactly what happened in a critical regex vulnerabilit...
Kubernetes Just Became Essential for AI Growth (CNCF Report)
25 Jan 2026
Contributed by Lukas
**Why will 90% of AI workloads fail without Kubernetes in the next 18 months?** Most platform teams are walking into a disaster they can't see coming....
ChatGPT Scales PostgreSQL to power 800 million users
24 Jan 2026
Contributed by Lukas
OpenAI is running ChatGPT for ~800 million users on PostgreSQL — and according to their own disclosures, it’s actually working. In this episode...
3 Skills You Need to Transition to Platform Engineer
23 Jan 2026
Contributed by Lukas
**Will 70% of DevOps engineers disappear in the next 5 years?** That's the bold prediction kicking off today's deep dive into the massive career shift...
The Infrastructure Monitoring Tools Teams Regret Choosing
22 Jan 2026
Contributed by Lukas
The monitoring tool everyone trusts is actually blind to 40% of your infrastructure failures—and the vendor knows it. Are you using an industry stan...
Your CI/CD Pipeline is a Debt Trap
21 Jan 2026
Contributed by Lukas
**73% of engineering teams are drowning in technical debt because of their CI/CD pipelines. Not despite them—because of them.** Are your automation ...
Kubernetes Just Revolutionized Learning — Get Ahead Now!
20 Jan 2026
Contributed by Lukas
**Are major tech companies secretly abandoning Kubernetes certifications?** What we discovered about the future of K8s learning will change how you ap...
How AWS's New Euro Cloud Changes Data Control Forever
19 Jan 2026
Contributed by Lukas
92% of European companies don’t trust US cloud providers with their data anymore. So, AWS just locked itself out of its own Euro Cloud! This shocki...
Why Pulumi's New Move Could Change Terraform Forever
18 Jan 2026
Contributed by Lukas
Terraform’s biggest competitor just made a move that could redefine infrastructure-as-code in 2026. Pulumi now runs Terraform and HCL natively—bet...
Astro Joins Cloudflare: What It Means for Platform Engineers
17 Jan 2026
Contributed by Lukas
Cloudflare acquires the Astro Technology Company, adding a 1M-downloads-per-week web framework to their edge platform. We analyze the strategic implic...
ScyllaDB X Cloud Challenges DynamoDB Cost and Performance
16 Jan 2026
Contributed by Lukas
ScyllaDB just launched X Cloud with claims of double the performance at half the cost compared to DynamoDB. This episode breaks down the technical arc...
Invisible Linux Malware: The Undetectable Threat to Your Cloud Infrastructure
15 Jan 2026
Contributed by Lukas
Your Linux servers aren't just running containers anymore—they're hosting invisible tenants that security teams can't even detect. In this episode, ...
The AI-Cloud Native Symbiosis - How Intelligent Infrastructure is Transforming Platform Engineering
14 Jan 2026
Contributed by Lukas
By 2025, 90% of new enterprise applications will be AI-powered and cloud-native. This episode explores the symbiotic relationship between AI and Kuber...
MIT 10 Breakthrough Technologies 2026 - The Platform Engineering Perspective
13 Jan 2026
Contributed by Lukas
MIT just released their 10 Breakthrough Technologies for 2026 - and three of them are infrastructure problems that platform engineers are solving righ...
AWS Route 53 Global Resolver - Enterprise DNS Security at the Edge
12 Jan 2026
Contributed by Lukas
Every DNS query your hybrid environment makes could be exposing sensitive data. AWS Route 53 Global Resolver, announced at re:Invent 2025, combines an...
Kubernetes Upcoming Features Deep Dive - Extended Toleration Operators and Mutable PV Node Affinity
11 Jan 2026
Contributed by Lukas
There's a Kubernetes cluster out there right now burning ten thousand dollars a month on GPU nodes that sit idle sixty percent of the time. Why? Becau...
Why Is a 2016 AWS Instance Still the Best Value? (Cloudspecs Research)
10 Jan 2026
Contributed by Lukas
New research from TUM reveals uncomfortable truths about cloud hardware stagnation. The paper "Cloudspecs: Cloud Hardware Evolution Through the Lookin...
Iran IPv6 Blackout - When Governments Weaponize Protocol Transitions
09 Jan 2026
Contributed by Lukas
The same IPv6 transition your infrastructure team has been procrastinating on is now being weaponized by governments. On January 8, 2026, Iran's IPv6 ...
Venezuela BGP Anomaly - Deep Technical Analysis
08 Jan 2026
Contributed by Lukas
A deep technical dive into the January 2026 Venezuela BGP route leak incident. Was it a cyberattack? The technical evidence says no - and that's actua...
HolmesGPT: AI Root Cause Analysis for Kubernetes
08 Jan 2026
Contributed by Lukas
Deep dive into HolmesGPT, the CNCF Sandbox AI agent that revolutionizes cloud-native troubleshooting. This episode covers what it is, its 40+ integrat...
Docker Kanvas: Infrastructure as Design
07 Jan 2026
Contributed by Lukas
Docker just launched Kanvas, a visual tool that turns your architecture diagrams into deployable infrastructure. Built on Meshery (CNCF's 6th highest-...
Remote MCP Architecture - Running AI Tool Servers on Kubernetes
06 Jan 2026
Contributed by Lukas
The MCP server registry hit 10,000+ integrations, but most teams are running these servers on laptops. This episode breaks down the production archite...
AWS DevOps Agent - Promises vs Reality
05 Jan 2026
Contributed by Lukas
AWS launched DevOps Agent at re:Invent 2025 as an "autonomous on-call engineer." But before you cancel your PagerDuty subscription, we separate market...
AWS Graviton5: 192 Cores, 5x Cache - ARM Takes Over the Data Center
04 Jan 2026
Contributed by Lukas
AWS doubled the core count on their flagship ARM processors with Graviton5—192 cores in a single socket, 5x L3 cache (180MB), and 3nm fabrication. W...
Can OpenTelemetry Save Observability in 2026?
03 Jan 2026
Contributed by Lukas
OpenTelemetry has won the instrumentation wars with 95% adoption predicted for 2026. But winning data collection doesn't solve observability's real pr...
When Serverless Fails: Unkey's 6x Performance Migration to Containers
02 Jan 2026
Contributed by Lukas
Why did an API key management platform abandon edge serverless for stateful containers? Unkey hit 30ms p99 cache latency when they needed sub-10ms—s...
From Alert Fatigue to Signal-Driven Ops: The Observability Shift
01 Jan 2026
Contributed by Lukas
Why do 73% of organizations experience outages from alerts they ignored? This episode breaks down the technical shift from reactive thresholds to SLO-...
Security Ops Specialty: The Underrated Skill Every Platform Engineer Needs in 2026
31 Dec 2025
Contributed by Lukas
Platform engineers who understand security operations—secrets management, vulnerability scanning, and compliance automation—are commanding premium...
Agentic AI Foundation - MCP and the Future of AI-Native Platform Engineering
30 Dec 2025
Contributed by Lukas
The Linux Foundation announced the Agentic AI Foundation (AAIF) on December 9, 2025, bringing together AWS, Anthropic, Google, Microsoft, OpenAI, Bloc...
FinOps 2026 for Platform Engineers: The Complete Skills Guide
29 Dec 2025
Contributed by Lukas
FinOps is becoming an essential skill for platform engineers in 2026. This episode provides a complete guide to the skills, certifications, and tools ...
Platform Engineering Salary Report 2026: Skills That Pay
28 Dec 2025
Contributed by Lukas
Platform engineers are commanding $172K-$207K in 2026, a 13-27% premium over DevOps roles. This episode breaks down salary benchmarks from Dice, Motio...
Platform Engineering 2026 Predictions Roundup (Platform Engineering 2026 Look Forward Series - Part 5/5)
27 Dec 2025
Contributed by Lukas
The series finale of our five-part Platform Engineering 2026 Look Forward Series. We synthesize everything from agentic AI operations, mainstream adop...
Kubernetes Enters the Boring Era (Platform Engineering 2026 Look Forward Series - Part 4/5)
26 Dec 2025
Contributed by Lukas
The best thing happening to Kubernetes in 2026 is that it's becoming boring. After a decade of explosive innovation, Kubernetes is entering its "matur...
Developer Experience Metrics Beyond DORA (Platform Engineering 2026 Look Forward Series - Part 3/5)
24 Dec 2025
Contributed by Lukas
DORA metrics revolutionized how we measure DevOps performance, but they have a critical blind spot: they tell you how your delivery pipeline is perfor...
Platform Engineering Goes Mainstream in 2026 (Platform Engineering 2026 Look Forward Series - Part 2/5)
23 Dec 2025
Contributed by Lukas
Episode 2 of our 5-part "Platform Engineering 2026 Look Forward Series" examines the macro trend: platform engineering crossing the chasm to mainstrea...
Agentic AI Transforms Platform Operations in 2026 (Platform Engineering 2026 Look Forward Series - Part 1/5)
22 Dec 2025
Contributed by Lukas
Episode 1 of our 5-part "Platform Engineering 2026 Look Forward Series" tackles the hottest debate in platform engineering: will AI agents replace us ...
CNPE (Certified Cloud Native Platform Engineer) Certification Study Guide
21 Dec 2025
Contributed by Lukas
The CNPE (Certified Cloud Native Platform Engineer) exam launched November 11, 2025 at KubeCon Atlanta, becoming the first hands-on platform engineeri...
Kubernetes 1.35 Timbernetes Deep Dive: Breaking Changes, In-Place Resize GA, Gang Scheduling
20 Dec 2025
Contributed by Lukas
Kubernetes 1.35 "Timbernetes" dropped on December 17, 2025, fundamentally changing how we operate clusters. This deep dive covers the 60 enhancements,...
Terraform Stacks + Native Monorepo Support: HashiCorp's Answer to IaC Complexity
20 Dec 2025
Contributed by Lukas
No more copy-paste configs. No more manual state management. Terraform just went component-based. HashiCorp released native monorepo support and Terra...
95% Fewer CVEs, $0 Cost: Docker Just Open-Sourced Enterprise Security
19 Dec 2025
Contributed by Lukas
Supply chain attacks cost $60 billion in 2025. Docker just made the solution free. On December 17, Docker released 1,000+ hardened container images un...
Kubernetes 1.35 "Timbernetes" - The End of the Pod Restart Era
18 Dec 2025
Contributed by Lukas
Kubernetes 1.35 is here, and it changes everything about pod lifecycle management. In this episode, we break down the release that finally lets you sc...
40,000x Fewer Deployment Failures: How Netflix Adopted Temporal
17 Dec 2025
Contributed by Lukas
Netflix reduced their deployment failures by 40,000x using Temporal. In this episode, we break down how they achieved this remarkable improvement and ...
Kubernetes: Helm vs Crossplane vs kro (Honest Comparison)
16 Dec 2025
Contributed by Lukas
48% of Kubernetes users struggle with tool choice. That's nearly half of us paralyzed by options. So when AWS adopted kro alongside Argo CD, we had to...
Platform Engineering 2025 Year in Review
15 Dec 2025
Contributed by Lukas
2025 was the year platform engineering grew up—and got a reality check. AI entered infrastructure in ways we couldn't ignore. Industry consensus fin...
Okta's GitOps Journey - Scaling ArgoCD from 12 to 1,000 Clusters
14 Dec 2025
Contributed by Lukas
In five years, Okta scaled Auth0's private cloud from 12 to 1,000+ Kubernetes clusters using ArgoCD. At KubeCon 2025, engineers Jérémy Albuixech and...
Platform Engineering Team Structures That Work
13 Dec 2025
Contributed by Lukas
Ninety percent of organizations now have platform teams, but most just renamed their ops team and expected different results. This episode breaks down...
CDKTF Deprecated - The End of HashiCorp's Programmatic IaC Experiment
12 Dec 2025
Contributed by Lukas
HashiCorp (now IBM) has officially archived the CDK for Terraform project, ending a five-year experiment in programmatic infrastructure-as-code. Full ...
stern v1.33.1 - Listen to the Docs with AudioDocs
11 Dec 2025
Contributed by Lukas
🎧 AUDIODOCS: Official documentation of popular open-source projects, adapted and narrated for audio. Learn while commuting, exercising, or doing ch...
CoreDNS v1.13.1 - Listen to the Docs with AudioDocs
11 Dec 2025
Contributed by Lukas
🎧 AUDIODOCS: Official documentation of popular open-source projects, adapted and narrated for audio. Learn while commuting, exercising, or doing ch...
kubectx & kubens v0.9.5 - Listen to the Docs with AudioDocs
11 Dec 2025
Contributed by Lukas
🎧 AUDIODOCS: Official documentation of popular open-source projects, adapted and narrated for audio. Learn while commuting, exercising, or doing ch...
AWS re:Invent 2025 Recap 4/4 - Data & AI Wrap-Up
11 Dec 2025
Contributed by Lukas
Part 4 of 4 in our AWS re:Invent 2025 series (finale). The data and AI services that tie everything together. S3 Tables with Apache Iceberg hits GA wi...
AWS re:Invent 2025 Recap Part 3/4 - EKS & Cloud Operations
10 Dec 2025
Contributed by Lukas
Part 3 of our AWS re:Invent 2025 series. AWS transforms Kubernetes into an AI infrastructure platform with massive scale and AI-native operations. In ...
AWS re:Invent 2025 Part 2/4 - Infrastructure & Developer Experience
09 Dec 2025
Contributed by Lukas
AWS re:Invent 2025 Series (Part 2 of 4) AWS announces Graviton5 with 192 cores (3x previous gen) and 40% better price-performance vs x86. Trainium 3 d...