Platform Engineering Playbook Podcast

Technology

Episodes

Showing 101-149 of 149

AWS re:Invent 2025 Part 1/4 - The Agentic AI Revolution

08 Dec 2025

Contributed by Lukas

AWS announces autonomous AI agents that can work for days without human intervention. The DevOps Agent is an always-on incident responder. The Securit...

Developer Experience Metrics Beyond DORA

07 Dec 2025

Contributed by Lukas

DORA metrics revolutionized how we measure DevOps performance, but are we missing the bigger picture? This episode explains DORA from the ground up—...

Cloudflare's Trust Crisis - December 2025 Outage and the Human Cost

06 Dec 2025

Contributed by Lukas

Three weeks after their worst outage since 2019, Cloudflare went down again. On December 5, 2025, a Lua code bug took down 28% of HTTP traffic for 25 ...

Cloud Cost Quick Wins for Year-End

05 Dec 2025

Contributed by Lukas

Global cloud spend hits $720 billion in 2025—and organizations waste 20-30% on unused resources. Year-end is the perfect time to show savings before...

Platform Engineering vs DevOps vs SRE - The Identity Crisis

04 Dec 2025

Contributed by Lukas

Platform Engineer roles pay 20% more than DevOps Engineer roles, but job descriptions are 90% identical. Is Platform Engineering just DevOps with bett...

Platform Engineering Certification Tier List 2025

03 Dec 2025

Contributed by Lukas

Are certifications worth it? The answer is: it depends. And that's precisely the problem. In this episode, Jordan and Alex rank 25+ certifications usi...

Kubernetes AI Conformance - The End of AI Infrastructure Chaos

02 Dec 2025

Contributed by Lukas

The Wild West of AI infrastructure just ended. CNCF launched the Certified Kubernetes AI Conformance Program at KubeCon Atlanta on November 11, 2025. ...

Helm 4 - The Definitive Guide to the Biggest Update in 6 Years

01 Dec 2025

Contributed by Lukas

Helm 4.0 dropped at KubeCon Atlanta 2025, marking the biggest update in 6 years. Server-Side Apply finally ends the GitOps ownership wars. WASM plugin...

CNPE Certification Guide - The First Platform Engineering Credential

30 Nov 2025

Contributed by Lukas

CNCF just launched the first-ever hands-on platform engineering certification at KubeCon Atlanta 2025. But with beta testers reporting 29% scores, is ...

10 Platform Engineering Anti-Patterns That Kill Developer Productivity

29 Nov 2025

Contributed by Lukas

DORA 2024 found organizations with platform teams saw throughput decrease by 8% and stability decrease by 14%. Wait—isn't platform engineering suppo...

Black Friday War Stories: Lessons from E-Commerce's Worst Days

28 Nov 2025

Contributed by Lukas

Why do major retailers with unlimited budgets still crash on Black Friday? This episode dives into the graveyard of e-commerce outages—from J.Crew's...

Giving Thanks to Your Dependencies: A Platform Engineer's Gratitude Guide

27 Nov 2025

Contributed by Lukas

This Thanksgiving, let's talk about the people you've never thanked. 60% of open source maintainers are unpaid. 60% have left or considered leaving. Y...

KubeCon Atlanta 2025 Part 3: Community at 10 Years - The Sustainability Question

26 Nov 2025

Contributed by Lukas

CNCF celebrates 10 years with 300,000 contributors and 230+ projects—but the hallway track told a different story. 60% of maintainers unpaid. 60% ha...

KubeCon Atlanta 2025 Part 2: Platform Engineering Consensus and Community Reality Check

25 Nov 2025

Contributed by Lukas

After years of "what even IS platform engineering" debates, KubeCon 2025 delivered consensus: three non-negotiable principles, real-world adoption at ...

KubeCon 2025 Part 1: AI Goes Native and the 30K Core Lesson

24 Nov 2025

Contributed by Lukas

Google donates a GPU driver live on stage. OpenAI saves $2.16M/month with one line of code. Kubernetes rollback finally works after 10 years. What cha...

The $4,350/Month GPU Waste Problem: How Kubernetes Architecture Creates Massive Cost Inefficiency

23 Nov 2025

Contributed by Lukas

Your H100 costs $5,000 per month, but you're only using it at 13% capacity—wasting $4,350 monthly per GPU. Analysis of 4,000+ Kubernetes clusters re...

Service Mesh Showdown: Why User-Space Beat eBPF

22 Nov 2025

Contributed by Lukas

Kernel-level eBPF should beat user-space proxies—but Istio Ambient delivers 8% mTLS overhead while Cilium shows 99%. Academic benchmarks reveal why ...

The Terraform vs OpenTofu Debate - Why "Just Switch" Is Bad Advice

21 Nov 2025

Contributed by Lukas

HashiCorp's license change and IBM's $6.4B acquisition created the "you must migrate" narrative—but 70% of teams using Terraform in-house aren't leg...

Agentic DevOps: GitHub Agent HQ and the Autonomous Pipeline Revolution

20 Nov 2025

Contributed by Lukas

GitHub Universe 2025 announced Agent HQ—mission control for orchestrating AI agents from OpenAI, Anthropic, Google, and more. Azure SRE Agent saved ...

Cloudflare Outage November 2025: When a Rust Panic Took Down 20% of the Internet

19 Nov 2025

Contributed by Lukas

A routine database permissions change triggered Cloudflare's worst outage since 2019—taking down ChatGPT, X, Shopify, Discord, and 20% of the intern...

Ingress NGINX Retirement: The March 2026 Migration Deadline

19 Nov 2025

Contributed by Lukas

The de facto standard Kubernetes ingress controller will stop receiving security patches in March 2026—and only 1-2 people have been maintaining it ...

OpenTelemetry eBPF Instrumentation: Zero-Code Observability Under 2% Overhead

18 Nov 2025

Contributed by Lukas

What if you could achieve complete observability coverage—every HTTP request, database query, and gRPC call—without touching application code? Jor...

The Open Source Observability Showdown: When "Free" Costs $12K/Month

17 Nov 2025

Contributed by Lukas

Prometheus is free, Grafana is free, Loki is free—yet Datadog posted $2.3B in revenue and Shopify runs a 15-person team just to manage their observa...

The Kubernetes Complexity Backlash: When Simpler Infrastructure Wins

16 Nov 2025

Contributed by Lukas

Kubernetes commands 92% market share, yet 88% report year-over-year cost increases and 25% plan to shrink deployments. We unpack the 3-5x cost underes...

SRE Reliability Principles: The 26% Problem - Error Budgets, SLOs, Platform Engineering

16 Nov 2025

Contributed by Lukas

Only 26% of organizations actively use SLOs after a decade of Google's SRE principles being gospel. We explore why adoption is so low despite 49% sayi...

Internal Developer Portal Showdown 2025: Backstage vs Port vs Cortex vs OpsLevel

14 Nov 2025

Contributed by Lukas

Your team spent 6 months implementing Backstage. Adoption? 8%. The CFO asks: "Why didn't we buy a solution?" Here's the 2025 comparison with real pric...

DNS for Platform Engineering: The Silent Killer

13 Nov 2025

Contributed by Lukas

Why does a forty-year-old protocol keep taking down billion-dollar infrastructure? The October 2025 AWS outage lasted fifteen hours because of a DNS r...

eBPF in Kubernetes: Kernel-Level Superpowers Without the Risk

12 Nov 2025

Contributed by Lukas

Your Kubernetes cluster is a black box—Prometheus shows symptoms, not causes. eBPF turns the Linux kernel into a programmable platform for observabi...

Time Series Language Models

11 Nov 2025

Contributed by Lukas

AI models that can read your infrastructure metrics like language, explain anomalies in plain English, and predict failures without training on your d...

Kubernetes IaC & GitOps - The Workflow Paradox

11 Nov 2025

Contributed by Lukas

77% of organizations have adopted GitOps, 60% run ArgoCD—yet platform teams are still bottlenecks and deployments still take days. Jordan and Alex i...

The FinOps AI Paradox: Why Smart Tools Don't Cut Costs (And What Actually Does)

09 Nov 2025

Contributed by Lukas

Your company spent $500K on AI-powered FinOps tools. The AI identified $3M in potential savings. Ninety days later, you've implemented $180K—just 6%...

The DevOps Toolchain Crisis: Why Adding Tools Makes Teams Slower

08 Nov 2025

Contributed by Lukas

Your team spent $500K on productivity tools. So why are engineers slower than last year? Jordan and Alex unpack the hidden crisis: 75% of teams lose 1...

Kubernetes Production Mastery Lesson 3: Health Checks & Probes

07 Nov 2025

Contributed by Lukas

Learn how to configure Kubernetes health checks that prevent production outages. This episode covers the three types of probes (liveness, readiness, s...

Kubernetes Production Mastery Lesson 3: Security Foundations - RBAC & Secrets

06 Nov 2025

Contributed by Lukas

RBAC misconfiguration is the number one Kubernetes security vulnerability. Learn how to implement namespace-scoped RBAC roles, secure secrets manageme...

The Cloud Repatriation Debate: When AWS Costs 10-100x More Than It Should

05 Nov 2025

Contributed by Lukas

An in-depth analysis of cloud repatriation economics, examining real companies saving millions by leaving AWS. Jordan and Alex discuss 37signals' $2M ...

Kubernetes in 2025: The Maturity Paradox

04 Nov 2025

Contributed by Lukas

Kubernetes has 92% market share, but "do we actually need this?" is the loudest conversation in platform engineering. This episode explores the maturi...

Backstage in Production: The 10% Adoption Problem

03 Nov 2025

Contributed by Lukas

Your team spent 9 months implementing Backstage. The portal looks beautiful. But internal adoption? 8%. Spotify's VP of Engineering has publicly ackno...

Platform Engineering ROI Calculator: Prove Value to Executives

30 Oct 2025

Contributed by Lukas

45% of platform teams measure nothing and get disbanded when they can't prove ROI. Jordan and Alex break down the exact ROI calculation framework that...

Why 70% of Platform Engineering Teams Fail (And the 5 Metrics That Predict Success)

28 Oct 2025

Contributed by Lukas

60-70% of platform engineering teams fail to deliver impact, with 45% disbanded within 18 months. We investigate why technically excellent teams with ...

Lesson 02: Resource Management - Kubernetes Production Mastery

28 Oct 2025

Contributed by Lukas

Your pods keep getting OOMKilled at the worst possible times. In this lesson, you'll master the difference between requests and limits, understand the...

Kubernetes Production Mastery - Lesson 01: Production Mindset

27 Oct 2025

Contributed by Lukas

Transform from a Kubernetes user into a production engineer. Learn the mental shift from development to production, identify the 5 failure patterns th...

GCP State of the Union 2025 - When Depth Beats Breadth

26 Oct 2025

Contributed by Lukas

GCP grows at 32% while AWS manages 17%—nearly 2x faster despite having half the services. We break down why Google's depth-over-breadth strategy is ...

The $75 Million Per Hour Lesson: Inside the 2025 AWS us-east-1 Outage

25 Oct 2025

Contributed by Lukas

October 19, 2025. 11:48 PM: A DNS race condition in DynamoDB took down 70 AWS services for 14 hours, affecting 1,000+ companies and costing $75M/hour....

AWS State of the Union 2025 - Navigate 200+ Services with Strategic Clarity

24 Oct 2025

Contributed by Lukas

AWS has over 200 services, but which 20 actually matter for your platform? We cut through the documentation maze to give you strategic service selecti...

Platform Tools Tier List

23 Oct 2025

Contributed by Lukas

Which platform engineering skills command $24,000+ higher salaries? We analyze 220+ tools from the Dice 2025 Tech Salary Report, break down the commod...

Same App: $41 on Railway vs $1,010 on Vercel - The Real Cost of 'Simple' PaaS

22 Oct 2025

Contributed by Lukas

Everyone promises Heroku-like simplicity with cloud-scale performance, but which PaaS actually delivers? We break down real-world costs for identical ...

130 Tools, 20% Utilization, $71K/Year Lost Per Engineer - The Platform Sprawl Tax

21 Oct 2025

Contributed by Lukas

Enterprise teams manage 130+ tools but only use 10-20% of their capabilities. Engineers juggle 16 monitoring tools on average—40 when SLAs get stric...

Cloud Providers in 2025 - Platform Abstractions, GPU Dynamics, and the New Multi-Cloud Reality

20 Oct 2025

Contributed by Lukas

AWS still dominates at 32% market share, but new deployments tell a different story. Platform abstractions (Vercel, Fly.io, Railway) mean developers n...

75% of Your Team Uses Unauthorized AI - Why Your Blocking Strategy Backfires

19 Oct 2025

Contributed by Lukas

85% of organizations are facing a crisis: employees adopting AI tools 890% faster than IT can assess them. The "just block it" approach fails 100% of ...
