Imagine this paradox: compute power for training AI models is growing 4× every year, yet the pool of high-quality data barely grows by 3%. The result? For the first time, it’s not hardware but data that has become the biggest bottleneck for large language models.

In this episode, we explore what this shift means for the future of AI. Why do standard scaling approaches, like just making models bigger or endlessly reusing limited datasets, actually backfire? And more importantly, what algorithmic tricks let us squeeze every drop of performance from scarce data?

We dive into:
- Why classic scaling laws (like Chinchilla) break down under fixed datasets.
- How cranking up regularization (30× higher than standard!) prevents overfitting.
- Why ensembles of models outperform even an “infinitely large” single model, and how just three models together can beat the theoretical maximum of one giant.
- How knowledge distillation turns unwieldy ensembles into compact, efficient models ready for deployment.
- The stunning numbers: from a 5× boost in data efficiency to an eye-popping 17.5× reduction in dataset size for domain adaptation.

Who should listen? Engineers, researchers, and curious minds who want to understand how LLM training is shifting in a world where compute is becoming “free,” but high-quality data is the new luxury.

And here’s the question for you: if compute is no longer a constraint, which forgotten algorithms and older AI ideas should we bring back to life? Could they hold the key to the next big breakthrough?

Subscribe now so you don’t miss new insights, and share your thoughts in the comments. Sometimes the discussion is just as valuable as the episode itself.

Key Takeaways:
- Compute is no longer the bottleneck; data is the real scarce resource.
- Strong regularization and ensembling massively boost data efficiency.
- Distillation makes ensemble power practical for deployment.
- Algorithmic techniques can deliver up to 17.5× data savings in real tasks.

SEO Tags:
- Niche: #LLM, #DataEfficiency, #Regularization, #Ensembling
- Popular: #ArtificialIntelligence, #MachineLearning, #DeepLearning, #AITrends, #TechPodcast
- Long-tail: #OptimizingModelTraining, #DataEfficiencyInAI, #FutureOfLLMs
- Trending: #AI2025, #GenerativeAI, #LLMResearch

Read more: https://arxiv.org/abs/2509.14786