Are you overwhelmed by the sheer number of Large Language Models (LLMs) available? Choosing the right LLM for your project isn't about picking the most popular one; it's about understanding your specific needs and rigorously evaluating your options.

In this episode of Two Voice Devs, Allen Firstenberg and guest host Brad Nemer, a seasoned product manager, dive deep into the world of LLM evaluation. They go beyond the marketing buzz and explore practical tools and strategies for making informed decisions.

Whether you're a developer, a product manager, or just curious about the practical applications of LLMs, this episode provides invaluable insights into making the right choices for your projects. Don't get caught up in the hype. Learn how to evaluate LLMs effectively!

More Info:
https://www.udacity.com/blog/2025/01/how-to-choose-the-right-ai-model-for-your-product.html

[00:00:00] Introduction: Meet Brad Nemer
[00:00:38] Brad's Journey to Product Management & AI
[00:03:12] Collaboration with Noble Ackerson and the LLM Evaluation Challenge
[00:05:23] The Role of a Product Manager
[00:07:43] The Product Manager's Relationship to Engineering
[00:13:46] Exploring Evaluation Tools: Hugging Face
[00:16:58] Exploring Evaluation Tools: Chatbot Arena (Human Evaluation)
[00:20:30] Chatbot Arena: Code Generation Evaluation
[00:24:43] Evaluating LLMs: Beyond Chatbots and Truth
[00:26:11] Exploring Evaluation Tools: Artificial Analysis (Quality, Speed, Price)
[00:28:47] Exploring Evaluation Tools: Galileo (Hallucination Report)
[00:31:16] Case Study: DeepSeek and the Importance of Contextual Evaluation
[00:34:53] The Future of LLM Testing and Quality Assurance
[00:37:49] Wrap-Up and Contact Information

#LLM #LargeLanguageModels #AIEvaluation #ProductManagement #TechTalk #TwoVoiceDevs #HuggingFace #GenAI #GenerativeAI #ChatbotArena #ArtificialAnalysis #Galileo #DeepSeek #ChatGPT #Gemini #Mistral #Claude #ModelSelection #AIdevelopment #SoftwareDevelopment #Testing #QA #RAG #MachineLearning #NLP #Coding #TechPodcast #YouTubeTech #Developers