[MINI] The Bonferroni Correction
22 Jan 2016
Contributed by Lukas
Today's episode begins by asking how many left-handed employees we should expect to be at a company ...
Detecting Pseudo-profound BS
15 Jan 2016
Contributed by Lukas
A recent paper in the journal Judgment and Decision Making titled On the reception and detection...
[MINI] Gradient Descent
08 Jan 2016
Contributed by Lukas
Today's mini episode discusses the widely known optimization algorithm gradient descent in the conte...
Let's Kill the Word Cloud
01 Jan 2016
Contributed by Lukas
This episode is a discussion of data visualization and a proposed New Year's resolution for Data Ske...
2015 Holiday Special
25 Dec 2015
Contributed by Lukas
Today's episode is a reading of Isaac Asimov's The Machine that Won the War. I can't think of a sto...
Wikipedia Revision Scoring as a Service
18 Dec 2015
Contributed by Lukas
In this interview with Aaron Halfaker of the Wikimedia Foundation, we discuss his research and caree...
[MINI] Term Frequency - Inverse Document Frequency
11 Dec 2015
Contributed by Lukas
Today's topic is term frequency inverse document frequency, which is a statistic for estimating the ...
The Hunt for Vulcan
04 Dec 2015
Contributed by Lukas
Early astronomers could see several of the planets with the naked eye. The invention of the telescop...
[MINI] The Accuracy Paradox
27 Nov 2015
Contributed by Lukas
Today's episode discusses the accuracy paradox. There are cases when one might prefer a less accurat...
Neuroscience from a Data Scientist's Perspective
20 Nov 2015
Contributed by Lukas
... or should this have been called data science from a neuroscientist's perspective? Either way, I'...
[MINI] Bias Variance Tradeoff
13 Nov 2015
Contributed by Lukas
A discussion of the expected number of cars at a stoplight frames today's discussion of the bias var...
Big Data Doesn't Exist
06 Nov 2015
Contributed by Lukas
The recent opinion piece Big Data Doesn't Exist on TechCrunch by Slater Victoroff is an interesting...
[MINI] Covariance and Correlation
30 Oct 2015
Contributed by Lukas
The degree to which two variables change together can be calculated in the form of their covariance....
Bayesian A/B Testing
23 Oct 2015
Contributed by Lukas
Today's guest is Cameron Davidson-Pilon. Cameron has a master's degree in quantitative finance from t...
[MINI] The Central Limit Theorem
16 Oct 2015
Contributed by Lukas
The central limit theorem is an important statistical result which states that typically, the mean o...
Accessible Technology
09 Oct 2015
Contributed by Lukas
Today's guest is Chris Hofstader (@gonz_blinko), an accessibility researcher and advocate, as well a...
[MINI] Multi-armed Bandit Problems
02 Oct 2015
Contributed by Lukas
The multi-armed bandit problem is named with reference to slot machines (one armed bandits). Given t...
Shakespeare, Abiogenesis, and Exoplanets
25 Sep 2015
Contributed by Lukas
Our episode this week begins with a correction. Back in episode 28 (Monkeys on Typewriters), Kyle ma...
[MINI] Sample Sizes
18 Sep 2015
Contributed by Lukas
There are several factors that are important to selecting an appropriate sample size and dealing wit...
The Model Complexity Myth
11 Sep 2015
Contributed by Lukas
There's an old adage which says you cannot fit a model which has more parameters than you have data....
[MINI] Distance Measures
04 Sep 2015
Contributed by Lukas
There are many occasions in which one might want to know the distance or similarity between two thin...
ContentMine
28 Aug 2015
Contributed by Lukas
ContentMine is a project which provides the tools and workflow to convert scientific literature into...
[MINI] Structured and Unstructured Data
21 Aug 2015
Contributed by Lukas
Today's mini-episode explains the distinction between structured and unstructured data, and debates ...
Measuring the Influence of Fashion Designers
14 Aug 2015
Contributed by Lukas
Yusan Lin shares her research on using data science to explore the fashion industry in this episode....
[MINI] PageRank
07 Aug 2015
Contributed by Lukas
PageRank is the algorithm most famous for being one of the original innovations that made Google sta...
Data Science at Work in LA County
29 Jul 2015
Contributed by Lukas
In this episode, Benjamin Uminsky enlightens us about some of the ways the Los Angeles County Regist...
[MINI] k-Nearest Neighbors
24 Jul 2015
Contributed by Lukas
This episode explores the k-nearest neighbors algorithm which is a supervised, non-parametric met...
Crypto
17 Jul 2015
Contributed by Lukas
How do people think rationally about small probability events? What is the optimal statistical proce...
[MINI] MapReduce
10 Jul 2015
Contributed by Lukas
This mini-episode is a high-level explanation of the basic idea behind MapReduce, which is a fundam...
Genetically Engineered Food and Trends in Herbicide Usage
03 Jul 2015
Contributed by Lukas
The Credible Hulk joins me in this episode to discuss a recent blog post he wrote about glyphosa...
[MINI] The Curse of Dimensionality
26 Jun 2015
Contributed by Lukas
More features are not always better! With an increasing number of features to consider, machine lea...
Video Game Analytics
19 Jun 2015
Contributed by Lukas
This episode discusses video game analytics with guest Anders Drachen. The way in which people get...
[MINI] Anscombe's Quartet
12 Jun 2015
Contributed by Lukas
This mini-episode discusses Anscombe's Quartet, a series of four datasets which are clearly very d...
Proposing Annoyance Mining
09 Jun 2015
Contributed by Lukas
A recent episode of the Skeptics' Guide to the Universe included a slight rant by Dr. Novella and the...
Preserving History at CyArk
05 Jun 2015
Contributed by Lukas
Elizabeth Lee from CyArk joins us in this episode to share stories of the work done capturing imp...
[MINI] A Critical Examination of a Study of Marriage by Political Affiliation
29 May 2015
Contributed by Lukas
Linhda and Kyle review a New York Times article titled How Your Hometown Affects Your Chances of M...
Detecting Cheating in Chess
22 May 2015
Contributed by Lukas
With the advent of algorithms capable of beating highly ranked chess players, the temptation to che...
[MINI] z-scores
15 May 2015
Contributed by Lukas
This week's episode discusses z-scores, also known as standard scores. This score describes the dista...
Using Data to Help Those in Crisis
08 May 2015
Contributed by Lukas
This week Noelle Sio Saldana discusses her volunteer work at Crisis Text Line - a 24/7 service that...
The Ghost in the MP3
01 May 2015
Contributed by Lukas
Have you ever wondered what is lost when you compress a song into an MP3? This week's guest Ryan Ma...
Data Fest 2015
28 Apr 2015
Contributed by Lukas
This episode contains coverage of the 2015 Data Fest hosted at UCLA. Data Fest is an analysis com...
[MINI] Cornbread and Overdispersion
24 Apr 2015
Contributed by Lukas
For our 50th episode we indulge a bit by cooking Linhda's previously mentioned "healthy" cornbread...
[MINI] Natural Language Processing
17 Apr 2015
Contributed by Lukas
This episode overviews some of the fundamental concepts of natural language processing including st...
Computer-based Personality Judgments
10 Apr 2015
Contributed by Lukas
Guest Youyou Wu discusses the work she and her collaborators did to measure the accuracy of computer...
[MINI] Markov Chain Monte Carlo
03 Apr 2015
Contributed by Lukas
This episode explores how going wine tasting could teach us about using Markov chain Monte Carlo (m...
[MINI] Markov Chains
20 Mar 2015
Contributed by Lukas
This episode introduces the idea of a Markov Chain. A Markov Chain has a set of states describing a...
Oceanography and Data Science
13 Mar 2015
Contributed by Lukas
Nicole Goebel joins us this week to share her experiences in oceanography studying phytoplankton an...
[MINI] Ordinary Least Squares Regression
06 Mar 2015
Contributed by Lukas
This episode explores Ordinary Least Squares or OLS - a method for finding a good fit which describe...
NYC Speed Camera Analysis with Tim Schmeier
27 Feb 2015
Contributed by Lukas
New York State approved the use of automated speed cameras within a specific range of schools. Tim ...
[MINI] k-means clustering
20 Feb 2015
Contributed by Lukas
The k-means clustering algorithm is an algorithm that computes a deterministic label for a given "k"...
Shadow Profiles on Social Networks
13 Feb 2015
Contributed by Lukas
Emre Sarigol joins me this week to discuss his paper Online Privacy as a Collective Phenomenon. T...
[MINI] The Chi-Squared Test
06 Feb 2015
Contributed by Lukas
The Chi-Squared test is a methodology for hypothesis testing. When one has categorical data, in the ...
Mapping Reddit Topics with Randy Olson
30 Jan 2015
Contributed by Lukas
My guest this week is noteworthy AI researcher Randy Olson who joins me to share his work creat...
[MINI] Partially Observable State Spaces
23 Jan 2015
Contributed by Lukas
When dealing with dynamic systems that are potentially undergoing constant change, it's helpful to de...
Easily Fooling Deep Neural Networks
16 Jan 2015
Contributed by Lukas
My guest this week is Anh Nguyen, a PhD student at the University of Wyoming working in the Evolvi...
[MINI] Data Provenance
09 Jan 2015
Contributed by Lukas
This episode introduces a high-level discussion on the topic of Data Provenance, with more MINI epi...
Doubtful News, Geology, Investigating Paranormal Groups, and Thinking Scientifically with Sharon Hill
03 Jan 2015
Contributed by Lukas
I had the chance to speak with the well-known Sharon Hill (@idoubtit) for the first episode of 2015. ...
[MINI] Belief in Santa
26 Dec 2014
Contributed by Lukas
In this quick holiday episode, we touch on how one would approach modeling the statistical distribut...
Economic Modeling and Prediction, Charitable Giving, and a Follow Up with Peter Backus
19 Dec 2014
Contributed by Lukas
Economist Peter Backus joins me in this episode to discuss a few interesting topics. You may recall...
[MINI] The Battle of the Sexes
12 Dec 2014
Contributed by Lukas
Love and Data is the continued theme in this mini-episode as we discuss the game theory example of ...
The Science of Online Data at Plenty of Fish with Thomas Levi
05 Dec 2014
Contributed by Lukas
Can algorithms help you find love? Many happy couples successfully brought together via online dati...
[MINI] The Girlfriend Equation
28 Nov 2014
Contributed by Lukas
Economist Peter Backus put forward "The Girlfriend Equation" while working on his PhD - a probabili...
The Secret and the Global Consciousness Project with Alex Boklin
21 Nov 2014
Contributed by Lukas
I'm joined this week by Alex Boklin to explore the topic of magical thinking especially in the cont...
[MINI] Monkeys on Typewriters
14 Nov 2014
Contributed by Lukas
What is randomness? How can we determine if some results are randomly generated or not? Why are rand...
Mining the Social Web with Matthew Russell
07 Nov 2014
Contributed by Lukas
This week's episode explores the possibilities of extracting novel insights from the many great soc...
[MINI] Is the Internet Secure?
31 Oct 2014
Contributed by Lukas
This episode explores the basis of why we can trust encryption. Surprisingly, a discussion of looki...
Practicing and Communicating Data Science with Jeff Stanton
24 Oct 2014
Contributed by Lukas
Jeff Stanton joins me in this episode to discuss his book An Introduction to Data Science, and som...
[MINI] The T-Test
17 Oct 2014
Contributed by Lukas
The t-test is this week's mini-episode topic. The t-test is a statistical testing procedure used to ...
Data Myths with Karl Mamer
10 Oct 2014
Contributed by Lukas
This week I'm joined by Karl Mamer to discuss the data behind three well known urban legends. Did a...
Contest Announcement
08 Oct 2014
Contributed by Lukas
The Data Skeptic Podcast is launching a contest - not one of chance, but one of skill. Listeners are...
[MINI] Selection Bias
03 Oct 2014
Contributed by Lukas
A discussion about conducting US presidential election polls helps frame a conversation about selecti...
[MINI] Confidence Intervals
26 Sep 2014
Contributed by Lukas
Commute times and BBQ invites help frame a discussion about the statistical concept of confidence in...
[MINI] Value of Information
19 Sep 2014
Contributed by Lukas
A discussion about getting ready in the morning, negotiating a used car purchase, and selecting the ...
Game Science Dice with Louis Zocchi
17 Sep 2014
Contributed by Lukas
In this bonus episode, guest Louis Zocchi discusses his background in the gaming industry, specifica...
Data Science at ZestFinance with Marick Sinay
12 Sep 2014
Contributed by Lukas
Marick Sinay from ZestFinance is our guest this week. This episode explores how data science tech...
[MINI] Decision Tree Learning
05 Sep 2014
Contributed by Lukas
Linhda and Kyle talk about Decision Tree Learning in this mini-episode. Decision Tree Learning is t...
Jackson Pollock Authentication Analysis with Kate Jones-Smith
29 Aug 2014
Contributed by Lukas
Our guest this week is Hamilton physics professor Kate Jones-Smith who joins us to discuss the evi...
[MINI] Noise!!
22 Aug 2014
Contributed by Lukas
Our topic for this week is "noise" as in signal vs. noise. This is not a signal processing discuss...
Guerrilla Skepticism on Wikipedia with Susan Gerbic
15 Aug 2014
Contributed by Lukas
Our guest this week is Susan Gerbic. Susan is a skeptical activist involved in many activities, the ...
[MINI] Ant Colony Optimization
08 Aug 2014
Contributed by Lukas
In this week's mini episode, Linhda and Kyle discuss Ant Colony Optimization - a numerical / stochas...
Data in Healthcare IT with Shahid Shah
01 Aug 2014
Contributed by Lukas
Our guest this week is Shahid Shah. Shahid is CEO at Netspective, and writes three blogs: Health Car...
[MINI] Cross Validation
25 Jul 2014
Contributed by Lukas
This mini-episode discusses the technique called Cross Validation - a process by which one randomly d...
Streetlight Outage and Crime Rate Analysis with Zach Seeskin
18 Jul 2014
Contributed by Lukas
This episode features a discussion with statistics PhD student Zach Seeskin about a project he was i...
[MINI] Experimental Design
11 Jul 2014
Contributed by Lukas
This episode loosely explores the topic of Experimental Design including hypothesis testing, the imp...
The Right (big data) Tool for the Job with Jay Shankar
07 Jul 2014
Contributed by Lukas
In this week's episode, we discuss applied solutions to big data problems with big data engineer Jay ...
[MINI] Bayesian Updating
27 Jun 2014
Contributed by Lukas
In this minisode, we discuss Bayesian Updating - the process by which one can calculate the most lik...
Personalized Medicine with Niki Athanasiadou
20 Jun 2014
Contributed by Lukas
In the second full length episode of the podcast, we discuss the current state of personalized medic...
[MINI] p-values
13 Jun 2014
Contributed by Lukas
In this mini, we discuss p-values and their use in hypothesis testing, in the context of a hypothet...
Advertising Attribution with Nathan Janos
06 Jun 2014
Contributed by Lukas
A conversation with Convertro's Nathan Janos about methodologies used to help advertisers understand...
[MINI] type i / type ii errors
30 May 2014
Contributed by Lukas
In this first mini-episode of the Data Skeptic Podcast, we define and discuss type i and type ii err...
Introduction
23 May 2014
Contributed by Lukas
The Data Skeptic Podcast features conversations with topics related to data science, statistics, mac...