
Data Skeptic

Technology Science

Episodes

Showing 401-500 of 591

Mercedes Benz Machine Learning Research

14 Dec 2017

Contributed by Lukas

This episode features an interview with Rigel Smiroldo recorded at NIPS 2017 in Long Beach, California.  We discuss data privacy, machine learning use...

[MINI] Parallel Algorithms

08 Dec 2017

Contributed by Lukas

When computers became commodity hardware and storage became incredibly cheap, we entered the era of so-called "big" data. Most definitions of big data w...

Quantum Computing

01 Dec 2017

Contributed by Lukas

In this week's episode, Scott Aaronson, a professor at the University of Texas at Austin, explains what a quantum computer is, various possible appli...

Azure Databricks

28 Nov 2017

Contributed by Lukas

I sat down with Ali Ghodsi, CEO and founder of Databricks, and John Chirapurath, GM for Data Platform Marketing at Microsoft related to the recent annou...

[MINI] Exponential Time Algorithms

24 Nov 2017

Contributed by Lukas

In this episode we discuss the complexity class of EXP-Time which contains algorithms which require $O(2^{p(n)})$ time to run.  In other words, the w...

P vs NP

17 Nov 2017

Contributed by Lukas

In this week's episode, host Kyle Polich interviews author Lance Fortnow about whether P will ever be equal to NP and solve all of life's problems. Fo...

[MINI] Sudoku $\in$ NP

10 Nov 2017

Contributed by Lukas

Algorithms with similar runtimes are said to be in the same complexity class. That runtime is measured in how many steps an algorithm takes relati...

The Computational Complexity of Machine Learning

03 Nov 2017

Contributed by Lukas

In this episode, Professor Michael Kearns from the University of Pennsylvania joins host Kyle Polich to talk about the computational complexity of mac...

[MINI] Turing Machines

27 Oct 2017

Contributed by Lukas

TMs are a model of computation at the heart of algorithmic analysis.  A Turing Machine has two components.  An infinitely long piece of tape (memory...

The Complexity of Learning Neural Networks

20 Oct 2017

Contributed by Lukas

Over the past several years, we have seen many success stories in machine learning brought about by deep learning techniques. While the practical succ...

[MINI] Big Oh Analysis

13 Oct 2017

Contributed by Lukas

How long an algorithm takes to run depends on many factors including implementation details and hardware.  However, the formal analysis of algorithms...

Data science tools and other announcements from Ignite

06 Oct 2017

Contributed by Lukas

In this episode, Microsoft's Corporate Vice President for Cloud Artificial Intelligence, Joseph Sirosh, joins host Kyle Polich to share some of the Mi...

Generative AI for Content Creation

29 Sep 2017

Contributed by Lukas

Last year, the film development and production company End Cue produced a short film, called Sunspring, that was entirely written by an artificial int...

[MINI] One Shot Learning

22 Sep 2017

Contributed by Lukas

One Shot Learning is the class of machine learning procedures that focuses on learning something from a small number of examples.  This is in contrast t...

Recommender Systems Live from FARCON 2017

15 Sep 2017

Contributed by Lukas

Recommender systems play an important role in providing personalized content to online users. Yet, typical data mining techniques are not well suited ...

[MINI] Long Short Term Memory

08 Sep 2017

Contributed by Lukas

Thanks to our sponsor brilliant.org/dataskeptics. A Long Short Term Memory (LSTM) is a neural unit, often used in a Recurrent Neural Network (RNN) which...

Zillow Zestimate

01 Sep 2017

Contributed by Lukas

Zillow is a leading real estate information and home-related marketplace. We interviewed Andrew Martin, a data science Research Manager at Zillow, to ...

Cardiologist Level Arrhythmia Detection with CNNs

25 Aug 2017

Contributed by Lukas

Our guest Pranav Rajpurkar and his coauthors recently published Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks, a paper ...

[MINI] Recurrent Neural Networks

18 Aug 2017

Contributed by Lukas

RNNs are a class of deep learning models designed to capture sequential behavior.  An RNN trains a set of weights which depend not just on new input ...

Project Common Voice

11 Aug 2017

Contributed by Lukas

Thanks to our sponsor Springboard. In this week's episode, guest Andre Natal from Mozilla joins our host, Kyle Polich, to discuss a couple exciting n...

[MINI] Bayesian Belief Networks

04 Aug 2017

Contributed by Lukas

A Bayesian Belief Network is an acyclic directed graph composed of nodes that represent random variables and edges that imply a conditional dependence...

pix2code

28 Jul 2017

Contributed by Lukas

In this episode, Tony Beltramelli of UIzard Technologies joins our host, Kyle Polich, to talk about the ideas behind his latest app that can trans...

[MINI] Conditional Independence

21 Jul 2017

Contributed by Lukas

In statistics, two random variables might depend on one another (for example, interest rates and new home purchases). We call this conditional depende...

Estimating Sheep Pain with Facial Recognition

14 Jul 2017

Contributed by Lukas

Animals can't tell us when they're experiencing pain, so we have to rely on other cues to help treat their discomfort. But it is often difficult to te...

CosmosDB

07 Jul 2017

Contributed by Lukas

This episode collects interviews from my recent trip to Microsoft Build where I had the opportunity to speak with Dharma Shukla and Syam Nair about ...

[MINI] The Vanishing Gradient

30 Jun 2017

Contributed by Lukas

This episode discusses the vanishing gradient - a problem that arises when training deep neural networks in which nearly all the gradients are very cl...

Doctor AI

23 Jun 2017

Contributed by Lukas

When faced with medical issues, would you want to be seen by a human or a machine? In this episode, guest Edward Choi, co-author of the study titled Do...

[MINI] Activation Functions

16 Jun 2017

Contributed by Lukas

In a neural network, the output value of a neuron is almost always transformed in some way using a function. A trivial choice would be a linear transf...

MS Build 2017

09 Jun 2017

Contributed by Lukas

This episode recaps the Microsoft Build Conference.  Kyle recently attended and shares some thoughts on cloud, databases, cognitive services, and art...

[MINI] Max-pooling

02 Jun 2017

Contributed by Lukas

Max-pooling is a procedure in a neural network which has several benefits. It performs dimensionality reduction by taking a collection of neurons and ...

Unsupervised Depth Perception

26 May 2017

Contributed by Lukas

This episode is an interview with Tinghui Zhou.  In the recent paper "Unsupervised Learning of Depth and Ego-motion from Video", Tinghui and collabor...

[MINI] Convolutional Neural Networks

19 May 2017

Contributed by Lukas

CNNs are characterized by their use of a group of neurons typically referred to as a filter or kernel.  In image recognition, this kernel is repeated...

Multi-Agent Diverse Generative Adversarial Networks

12 May 2017

Contributed by Lukas

Despite the success of GANs in imaging, one of their major drawbacks is the problem of 'mode collapse,' where the generator learns to produce samples wi...

[MINI] Generative Adversarial Networks

05 May 2017

Contributed by Lukas

GANs are an unsupervised learning method involving two neural networks iteratively competing. The discriminator is a typical learning system. It attem...

Opinion Polls for Presidential Elections

28 Apr 2017

Contributed by Lukas

Recently, we've seen opinion polls come under some skepticism.  But is that skepticism truly justified?  The recent Brexit referendum and US 2016 Pr...

OpenHouse

21 Apr 2017

Contributed by Lukas

No reliable, complete database cataloging home sales data at a transaction level is available for the average person to access. To a data scientist in...

[MINI] GPU CPU

14 Apr 2017

Contributed by Lukas

There's more than one type of computer processor. The central processing unit (CPU) is typically what one means when they say "processor". GPUs were i...

[MINI] Backpropagation

07 Apr 2017

Contributed by Lukas

Backpropagation is a common algorithm for training a neural network.  It works by computing the gradient of each weight with respect to the overall e...

Data Science at Patreon

31 Mar 2017

Contributed by Lukas

In this week's episode of Data Skeptic, host Kyle Polich talks with guest Maura Church, Patreon's data science manager. Patreon is a fast-growing ...

[MINI] Feed Forward Neural Networks

24 Mar 2017

Contributed by Lukas

In a feed forward neural network, neurons cannot form a cycle. In this episode, we explore how such a network would be ab...

Reinventing Sponsored Search Auctions

17 Mar 2017

Contributed by Lukas

In this Data Skeptic episode, Kyle is joined by guest Ruggiero Cavallo to discuss his latest efforts to mitigate the problems presented in this new wo...

[MINI] The Perceptron

10 Mar 2017

Contributed by Lukas

Today's episode overviews the perceptron algorithm. This rather simple approach is characterized by a few particular features. It updates its weights ...

The Data Refuge Project

03 Mar 2017

Contributed by Lukas

DataRefuge is a public collaborative, grassroots effort around the United States in which scientists, researchers, computer scientists, librarians and...

[MINI] Automated Feature Engineering

24 Feb 2017

Contributed by Lukas

If a CEO wants to know the state of their business, they ask their highest ranking executives. These executives, in turn, should know the state of the...

Big Data Tools and Trends

17 Feb 2017

Contributed by Lukas

In this episode, I speak with Raghu Ramakrishnan, CTO for Data at Microsoft.  We discuss services, tools, and developments in the big data sphere as ...

[MINI] Primer on Deep Learning

10 Feb 2017

Contributed by Lukas

In this episode, we talk about a high-level description of deep learning.  Kyle presents a simple game (pictured below), which is more of a puzzle re...

Data Provenance and Reproducibility with Pachyderm

03 Feb 2017

Contributed by Lukas

Versioning isn't just for source code. Being able to track changes to data is critical for answering questions about data provenance, quality, and rep...

[MINI] Logistic Regression on Audio Data

27 Jan 2017

Contributed by Lukas

Logistic Regression is a popular classification algorithm. In this episode, we discuss how it can be used to determine if an audio clip represents one...

Studying Competition and Gender Through Chess

20 Jan 2017

Contributed by Lukas

Prior work has shown that people's response to competition is in part predicted by their gender. Understanding why and when this occurs is important i...

[MINI] Dropout

13 Jan 2017

Contributed by Lukas

Deep learning can be prone to overfit a given problem. This is especially frustrating given how much time and computational resources are often requir...

The Police Data and the Data Driven Justice Initiatives

06 Jan 2017

Contributed by Lukas

In this episode I speak with Clarence Wardell and Kelly Jin about their mutual service as part of the White House's Police Data Initiative and Data Dr...

The Library Problem

30 Dec 2016

Contributed by Lukas

We close out 2016 with a discussion of a basic interview question which might get asked when applying for a data science job. Specifically, how a libr...

2016 Holiday Special

23 Dec 2016

Contributed by Lukas

Today's episode is a reading of Isaac Asimov's Franchise.  As mentioned on the show, this is just a work of fiction to be enjoyed and not in any way...

[MINI] Entropy

16 Dec 2016

Contributed by Lukas

Classically, entropy is a measure of disorder in a system. From a statistical perspective, it is more useful to say it's a measure of the unpredictabi...

MS Connect Conference

09 Dec 2016

Contributed by Lukas

Cloud services are now ubiquitous in data science and more broadly in technology as well. This week, I speak to Mark Souza, Tobias Ternström, and Cor...

Causal Impact

02 Dec 2016

Contributed by Lukas

Today's episode is all about Causal Impact, a technique for estimating the impact of a particular event on a time series. We talk to William Martin ab...

[MINI] The Bootstrap

25 Nov 2016

Contributed by Lukas

The Bootstrap is a method of resampling a dataset to possibly refine its accuracy and produce useful metrics on the result. The bootstrap is a useful...

[MINI] Gini Coefficients

18 Nov 2016

Contributed by Lukas

The Gini Coefficient (as it relates to decision trees) is one approach to determining the optimal decision to introduce which splits your dataset as p...

Unstructured Data for Finance

11 Nov 2016

Contributed by Lukas

Financial analysis techniques for studying numeric, well structured data are very mature. While using unstructured data in finance is not necessarily ...

[MINI] AdaBoost

04 Nov 2016

Contributed by Lukas

AdaBoost is a canonical example of the class of AnyBoost algorithms that create ensembles of weak learners. We discuss how a complex problem like pred...

Stealing Models from the Cloud

28 Oct 2016

Contributed by Lukas

Platform as a service is a growing trend in data science where services like fraud analysis and face detection can be provided via APIs. Such services...

[MINI] Calculating Feature Importance

21 Oct 2016

Contributed by Lukas

For machine learning models created with the random forest algorithm, there is no obvious diagnostic to inform you which features are more important i...

NYC Bike Share Rebalancing

14 Oct 2016

Contributed by Lukas

As cities provide bike sharing services, they must also plan for how to redistribute bicycles as they inevitably build up at more popular destination ...

[MINI] Random Forest

07 Oct 2016

Contributed by Lukas

Random forest is a popular ensemble learning algorithm which leverages bagging both for sampling and feature selection. In this episode we make an ana...

Election Predictions

30 Sep 2016

Contributed by Lukas

Jo Hardin joins us this week to discuss the ASA's Election Prediction Contest. This is a competition aimed at forecasting the results of the upcoming ...

[MINI] F1 Score

23 Sep 2016

Contributed by Lukas

The F1 score is a model diagnostic that combines precision and recall to provide a singular evaluation for model comparison.  In this episode we disc...

Urban Congestion

16 Sep 2016

Contributed by Lukas

Urban congestion affects every person living in a city of any reasonable size. Lewis Lehe joins us in this episode to share his work on downtown conge...

[MINI] Heteroskedasticity

09 Sep 2016

Contributed by Lukas

Heteroskedasticity is a term used to describe a relationship between two variables which has unequal variance over the range.  For example, the varia...

Music21

02 Sep 2016

Contributed by Lukas

Our guest today is Michael Cuthbert, an associate professor of music at MIT and principal investigator of the Music21 project, which we focus our disc...

[MINI] Paxos

26 Aug 2016

Contributed by Lukas

Paxos is a protocol for arriving at a consensus in a distributed computing system which accounts for unreliability of the nodes.  We discuss how this mi...

Trusting Machine Learning Models with LIME

19 Aug 2016

Contributed by Lukas

Machine learning models are often criticized for being black boxes. If a human cannot determine why the model arrives at the decision it made, there's...

[MINI] ANOVA

12 Aug 2016

Contributed by Lukas

Analysis of variance is a method used to evaluate differences between two or more groups.  It works by breaking down the total variance of the sy...

Machine Learning on Images with Noisy Human-centric Labels

05 Aug 2016

Contributed by Lukas

When humans describe images, they have a reporting bias, in that they report only what they consider important. Thus, in addition to considering whethe...

[MINI] Survival Analysis

29 Jul 2016

Contributed by Lukas

Survival analysis techniques are useful for studying the longevity of groups of elements or individuals, taking into account time considerations and r...

Predictive Models on Random Data

22 Jul 2016

Contributed by Lukas

This week is an insightful discussion with Claudia Perlich about some situations in machine learning where models can be built, perhaps by well-intent...

[MINI] Receiver Operating Characteristic (ROC) Curve

15 Jul 2016

Contributed by Lukas

An ROC curve is a plot that shows the trade-off between true positives and false positives of a binary classifier under different thresholds. The area u...

Multiple Comparisons and Conversion Optimization

08 Jul 2016

Contributed by Lukas

I'm joined by Chris Stucchio this week to discuss how deliberate or uninformed statistical practitioners can derive spurious and arbitrary results via...

[MINI] Leakage

01 Jul 2016

Contributed by Lukas

If you'd like to make a good prediction, your best bet is to invent a time machine, visit the future, observe the value, and return to the past. For t...

Predictive Policing

24 Jun 2016

Contributed by Lukas

Kristian Lum (@KLdivergence) joins me this week to discuss her work at @hrdag on predictive policing. We also discuss Multiple Systems Estimation, a ...

[MINI] The CAP Theorem

17 Jun 2016

Contributed by Lukas

Distributed computing cannot simultaneously guarantee consistency, availability, and partition tolerance. Most system architects need to think carefully about how they s...

Detecting Terrorists with Facial Recognition?

10 Jun 2016

Contributed by Lukas

A startup is claiming that they can detect terrorists purely through facial recognition. In this solo episode, Kyle explores the plausibility of these...

[MINI] Goodhart's Law

03 Jun 2016

Contributed by Lukas

Goodhart's law states that "When a measure becomes a target, it ceases to be a good measure". In this mini-episode we discuss how this affects SEO, ca...

Data Science at eHarmony

27 May 2016

Contributed by Lukas

I'm joined this week by Jon Morra, director of data science at eHarmony to discuss a variety of ways in which machine learning and data science are be...

[MINI] Stationarity and Differencing

20 May 2016

Contributed by Lukas

Mystery shoppers and fruit cultivation help us discuss stationarity - a property of some time series that are invariant to time in several ways. Di...

Feather

13 May 2016

Contributed by Lukas

I'm joined by Wes McKinney (@wesmckinn) and Hadley Wickham (@hadleywickham) on this episode to discuss their joint project Feather. Feather is a file ...

[MINI] Bargaining

06 May 2016

Contributed by Lukas

Bargaining is the process of two (or more) parties attempting to agree on the price for a transaction.  Game theoretic approaches attempt to find two...

deepjazz

29 Apr 2016

Contributed by Lukas

Deepjazz is a project from Ji-Sung Kim, a computer science student at Princeton University. It is built using Theano, Keras, music21, and Evan Chow's ...

[MINI] Auto-correlative functions and correlograms

22 Apr 2016

Contributed by Lukas

When working with time series data, there are a number of important diagnostics one should consider to help understand more about the data. The aut...

Early Identification of Violent Criminal Gang Members

15 Apr 2016

Contributed by Lukas

This week I spoke with Elham Shaabani and Paulo Shakarian (@PauloShakASU) about their recent paper Early Identification of Violent Criminal Gang Membe...

[MINI] Fractional Factorial Design

08 Apr 2016

Contributed by Lukas

A dinner party at Data Skeptic HQ helps teach the uses of fractional factorial design for studying 2-way interactions.

Machine Learning Done Wrong

01 Apr 2016

Contributed by Lukas

Cheng-tao Chu (@chengtao_chu) joins us this week to discuss his perspective on common mistakes and pitfalls that are made when doing machine learning....

Potholes

25 Mar 2016

Contributed by Lukas

Co-host Linhda was in a biking accident after hitting a pothole. She sustained an injury that required stitches. This is the story of our quest to fi...

[MINI] The Elbow Method

18 Mar 2016

Contributed by Lukas

Certain data mining algorithms (including k-means clustering and k-nearest neighbors) require a user defined parameter k. A user of these algorithms i...

Too Good to be True

11 Mar 2016

Contributed by Lukas

Today on Data Skeptic, Lachlan Gunn joins us to discuss his recent paper Too Good to be True. This paper highlights a somewhat paradoxical / counteri...

[MINI] R-squared

04 Mar 2016

Contributed by Lukas

How well does your model explain your data? R-squared is a useful statistic for answering this question. In this episode we explore how it applies to ...

Models of Mental Simulation

26 Feb 2016

Contributed by Lukas

Jessica Hamrick joins us this week to discuss her work studying mental simulation. Her research combines machine learning approaches with beh...

[MINI] Multiple Regression

19 Feb 2016

Contributed by Lukas

This episode is a discussion of multiple regression: the use of observations that are a vector of values to predict a response variable. For this e...

Scientific Studies of People's Relationship to Music

12 Feb 2016

Contributed by Lukas

Samuel Mehr joins us this week to share his perspective on why people are musical, where music comes from, and why it works the way it does. We discus...

[MINI] k-d trees

05 Feb 2016

Contributed by Lukas

This episode reviews the concept of k-d trees: an efficient data structure for holding multidimensional objects. Kyle gives Linhda a dictionary and as...

Auditing Algorithms

29 Jan 2016

Contributed by Lukas

Algorithms are pervasive in our society and make thousands of automated decisions on our behalf every day. The possibility of digital discrimination i...
