Sample Space
Episodes
Time for some (extreme) distillation with Thomas van Dongen - founder of the Minish Lab
15 Jan 2025
Contributed by Lukas
Word embeddings might feel like they are a little bit out of fashion. After all, we have attention mechanisms and transformer models now, right? Well,...
Imbalanced learn: regrets and onwards - with Guillaume Lemaitre, maintainer
06 Dec 2024
Contributed by Lukas
Imbalanced learn is one of the most popular scikit-learn projects out there. It has support for resampling techniques which historically have always b...
You want to be in control of your own Copilot - with Ty Dunn, co-founder at Continue.dev
06 Nov 2024
Contributed by Lukas
There are many LLMs that you can use for programming these days. Some of them even go into your IDE like Cursor or GitHub Copilot. But what if you wan...
What it is like to maintain the scikit-learn docs - with David Arturo Amor Quiroz, scikit-learn docs maintainer
31 Oct 2024
Contributed by Lukas
Scikit-learn's documentation pages are celebrated. But not everyone is aware that the project actually has somebody on payroll to take care of it. In ...
SQLite can totally do embeddings now - with Alex Garcia, sqlite-vec maintainer
23 Oct 2024
Contributed by Lukas
Vector databases are kind of everywhere these days. There is a big pool of VCs that are pouring money into the ecosystem too. But while all of that i...
How to rethink the notebook - with Akshay Agrawal, co-creator of Marimo
16 Oct 2024
Contributed by Lukas
Jupyter has been a great environment to explore computational ideas, but that doesn't mean that it can be the only environment for interactive coding ...
You are always dealing with many tables - with Madelon Hulsebos
10 Sep 2024
Contributed by Lukas
When you are working on a data pipeline for ML ... you are never dealing with a single table. It always demands different tables for different reasons...
How Narwhals has many end users ... that never use it directly with Marco Gorelli
21 Aug 2024
Contributed by Lukas
When you pip install a package you will for sure end up using it later. But often you will also install a bunch of dependencies and it is very likely ...
Pragmatic data science checklists with Peter Bull - cofounder of DrivenData
17 Jul 2024
Contributed by Lukas
A lot of things can go (and have gone) wrong when folks tried to apply data science projects. So how might we prevent that? Maybe what we need to do is t...
Model safety, that's a pickle! with Adrin Jalali - scikit-learn maintainer
27 Jun 2024
Contributed by Lukas
Historically it's always been the case that you would use a pickle file to store a trained scikit-learn model on disk for deployment. Pickles make sen...
Moving Towards KDearestNeighbors with Leland McInnes - creator of UMAP
30 May 2024
Contributed by Lukas
Leland McInnes is known for a lot of packages. There's UMAP, but also PyNNDescent and HDBSCAN. Recently he's also been working on tools to help visual...
Talk like a DataFrame, run like SQL with Phillip Cloud - core-committer on Ibis
02 May 2024
Contributed by Lukas
Ibis is a Python library that offers a single data-frame API, from Python, which can run your queries on many different backends. These include databa...
Enhancing Jupyter with Widgets with Trevor Manz - creator of anywidget.
11 Apr 2024
Contributed by Lukas
In this (first!) episode of Sample Space we talk to Trevor Manz, the creator of anywidget. It's a (neat!) tool to help you build more interactive not...
Introducing Sample Space
03 Apr 2024
Contributed by Lukas
We're starting a new podcast!