Libraries, Datasets, and Items API

Audioscrape organizes every piece of audio under a single canonical hierarchy: Library → Dataset → Item. A library is a curated workspace; a dataset is a grouping of related content inside that library; an item is one piece of audio — a podcast episode, an uploaded talk, a hearing recording, or any future format.

Items are universal. Each one has a source_type field (podcast_episode, upload, youtube, ...) so the same endpoints work across formats. The older /api/podcasts and /api/episodes routes remain as convenience aliases for the most common case, but new integrations should prefer /api/items and /api/libraries — they scale across content types.

Machine-readable spec: For the always-current OpenAPI definition, see the interactive Swagger UI or download /api/openapi.json.

List Libraries

Returns every public library on Audioscrape, ordered by subscriber count. The is_subscribed field on each entry tells you whether the calling user is already a member.

GET /api/libraries

curl "https://www.audioscrape.com/api/libraries" \
     -H "Authorization: Bearer YOUR_API_KEY"
Response:
{
  "libraries": [
    {
      "slug": "lukas-library",
      "name": "Lukas’ Library",
      "description": "Tech, AI, and startup podcasts I follow.",
      "curator": "Lukas Schmyrczyk",
      "subscriber_count": 142,
      "dataset_count": 7,
      "episode_count": 1834,
      "is_subscribed": true
    }
  ],
  "total": 1
}

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| slug | string | Stable URL slug, e.g. lukas-library |
| name | string | Human-readable library name |
| description | string \| null | Optional library description |
| curator | string \| null | Display name of the library owner |
| subscriber_count | integer | Number of users subscribed to the library |
| dataset_count | integer | Number of public datasets in the library |
| episode_count | integer | Sum of episodes across all public datasets |
| is_subscribed | boolean | Whether the calling user is a member of this library |
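
As a rough illustration of consuming this response, the Python sketch below builds the authenticated request from the curl example and picks out the libraries the caller has not joined. The helper names (build_request, unsubscribed_slugs, fetch_libraries) and the BASE_URL constant are illustrative assumptions, not part of the API; only the URL, the Authorization header, and the field names come from this reference.

```python
import json
import urllib.request

# Assumption: base URL taken from the curl examples in this reference.
BASE_URL = "https://www.audioscrape.com"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": f"Bearer {api_key}"},
    )


def unsubscribed_slugs(payload: dict) -> list[str]:
    """Return slugs of libraries whose is_subscribed field is false."""
    return [
        lib["slug"]
        for lib in payload.get("libraries", [])
        if not lib.get("is_subscribed", False)
    ]


def fetch_libraries(api_key: str) -> dict:
    """Send GET /api/libraries and decode the JSON body (needs network access)."""
    with urllib.request.urlopen(build_request("/api/libraries", api_key)) as resp:
        return json.load(resp)
```

Keeping request construction separate from the network call makes the header and URL logic easy to check without touching the live API.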

Get Library

Returns one library plus the datasets the calling user can see. Public datasets are always returned; workspace-scoped datasets are returned only to members. Private libraries return 404 to non-members.

GET /api/libraries/{slug}

curl "https://www.audioscrape.com/api/libraries/lukas-library" \
     -H "Authorization: Bearer YOUR_API_KEY"
Response:
{
  "slug": "lukas-library",
  "name": "Lukas’ Library",
  "description": "Tech, AI, and startup podcasts I follow.",
  "curator": "Lukas Schmyrczyk",
  "subscriber_count": 142,
  "dataset_count": 7,
  "episode_count": 1834,
  "is_subscribed": true,
  "datasets": [
    {
      "id": 12,
      "slug": "ai-research",
      "name": "AI Research Podcasts",
      "description": "Long-form interviews with AI researchers.",
      "visibility": "public",
      "podcast_count": 14,
      "episode_count": 412
    }
  ]
}

Path Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| slug | string | Library slug (e.g. lukas-library) |

Dataset Fields

| Field | Type | Description |
| --- | --- | --- |
| id | integer | Numeric dataset ID (stable, used by /api/datasets/{id} and /api/items?dataset_id=) |
| slug | string | URL slug for the dataset |
| name | string | Human-readable dataset name |
| description | string \| null | Optional dataset description |
| visibility | string | One of public, workspace, or private |
| podcast_count | integer | Number of podcasts represented in the dataset |
| episode_count | integer | Total items in the dataset |
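
Because the numeric dataset ID is what /api/datasets/{id} and /api/items?dataset_id= expect, a client will often want to map slugs to IDs after fetching a library. The following is a minimal Python sketch of that step; the function names are hypothetical, and it assumes the payload has the shape shown in the Get Library response.

```python
from collections import defaultdict


def dataset_ids_by_slug(library: dict) -> dict[str, int]:
    """Map each dataset slug in a Get Library payload to its numeric ID."""
    return {ds["slug"]: ds["id"] for ds in library.get("datasets", [])}


def datasets_by_visibility(library: dict) -> dict[str, list[str]]:
    """Group dataset slugs by their visibility value (public, workspace, private)."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for ds in library.get("datasets", []):
        grouped[ds["visibility"]].append(ds["slug"])
    return dict(grouped)


# Trimmed copy of the example response above.
library = {
    "slug": "lukas-library",
    "datasets": [
        {"id": 12, "slug": "ai-research", "visibility": "public"},
    ],
}
print(dataset_ids_by_slug(library))
print(datasets_by_visibility(library))
```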

Get Dataset

Returns a dataset plus a reference to its parent library, in a single response. This saves a round trip when you have a dataset ID from a search result or item record and need the surrounding context.

GET /api/datasets/{id}

curl "https://www.audioscrape.com/api/datasets/12" \
     -H "Authorization: Bearer YOUR_API_KEY"
Response:
{
  "id": 12,
  "slug": "ai-research",
  "name": "AI Research Podcasts",
  "description": "Long-form interviews with AI researchers.",
  "visibility": "public",
  "podcast_count": 14,
  "episode_count": 412,
  "library_slug": "lukas-library",
  "library_name": "Lukas’ Library"
}

Path Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| id | integer | Numeric dataset ID |
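
One way to use the parent-library reference is to derive a display label and the follow-up library path directly from the dataset payload, without a second lookup. This Python sketch assumes the library_slug and library_name fields shown in the response above; the helper names are illustrative only.

```python
from urllib.parse import quote


def parent_library_path(dataset: dict) -> str:
    """Return the Get Library path for a dataset's parent library."""
    # quote() guards against slugs containing characters unsafe in a URL path.
    return "/api/libraries/" + quote(dataset["library_slug"], safe="")


def breadcrumb(dataset: dict) -> str:
    """Build a 'Library / Dataset' label from a Get Dataset payload."""
    return f"{dataset['library_name']} / {dataset['name']}"


# Trimmed copy of the example response above.
dataset = {
    "id": 12,
    "name": "AI Research Podcasts",
    "library_slug": "lukas-library",
    "library_name": "Lukas’ Library",
}
print(parent_library_path(dataset))
print(breadcrumb(dataset))
```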

List Items

Lists items across all content types: podcast episodes, uploads, and future formats. Filter by source_type, dataset_id, or library (slug). Results are returned newest first. Use this in place of /api/episodes when your integration needs to work with multiple content types.

GET /api/items

curl "https://www.audioscrape.com/api/items?library=lukas-library&source_type=podcast_episode&limit=20" \
     -H "Authorization: Bearer YOUR_API_KEY"
Response:
{
  "items": [
    {
      "id": 98765,
      "title": "Scaling Laws for Language Models",
      "slug": "scaling-laws-for-language-models",
      "source_type": "podcast_episode",
      "publish_date": "2026-05-14",
      "duration_seconds": 4280,
      "has_transcript": true,
      "podcast": {
        "id": 321,
        "title": "AI Research Today",
        "slug": "ai-research-today"
      },
      "dataset_id": 12
    }
  ],
  "total": 1,
  "limit": 20,
  "offset": 0
}

Query Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| source_type | string | Filter by source type (podcast_episode, upload, youtube, ...). Omit for all types. |
| dataset_id | integer | Restrict to items in a specific dataset |
| library | string | Restrict to items in a specific library (by slug) |
| limit | integer | Max results (default 20, max 100) |
| offset | integer | Result offset for pagination |

Item Fields

| Field | Type | Description |
| --- | --- | --- |
| id | integer | Stable numeric item ID |
| title | string | Item title |
| slug | string | URL slug |
| source_type | string | podcast_episode, upload, youtube, etc. |
| publish_date | string \| null | Publish date in YYYY-MM-DD format |
| duration_seconds | integer \| null | Item duration in seconds, when known |
| has_transcript | boolean | Whether a transcript is available for this item |
| podcast | object \| null | Reference to the parent podcast (id, title, slug) when source_type = podcast_episode |
| dataset_id | integer \| null | ID of the dataset this item belongs to, if any |
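
The limit and offset parameters suggest a conventional offset-based pagination loop. The Python sketch below shows one way to build the query string and compute the next offset. It assumes that total counts all matching items rather than the page size, which this reference does not state explicitly; items_url, next_offset, and BASE_URL are hypothetical names.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumption: base URL taken from the curl examples in this reference.
BASE_URL = "https://www.audioscrape.com"
MAX_LIMIT = 100  # documented maximum for the limit parameter


def items_url(*, source_type=None, dataset_id=None, library=None,
              limit: int = 20, offset: int = 0) -> str:
    """Build a GET /api/items URL, omitting filters that are not set."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    params = {
        "source_type": source_type,
        "dataset_id": dataset_id,
        "library": library,
        "limit": limit,
        "offset": offset,
    }
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/api/items?{query}"


def next_offset(page: dict) -> Optional[int]:
    """Return the offset of the next page, or None when no more items remain."""
    seen = page["offset"] + len(page["items"])
    # Assumption: "total" is the number of matching items across all pages.
    if not page["items"] or seen >= page["total"]:
        return None
    return seen
```

A caller would request items_url(...), pass the decoded page to next_offset, and repeat with the returned offset until it yields None. Each request counts against the data-call quota, so a larger limit means fewer calls.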

Quota Note

Every Libraries / Datasets / Items request counts against your data-call quota. See the pricing page for per-plan limits.

Get Item

Returns one item plus a reference to its parent podcast (when applicable). Heavy fields are opt-in through the include query parameter: pass transcript, entities, or both to load them in the same response. This avoids paying the cost of a 500 KB+ transcript when you only need metadata. For podcast episodes this is equivalent to GET /api/episodes/{id}, but it works uniformly across all source types.

GET /api/items/{item_id}

curl "https://www.audioscrape.com/api/items/98765?include=transcript,entities" \
     -H "Authorization: Bearer YOUR_API_KEY"
Response:
{
  "episode": {
    "id": 98765,
    "title": "Scaling Laws for Language Models",
    "slug": "scaling-laws-for-language-models",
    "description": "A deep dive into compute-optimal training...",
    "publish_date": "2026-05-14",
    "duration_seconds": 4280,
    "enclosure_url": "https://traffic.example.com/episode-98765.mp3"
  },
  "podcast": {
    "id": 321,
    "title": "AI Research Today",
    "slug": "ai-research-today",
    "image_url": "https://images.audioscrape.com/podcasts/321.jpg"
  },
  "transcript": {
    "segments": [
      { "text": "Welcome back to the show.", "start": 0.0, "end": 2.4, "speaker": "SPEAKER_00" }
    ],
    "speakers": ["SPEAKER_00", "SPEAKER_01"],
    "total_segments": 1842
  },
  "entities": [
    { "name": "OpenAI", "entity_type": "organization", "slug": "openai", "mention_count": 12 }
  ]
}

Path & Query Parameters

| Parameter | In | Type | Description |
| --- | --- | --- | --- |
| item_id | path | string | Numeric item ID (works for any source_type) |
| include | query | string | Comma-separated optional fields: transcript, entities. Omit for metadata-only. |
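
To show how the include parameter might be assembled client-side, here is a small Python sketch that validates the requested fields against the two documented values and joins them with commas, as in the curl example. The item_url helper and the ALLOWED_INCLUDES constant are assumptions made for illustration.

```python
# Assumption: base URL taken from the curl examples in this reference.
BASE_URL = "https://www.audioscrape.com"
ALLOWED_INCLUDES = ("transcript", "entities")  # values documented above


def item_url(item_id, include=()) -> str:
    """Build a GET /api/items/{item_id} URL.

    `include` is a tuple or list of optional field names; leave it empty
    for a metadata-only request.
    """
    unknown = [name for name in include if name not in ALLOWED_INCLUDES]
    if unknown:
        raise ValueError(f"unsupported include values: {unknown}")
    url = f"{BASE_URL}/api/items/{int(item_id)}"
    if include:
        # dict.fromkeys drops duplicates while preserving order.
        url += "?include=" + ",".join(dict.fromkeys(include))
    return url


print(item_url(98765))
print(item_url("98765", ("transcript", "entities")))
```

Defaulting to an empty include keeps the common metadata-only call cheap, with the transcript loaded only when explicitly requested.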

Get Item Transcript

Returns the transcript on its own — segments plus speaker list — for any item, regardless of source type. Use this when you already have item metadata and only need the transcript payload.

GET /api/items/{item_id}/transcript

curl "https://www.audioscrape.com/api/items/98765/transcript" \
     -H "Authorization: Bearer YOUR_API_KEY"
Response:
{
  "segments": [
    { "text": "Welcome back to the show.", "start": 0.0,  "end": 2.4,  "speaker": "SPEAKER_00" },
    { "text": "Today we’re talking about scaling laws.", "start": 2.5, "end": 5.1, "speaker": "SPEAKER_00" }
  ],
  "speakers": ["SPEAKER_00", "SPEAKER_01"],
  "total_segments": 1842
}

Path Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| item_id | string | Numeric item ID |

Segment Fields

| Field | Type | Description |
| --- | --- | --- |
| text | string | Transcribed text for the segment |
| start | number | Segment start time in seconds |
| end | number | Segment end time in seconds |
| speaker | string \| null | Diarized speaker label, when available |
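
Segment-level output is often easier to read once consecutive segments from the same speaker are merged into turns. The Python sketch below shows one possible way to do that with the segment fields listed above. The function names and the UNKNOWN placeholder for a null speaker are assumptions, not API behavior.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as H:MM:SS, truncating fractions."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def merge_turns(segments: list[dict]) -> list[dict]:
    """Merge consecutive segments that share the same speaker label."""
    turns: list[dict] = []
    for seg in segments:
        speaker = seg.get("speaker")
        if turns and turns[-1]["speaker"] == speaker:
            # Same speaker continues: extend the current turn.
            turns[-1]["text"] += " " + seg["text"]
            turns[-1]["end"] = seg["end"]
        else:
            turns.append({
                "speaker": speaker,
                "start": seg["start"],
                "end": seg["end"],
                "text": seg["text"],
            })
    return turns


def render_transcript(segments: list[dict]) -> str:
    """Render merged turns as '[H:MM:SS] SPEAKER: text' lines."""
    lines = []
    for turn in merge_turns(segments):
        label = turn["speaker"] or "UNKNOWN"  # speaker may be null
        lines.append(f"[{format_timestamp(turn['start'])}] {label}: {turn['text']}")
    return "\n".join(lines)
```

Passing the segments array from the transcript response to render_transcript yields one line per speaker turn. Note that consecutive segments with a null speaker are also merged into a single turn in this sketch.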