Libraries, Datasets, and Items API

Audioscrape organizes every piece of audio under a single canonical hierarchy: Library → Dataset → Item. A library is a curated workspace; a dataset is a grouping of related content inside that library; an item is one piece of audio — a podcast episode, an uploaded talk, a hearing recording, or any future format.

Items are universal. Each one has a source_type field (podcast_episode, upload, youtube, ...) so the same endpoints work across formats. The older /api/podcasts and /api/episodes routes remain as convenience aliases for the most common case, but new integrations should prefer /api/items and /api/libraries — they scale across content types.

Machine-readable spec: For the always-current OpenAPI definition, see the interactive Swagger UI or download /api/openapi.json.

List Libraries

Returns every public library on Audioscrape, ordered by subscriber count. The is_subscribed field on each entry tells you whether the calling user is already a member.

GET /api/libraries

curl "https://www.audioscrape.com/api/libraries" \
     -H "Authorization: Bearer YOUR_API_KEY"

Response:

{
  "libraries": [
    {
      "slug": "lukas-library",
      "name": "Lukas’ Library",
      "description": "Tech, AI, and startup podcasts I follow.",
      "curator": "Lukas Schmyrczyk",
      "subscriber_count": 142,
      "dataset_count": 7,
      "episode_count": 1834,
      "is_subscribed": true
    }
  ],
  "total": 1
}

Response Fields

Field	Type	Description
slug	string	Stable URL slug, e.g. `lukas-library`
name	string	Human-readable library name
description	string | null	Optional library description
curator	string | null	Display name of the library owner
subscriber_count	integer	Number of users subscribed to the library
dataset_count	integer	Number of public datasets in the library
episode_count	integer	Sum of episodes across all public datasets
is_subscribed	boolean	Whether the calling user is a member of this library
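
As a minimal Python sketch of the call above, the request can be built with the standard library alone. The helper names (`libraries_request`, `fetch_libraries`, `unsubscribed_slugs`) are illustrative and not part of any Audioscrape SDK; the base URL and bearer header are taken from the curl example.

```python
import json
import urllib.request

BASE_URL = "https://www.audioscrape.com"  # host used in the curl examples


def libraries_request(api_key):
    """Build the authenticated GET /api/libraries request (no network I/O)."""
    return urllib.request.Request(
        f"{BASE_URL}/api/libraries",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def fetch_libraries(api_key):
    """Send the request and decode the JSON body."""
    with urllib.request.urlopen(libraries_request(api_key)) as resp:
        return json.load(resp)


def unsubscribed_slugs(payload):
    """Slugs of libraries the calling user has not subscribed to yet."""
    return [lib["slug"] for lib in payload["libraries"] if not lib["is_subscribed"]]
```

Calling `unsubscribed_slugs(fetch_libraries("YOUR_API_KEY"))` would list the public libraries the caller could still join.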

Get Library

Returns one library plus the datasets the calling user can see. Public datasets are always returned; workspace-scoped datasets are returned only to members. Private libraries return 404 to non-members.

GET /api/libraries/{slug}

curl "https://www.audioscrape.com/api/libraries/lukas-library" \
     -H "Authorization: Bearer YOUR_API_KEY"

Response:

{
  "slug": "lukas-library",
  "name": "Lukas’ Library",
  "description": "Tech, AI, and startup podcasts I follow.",
  "curator": "Lukas Schmyrczyk",
  "subscriber_count": 142,
  "dataset_count": 7,
  "episode_count": 1834,
  "is_subscribed": true,
  "datasets": [
    {
      "id": 12,
      "slug": "ai-research",
      "name": "AI Research Podcasts",
      "description": "Long-form interviews with AI researchers.",
      "visibility": "public",
      "podcast_count": 14,
      "episode_count": 412
    }
  ]
}

Path Parameters

Parameter	Type	Description
slug	string	Library slug (e.g. `lukas-library`)

Dataset Fields

Field	Type	Description
id	integer	Numeric dataset ID (stable, used by `/api/datasets/{id}` and `/api/items?dataset_id=`)
slug	string	URL slug for the dataset
name	string	Human-readable dataset name
description	string | null	Optional dataset description
visibility	string	One of `public`, `workspace`, or `private`
podcast_count	integer	Number of podcasts represented in the dataset
episode_count	integer	Total items in the dataset
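
The visibility rules above suggest treating a 404 as "not visible to this caller" rather than as a hard failure. The following Python sketch illustrates that under stated assumptions: `fetch_library` and `dataset_ids_by_visibility` are hypothetical helper names, and the injectable `opener` argument exists only so the function can be exercised without a network connection.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import quote

BASE_URL = "https://www.audioscrape.com"  # host used in the curl examples


def fetch_library(slug, api_key, opener=urllib.request.urlopen):
    """Return the library payload, or None when the API answers 404."""
    req = urllib.request.Request(
        f"{BASE_URL}/api/libraries/{quote(slug, safe='')}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    try:
        with opener(req) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:  # documented response for private libraries and non-members
            return None
        raise  # any other status is left for the caller to handle


def dataset_ids_by_visibility(library):
    """Group the dataset IDs in a Get Library response by their visibility value."""
    grouped = {}
    for ds in library.get("datasets", []):
        grouped.setdefault(ds["visibility"], []).append(ds["id"])
    return grouped
```

Because workspace-scoped datasets are returned only to members, the grouping a caller sees depends on the API key used.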

Get Dataset

Returns a dataset together with a reference to its parent library (library_slug and library_name) in a single response. This saves a round-trip when you already have a dataset ID from a search result or an item record and need the surrounding context.

GET /api/datasets/{id}

curl "https://www.audioscrape.com/api/datasets/12" \
     -H "Authorization: Bearer YOUR_API_KEY"

Response:

{
  "id": 12,
  "slug": "ai-research",
  "name": "AI Research Podcasts",
  "description": "Long-form interviews with AI researchers.",
  "visibility": "public",
  "podcast_count": 14,
  "episode_count": 412,
  "library_slug": "lukas-library",
  "library_name": "Lukas’ Library"
}

Path Parameters

Parameter	Type	Description
id	integer	Numeric dataset ID
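
A short Python sketch of how the combined response might be used: it resolves an item's `dataset_id` to a URL and turns the dataset payload into a one-line label. Both function names are hypothetical; the field names come from the response shown above.

```python
BASE_URL = "https://www.audioscrape.com"  # host used in the curl examples


def dataset_url(item):
    """URL of the dataset an item belongs to, or None when dataset_id is null."""
    dataset_id = item.get("dataset_id")
    if dataset_id is None:
        return None
    return f"{BASE_URL}/api/datasets/{int(dataset_id)}"


def breadcrumb(dataset):
    """Label such as 'Library / Dataset (N items)' built from one Get Dataset response."""
    return f"{dataset['library_name']} / {dataset['name']} ({dataset['episode_count']} items)"
```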

List Items

Lists items of every content type: podcast episodes, uploads, and future formats. Filter by source_type, dataset_id, or library (slug). Results are ordered newest-first and paginated with limit and offset. Use this endpoint in place of /api/episodes when your integration needs to work with multiple content types.

GET /api/items

curl "https://www.audioscrape.com/api/items?library=lukas-library&source_type=podcast_episode&limit=20" \
     -H "Authorization: Bearer YOUR_API_KEY"

Response:

{
  "items": [
    {
      "id": 98765,
      "title": "Scaling Laws for Language Models",
      "slug": "scaling-laws-for-language-models",
      "source_type": "podcast_episode",
      "publish_date": "2026-05-14",
      "duration_seconds": 4280,
      "has_transcript": true,
      "podcast": {
        "id": 321,
        "title": "AI Research Today",
        "slug": "ai-research-today"
      },
      "dataset_id": 12
    }
  ],
  "total": 1,
  "limit": 20,
  "offset": 0
}

Query Parameters

Parameter	Type	Description
source_type	string	Filter by source type (`podcast_episode`, `upload`, `youtube`, ...). Omit for all types.
dataset_id	integer	Restrict to items in a specific dataset
library	string	Restrict to items in a specific library (by slug)
limit	integer	Max results (default 20, max 100)
offset	integer	Result offset for pagination

Item Fields

Field	Type	Description
id	integer	Stable numeric item ID
title	string	Item title
slug	string	URL slug
source_type	string	`podcast_episode`, `upload`, `youtube`, etc.
publish_date	string | null	Publish date in `YYYY-MM-DD` format
duration_seconds	integer | null	Item duration in seconds, when known
has_transcript	boolean	Whether a transcript is available for this item
podcast	object | null	Reference to the parent podcast (`id`, `title`, `slug`) when `source_type = podcast_episode`
dataset_id	integer | null	ID of the dataset this item belongs to, if any
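
The `limit` and `offset` parameters lend themselves to a simple paging loop. The Python sketch below is one way to write it, not an official client: `items_url` and `iter_items` are hypothetical names, and `fetch_page` is a caller-supplied function that performs the HTTP request and returns the decoded JSON. The loop stops on a short page, so it does not rely on how `total` is computed.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.audioscrape.com"  # host used in the curl examples


def items_url(library=None, dataset_id=None, source_type=None, limit=20, offset=0):
    """Build a GET /api/items URL, leaving out filters that were not set."""
    if not 1 <= limit <= 100:  # documented maximum is 100
        raise ValueError("limit must be between 1 and 100")
    params = {
        "library": library,
        "dataset_id": dataset_id,
        "source_type": source_type,
        "limit": limit,
        "offset": offset,
    }
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/api/items?{query}"


def iter_items(fetch_page, limit=100):
    """Yield every item, requesting pages of `limit` until a short page arrives."""
    offset = 0
    while True:
        items = fetch_page(limit, offset)["items"]
        yield from items
        if len(items) < limit:  # short or empty page: nothing left to fetch
            break
        offset += len(items)
```

Each page fetched this way is a separate request, which matters for quota-limited plans.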

Quota Note

Every Libraries / Datasets / Items request counts against your data-call quota. See the pricing page for per-plan limits.

Get Item

Returns one item plus a reference to its parent podcast (when applicable). Heavy fields are opt-in via ?include= — pass transcript, entities, or both to load them in the same response. This avoids paying the cost of a 500 KB+ transcript when you only need metadata. Equivalent to GET /api/episodes/{id} for podcast episodes, but works uniformly across all source types.

GET /api/items/{item_id}

curl "https://www.audioscrape.com/api/items/98765?include=transcript,entities" \
     -H "Authorization: Bearer YOUR_API_KEY"

Response:

{
  "episode": {
    "id": 98765,
    "title": "Scaling Laws for Language Models",
    "slug": "scaling-laws-for-language-models",
    "description": "A deep dive into compute-optimal training...",
    "publish_date": "2026-05-14",
    "duration_seconds": 4280,
    "enclosure_url": "https://traffic.example.com/episode-98765.mp3"
  },
  "podcast": {
    "id": 321,
    "title": "AI Research Today",
    "slug": "ai-research-today",
    "image_url": "https://images.audioscrape.com/podcasts/321.jpg"
  },
  "transcript": {
    "segments": [
      { "text": "Welcome back to the show.", "start": 0.0, "end": 2.4, "speaker": "SPEAKER_00" }
    ],
    "speakers": ["SPEAKER_00", "SPEAKER_01"],
    "total_segments": 1842
  },
  "entities": [
    { "name": "OpenAI", "entity_type": "organization", "slug": "openai", "mention_count": 12 }
  ]
}

Path & Query Parameters

Parameter	In	Type	Description
item_id	path	string	Numeric item ID (works for any `source_type`)
include	query	string	Comma-separated optional fields: `transcript`, `entities`. Omit for metadata-only.
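
To make the opt-in behaviour concrete, here is a hedged Python sketch that builds the URL for a given set of includes and reads the response defensively. The names `item_url` and `describe` are illustrative, and the sketch assumes that fields not requested via `include` are either absent from the response or null.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.audioscrape.com"  # host used in the curl examples
ALLOWED_INCLUDES = ("transcript", "entities")  # values listed in the table above


def item_url(item_id, include=()):
    """Build a GET /api/items/{item_id} URL; no include means metadata only."""
    unknown = set(include) - set(ALLOWED_INCLUDES)
    if unknown:
        raise ValueError(f"unsupported include values: {sorted(unknown)}")
    url = f"{BASE_URL}/api/items/{item_id}"
    if include:
        ordered = [name for name in ALLOWED_INCLUDES if name in include]
        # safe="," keeps the comma-separated list readable instead of %2C
        url += "?" + urlencode({"include": ",".join(ordered)}, safe=",")
    return url


def describe(payload):
    """One-line summary that tolerates missing optional sections."""
    parts = [payload["episode"]["title"]]
    if payload.get("transcript"):
        parts.append(f"{payload['transcript']['total_segments']} segments")
    if payload.get("entities"):
        parts.append(f"{len(payload['entities'])} entities")
    return ", ".join(parts)
```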

Get Item Transcript

Returns the transcript on its own — segments plus speaker list — for any item, regardless of source type. Use this when you already have item metadata and only need the transcript payload.

GET /api/items/{item_id}/transcript

curl "https://www.audioscrape.com/api/items/98765/transcript" \
     -H "Authorization: Bearer YOUR_API_KEY"

Response:

{
  "segments": [
    { "text": "Welcome back to the show.", "start": 0.0,  "end": 2.4,  "speaker": "SPEAKER_00" },
    { "text": "Today we’re talking about scaling laws.", "start": 2.5, "end": 5.1, "speaker": "SPEAKER_00" }
  ],
  "speakers": ["SPEAKER_00", "SPEAKER_01"],
  "total_segments": 1842
}

Path Parameters

Parameter	Type	Description
item_id	string	Numeric item ID

Segment Fields

Field	Type	Description
text	string	Transcribed text for the segment
start	number	Segment start time in seconds
end	number	Segment end time in seconds
speaker	string | null	Diarized speaker label, when available
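
Segments are short, so a common first step is to merge consecutive segments from the same speaker into longer turns. The Python sketch below shows one way to do that with the fields documented above; `to_turns` and `timestamp` are hypothetical helpers, and segments with a null `speaker` are grouped together as a design choice, not as API behaviour.

```python
def timestamp(seconds):
    """Format a second offset as HH:MM:SS, dropping fractions of a second."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def to_turns(segments):
    """Merge consecutive segments that share a speaker label into single turns."""
    turns = []
    for seg in segments:
        speaker = seg.get("speaker")  # may be None when diarization is unavailable
        if turns and turns[-1]["speaker"] == speaker:
            turns[-1]["text"] += " " + seg["text"]
            turns[-1]["end"] = seg["end"]
        else:
            turns.append({
                "speaker": speaker,
                "start": seg["start"],
                "end": seg["end"],
                "text": seg["text"],
            })
    return turns
```

Applied to the two segments in the response above, `to_turns` would return a single turn for `SPEAKER_00` spanning 0.0 to 5.1 seconds.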
