Upload API

Push your own audio into Audioscrape — interviews, conference talks, internal recordings, customer calls — and have it transcribed, diarised, entity-extracted, and indexed alongside every podcast on the platform. One endpoint, two intake modes: send the file directly as multipart/form-data, or hand us a URL with application/json and we'll fetch it server-side.

Machine-readable spec: For the always-current OpenAPI definition, see the interactive Swagger UI or download /api/openapi.json.

The endpoint

All uploads target a single route. The Content-Type header decides the intake mode: multipart/form-data for raw bytes, application/json for URL-fetch.

endpoint
POST /api/datasets/{id}/items

{id} is the numeric ID of an upload-type dataset you own (or are a member of via its parent workspace). Datasets created for podcast feeds will reject uploads — create a dataset with content_type: "uploads" first. The authenticated caller must be a workspace member; otherwise the API returns 404 (we deliberately don't disclose whether the dataset exists).

Multipart upload

Use multipart when the audio lives on the machine making the request — browser file picker, a CLI script streaming a local file, a desktop client. The file bytes travel in the audio form field.

bash
POST /api/datasets/123/items · multipart/form-data
curl -X POST "https://www.audioscrape.com/api/datasets/123/items" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -F "audio=@/path/to/interview.mp3" \
     -F "title=Q3 customer interview — ACME Corp" \
     -F "description=30-min discovery call with the ACME product team"
Response:
{
  "success": true,
  "item_id": 84217,
  "message": "Audio file queued for transcription with priority 100"
}
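
A script that chains the upload into later calls needs the item_id from this response. The helper below is a minimal sketch that assumes the flat JSON shape shown above; parse_item_id is a hypothetical name, and a proper JSON parser would be more robust if the response shape ever changes.

```bash
# Read an upload response on stdin and print the numeric item_id.
# Assumes the flat response shape documented above (hypothetical helper).
parse_item_id() {
  sed -n 's/.*"item_id"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage (performs a real upload, so it is left commented out;
# AUDIOSCRAPE_API_KEY is an assumed environment variable):
# item_id=$(curl -sS -X POST "https://www.audioscrape.com/api/datasets/123/items" \
#     -H "Authorization: Bearer $AUDIOSCRAPE_API_KEY" \
#     -F "audio=@/path/to/interview.mp3" | parse_item_id)
```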

Form fields

Field        Type  Required  Description
audio        file  yes       The audio file. Default formats: mp3, wav, m4a, ogg, flac (configurable per dataset).
title        text  optional  Display title. Defaults to the filename without extension if omitted.
description  text  optional  Free-form description, surfaced in the item record and in search snippets.

Datasets configured with require_metadata = true will reject uploads that omit either title or description.

JSON URL upload

Use the JSON variant when the audio is already hosted somewhere on the public internet — an S3 presigned URL, a CDN object, your own static server. Audioscrape downloads it, stores a copy on its own object storage, and queues the same transcription pipeline. This mirrors what the transcribe_audio MCP tool does under the hood, so the two surfaces stay at parity.

bash
POST /api/datasets/123/items · application/json
curl -X POST "https://www.audioscrape.com/api/datasets/123/items" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{
       "audio_url": "https://files.example.com/recording.mp3",
       "title": "Founder fireside — March 2026",
       "description": "Recording from our internal all-hands"
     }'
Response:
{
  "success": true,
  "item_id": 84218,
  "message": "Audio fetched from URL and queued for transcription"
}

JSON body

Field        Type    Required  Description
audio_url    string  yes       Publicly reachable https:// URL. See SSRF rules below.
title        string  optional  Display title. Defaults to the last path segment of the URL if omitted.
description  string  optional  Free-form description stored on the item.

SSRF guards on audio_url

Because the server fetches whatever URL you provide, we apply strict filters to prevent the endpoint from being used as a reflector into internal infrastructure:

  • Scheme must be https — plain http and other schemes are rejected.
  • The hostname is resolved server-side; if any resolved IP is loopback, link-local, RFC 1918 private, broadcast, documentation, unspecified, or IPv6 unique-local / link-local, the request is rejected.
  • The fetch has a 120-second timeout and is capped at the dataset's effective file-size limit. A response whose Content-Length header exceeds the cap is rejected before the body is read.
  • HTTP status must be 2xx; non-success responses surface as an error to the caller.
  • The file extension (derived from the URL path) must match the dataset's allowed-formats list, same as multipart.
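
Some of the rules above can be checked on the client before a URL is submitted, which saves a round trip on obvious rejections. The sketch below is an illustration only, not the server's implementation: both function names are hypothetical, the address check covers IPv4 dotted quads only, and it expects a well-formed address. The server resolves the hostname itself and also rejects IPv6 unique-local and link-local addresses, so passing these checks does not guarantee acceptance.

```bash
# Static checks that need no DNS: https scheme and an allowed file extension.
# $1 = URL, $2 = space-separated allowed extensions (defaults to the documented list).
url_passes_static_checks() {
  local url allowed path ext a
  url="$1"
  allowed="${2:-mp3 wav m4a ogg flac}"
  case $url in
    https://*) ;;
    *) return 1 ;;
  esac
  path="${url%%[?#]*}"   # drop query string and fragment
  ext="${path##*.}"      # text after the last dot in the path
  for a in $allowed; do
    [ "$ext" = "$a" ] && return 0
  done
  return 1
}

# Return 0 when a dotted-quad IPv4 address falls in one of the rejected ranges.
ipv4_is_forbidden() {
  case $1 in
    0.0.0.0|255.255.255.255) return 0 ;;                 # unspecified, broadcast
    127.*) return 0 ;;                                   # loopback
    10.*|192.168.*) return 0 ;;                          # RFC 1918
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;   # RFC 1918, 172.16.0.0/12
    169.254.*) return 0 ;;                               # link-local
    192.0.2.*|198.51.100.*|203.0.113.*) return 0 ;;      # documentation ranges
    *) return 1 ;;
  esac
}
```

For example, url_passes_static_checks "https://files.example.com/recording.mp3" succeeds, while the same URL with an http:// scheme fails.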

Heads up

The URL only needs to be reachable at the moment of upload. We store a copy on our own object storage immediately, so transcription, playback, and search keep working even if the source URL later expires or 404s.

File size and format limits

The effective size cap is the smaller of the dataset's max_file_size_mb setting and the account-plan cap. Free-tier accounts are hard-capped at 200 MB per upload regardless of dataset config; paid plans default to the dataset setting (500 MB out of the box, configurable higher). Files exceeding the cap return HTTP 413.
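
The "smaller of the two" rule can be checked locally before sending a large file. This is a minimal sketch: effective_cap_mb and file_within_cap are hypothetical helpers, the plan cap is passed in by the caller (only the 200 MB free-tier figure is stated here), and counting 1 MB as 1,048,576 bytes is an assumption, since the unit the server uses is not specified.

```bash
# Print the smaller of the dataset's max_file_size_mb ($1) and the plan cap ($2), in MB.
effective_cap_mb() {
  if [ "$1" -le "$2" ]; then echo "$1"; else echo "$2"; fi
}

# Return 0 if the file at $1 is no larger than $2 MB, 1 if it is larger,
# and 2 if the file does not exist.
# Assumption: 1 MB = 1,048,576 bytes.
file_within_cap() {
  local bytes
  [ -f "$1" ] || return 2
  bytes=$(wc -c < "$1")
  [ "$((bytes))" -le "$(( $2 * 1024 * 1024 ))" ]
}

# Example: a free-tier account uploading to a dataset left at the 500 MB default.
# cap=$(effective_cap_mb 500 200)     # prints 200
# file_within_cap /path/to/interview.mp3 "$cap" || echo "would return HTTP 413"
```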

Default accepted extensions: mp3, wav, m4a, ogg, flac. Datasets can override this list in their settings.

Transcription minutes consumed by uploads count against your plan's monthly quota. See /pricing for current plan limits.

After upload

A successful upload returns immediately with an item_id. The actual processing happens asynchronously in the background:

  1. Storage. For URL uploads, the server fetches the audio (SSRF-checked) and stores it on S3-compatible object storage. Multipart uploads go straight to storage.
  2. Pipeline enqueue. An items row is created with status queued and a pipeline run is added to pipeline_queue. Datasets with auto_transcribe = false stop here.
  3. Transcription. WhisperX (large-v3 with diarisation) runs on a dedicated GPU pool. Wall-clock time is roughly 1–3× real-time depending on queue depth.
  4. Speaker identification, entity extraction, search indexing. The item then runs through the same downstream stages as podcast episodes — speaker labels, named-entity extraction (people, organisations, places, products), and indexing into the search backends.

Poll the item to see status and grab the transcript once it's ready:

bash
GET /api/items/{id}
curl "https://www.audioscrape.com/api/items/84217" \
     -H "Authorization: Bearer YOUR_API_KEY"

The item moves from queued through processing into the published state. Once published, the transcript and segments are available via the standard transcript and search endpoints, and the item appears in dataset listings under GET /api/datasets/{id}/items.
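
The polling described above can be wrapped in a small loop. The sketch below rests on assumptions: the shape of the item record is not shown here, so it guesses that the state appears as a top-level "status" string; the function names, the AUDIOSCRAPE_API_KEY variable, and the default interval are all hypothetical. Items in datasets with auto_transcribe = false would never reach published, so the loop gives up after a fixed number of attempts.

```bash
# Pull the status string out of an item record read from stdin.
# Assumption: the record exposes its state as a "status" field.
parse_item_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Poll GET /api/items/{id} until the item is published or attempts run out.
# $1 = item id, $2 = max attempts (default 60), $3 = seconds between polls (default 30).
wait_for_published() {
  local id attempts delay status i
  id="$1"
  attempts="${2:-60}"
  delay="${3:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$(curl -sS "https://www.audioscrape.com/api/items/$id" \
      -H "Authorization: Bearer $AUDIOSCRAPE_API_KEY" | parse_item_status)
    [ "$status" = "published" ] && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (makes real requests, so it is left commented out):
# wait_for_published 84217 && echo "transcript ready"
```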

Response fields

Field    Type     Description
success  boolean  Always true on a 200 response.
item_id  integer  ID of the newly-created item. Use it with GET /api/items/{id}.
message  string   Human-readable status note (e.g. "queued for transcription").

Error responses

Status  Meaning
400     Invalid request body or URL (bad JSON, non-https scheme, unparseable URL).
401     Missing or invalid API key.
403     Caller is not a member of the dataset's parent library, or transcription quota exhausted.
404     Dataset not found (also returned when caller lacks access; we don't disclose existence).
413     File exceeds the effective per-upload size cap (the smaller of the dataset setting and the plan cap).
415     Audio format not in the dataset's allowed-formats list.
422     Audio URL unreachable, returned non-2xx, or resolved to a forbidden internal address.
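
In a script it helps to branch on the HTTP status rather than on the message text. The sketch below maps each documented status to a short hint. describe_upload_status is a hypothetical helper, and the hints only paraphrase the table above; they are not additional API behaviour.

```bash
# Map an HTTP status from the upload endpoint to a short hint for the caller.
describe_upload_status() {
  case $1 in
    200) echo "ok: item queued" ;;
    400) echo "fix the request: bad JSON, non-https scheme, or unparseable URL" ;;
    401) echo "check the API key" ;;
    403) echo "not a member, or transcription quota exhausted" ;;
    404) echo "dataset not found or not accessible" ;;
    413) echo "file exceeds the size cap" ;;
    415) echo "format not allowed for this dataset" ;;
    422) echo "audio URL unreachable, non-2xx, or resolves to a forbidden address" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Usage with curl (makes a real request, so it is left commented out).
# -o saves the body to a file and -w prints only the status code.
# code=$(curl -sS -o /tmp/upload.json -w '%{http_code}' \
#     -X POST "https://www.audioscrape.com/api/datasets/123/items" \
#     -H "Authorization: Bearer $AUDIOSCRAPE_API_KEY" \
#     -F "audio=@/path/to/interview.mp3")
# describe_upload_status "$code"
```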