Upload API

Push your own audio into Audioscrape — interviews, conference talks, internal recordings, customer calls — and have it transcribed, diarised, entity-extracted, and indexed alongside every podcast on the platform. One endpoint, two intake modes: send the file directly as multipart/form-data, or hand us a URL with application/json and we'll fetch it server-side.

Machine-readable spec: For the always-current OpenAPI definition, see the interactive Swagger UI or download /api/openapi.json.

The endpoint

All uploads target a single route. The Content-Type header decides the intake mode: multipart/form-data for raw bytes, application/json for URL-fetch.

endpoint

POST /api/datasets/{id}/items

{id} is the numeric ID of an upload-type dataset you own (or are a member of via its parent workspace). Datasets created for podcast feeds will reject uploads — create a dataset with content_type: "uploads" first. The authenticated caller must be a workspace member; otherwise the API returns 404 (we deliberately don't disclose whether the dataset exists).

Multipart upload

Use multipart when the audio lives on the machine making the request — browser file picker, a CLI script streaming a local file, a desktop client. The file bytes travel in the audio form field.

bash

POST /api/datasets/123/items · multipart/form-data

curl -X POST "https://www.audioscrape.com/api/datasets/123/items" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -F "audio=@/path/to/interview.mp3" \
     -F "title=Q3 customer interview — ACME Corp" \
     -F "description=30-min discovery call with the ACME product team"

Response:

{
  "success": true,
  "item_id": 84217,
  "message": "Audio file queued for transcription with priority 100"
}

Form fields

Field	Type	Required	Description
`audio`	file	yes	The audio file. Default formats: `mp3`, `wav`, `m4a`, `ogg`, `flac` (configurable per dataset).
`title`	text	optional	Display title. Defaults to the filename without extension if omitted.
`description`	text	optional	Free-form description, surfaced in the item record and in search snippets.

Datasets configured with require_metadata = true will reject uploads that omit either title or description.
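A client can catch the metadata rule before sending any bytes. The following is a minimal sketch of such a pre-flight check, not part of the Audioscrape API: the function names `check_upload_fields` and `default_title_for_file` are hypothetical, the caller is assumed to already know the dataset's `require_metadata` setting, and the title helper only approximates the documented default.

```bash
# Hypothetical client-side helpers; the server remains the authority.

# Approximates the documented default title: the filename without its extension.
default_title_for_file() {
  local name="${1##*/}"        # strip leading directories
  printf '%s\n' "${name%.*}"   # strip the last extension, if any
}

# Usage: check_upload_fields FILE TITLE DESCRIPTION REQUIRE_METADATA(true|false)
check_upload_fields() {
  local file="$1" title="$2" description="$3" require_metadata="$4"

  # The audio field is always required.
  if [ -z "$file" ]; then
    echo "missing audio file" >&2
    return 1
  fi

  # Datasets with require_metadata = true reject uploads that omit
  # either title or description.
  if [ "$require_metadata" = "true" ]; then
    if [ -z "$title" ] || [ -z "$description" ]; then
      echo "dataset requires both title and description" >&2
      return 1
    fi
  fi
  return 0
}
```

Running `check_upload_fields "$file" "$title" "$description" true` before the `curl` call avoids uploading a large file only to have it rejected for missing metadata.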

JSON URL upload

Use the JSON variant when the audio is already hosted somewhere on the public internet — an S3 presigned URL, a CDN object, your own static server. Audioscrape downloads it, stores a copy on its own object storage, and queues the same transcription pipeline. This mirrors what the transcribe_audio MCP tool does under the hood, so the two surfaces stay at parity.

bash

POST /api/datasets/123/items · application/json

curl -X POST "https://www.audioscrape.com/api/datasets/123/items" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{
       "audio_url": "https://files.example.com/recording.mp3",
       "title": "Founder fireside — March 2026",
       "description": "Recording from our internal all-hands"
     }'

Response:

{
  "success": true,
  "item_id": 84218,
  "message": "Audio fetched from URL and queued for transcription"
}

JSON body

Field	Type	Required	Description
`audio_url`	string	yes	Publicly reachable `https://` URL. See SSRF rules below.
`title`	string	optional	Display title. Defaults to the last path segment of the URL if omitted.
`description`	string	optional	Free-form description stored on the item.

SSRF guards on `audio_url`

Because the server fetches whatever URL you provide, we apply strict filters to prevent the endpoint from being used as a reflector into internal infrastructure:

Scheme must be https — plain http and other schemes are rejected.
The hostname is resolved server-side; if any resolved IP is loopback, link-local, RFC 1918 private, broadcast, documentation, unspecified, or IPv6 unique-local / link-local, the request is rejected.
The fetch has a 120-second timeout and is capped at the dataset's effective file-size limit. A response whose Content-Length exceeds the cap is rejected before the body is read.
HTTP status must be 2xx; non-success responses surface as an error to the caller.
The file extension (derived from the URL path) must match the dataset's allowed-formats list, same as multipart.
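Some of these rules can be checked on the client before calling the API, which saves a round trip for obviously invalid URLs. The sketch below is an approximation under stated assumptions, not the server's implementation: all function names are hypothetical, it only inspects IPv4 literals in the URL (the real check resolves DNS server-side and also covers IPv6), and the allow-list defaults to the documented default formats.

```bash
# Hypothetical client-side pre-flight for audio_url.

# Succeeds when the URL uses the https scheme.
is_https_url() {
  case "$1" in
    https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints the lowercase extension taken from the URL path, ignoring
# query string and fragment; prints nothing if the path has none.
url_extension() {
  local path="${1%%[?#]*}"       # drop ?query and #fragment
  path="${path#*://}"            # drop the scheme
  case "$path" in
    */*) path="${path#*/}" ;;    # drop the host
    *)   path="" ;;              # no path at all
  esac
  local name="${path##*/}"       # last path segment
  case "$name" in
    *.*) printf '%s\n' "${name##*.}" | tr 'A-Z' 'a-z' ;;
  esac
}

# Succeeds when EXT is in the space-separated allow-list.
is_allowed_extension() {
  local ext="$1" allowed="${2:-mp3 wav m4a ogg flac}"
  [ -n "$ext" ] || return 1
  case " $allowed " in
    *" $ext "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds when a dotted-quad IPv4 literal is in a range the rules reject.
# Pass IPv4 literals only; hostnames are not resolved here.
is_forbidden_ipv4() {
  case "$1" in
    127.*|10.*|192.168.*|169.254.*) return 0 ;;          # loopback, RFC 1918, link-local
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;   # RFC 1918 172.16.0.0/12
    192.0.2.*|198.51.100.*|203.0.113.*) return 0 ;;      # documentation ranges
    0.0.0.0|255.255.255.255) return 0 ;;                 # unspecified, broadcast
    *) return 1 ;;
  esac
}

# Usage: preflight_audio_url URL [ALLOWED_EXTENSIONS]
preflight_audio_url() {
  local url="$1" allowed="${2:-mp3 wav m4a ogg flac}" host
  is_https_url "$url" || { echo "scheme must be https" >&2; return 1; }
  host="${url#https://}"; host="${host%%[/?#]*}"; host="${host%%:*}"
  if is_forbidden_ipv4 "$host"; then
    echo "host is a forbidden internal address" >&2; return 1
  fi
  is_allowed_extension "$(url_extension "$url")" "$allowed" ||
    { echo "extension not in allowed formats" >&2; return 1; }
}
```

A URL that passes this check can still be rejected by the server, for example when its hostname resolves to a private address.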

Heads up

The URL only needs to be reachable at the moment of upload. We store a copy on our own object storage immediately, so transcription, playback, and search keep working even if the source URL later expires or 404s.

File size and format limits

The effective size cap is the smaller of the dataset's max_file_size_mb setting and the account-plan cap. Free-tier accounts are hard-capped at 200 MB per upload regardless of dataset config; paid plans default to the dataset setting (500 MB out of the box, configurable higher). Files exceeding the cap return HTTP 413.
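The "smaller of the two" rule is easy to reproduce locally so that oversized files are caught before upload. This is a rough sketch with hypothetical function names; it assumes 1 MB means 1,048,576 bytes, which the text above does not specify, so treat a result close to the limit with caution.

```bash
# Hypothetical helpers for a local size check.

# Prints the smaller of the dataset cap and the plan cap, both in MB.
effective_cap_mb() {
  local dataset_mb="$1" plan_mb="$2"
  if [ "$dataset_mb" -le "$plan_mb" ]; then
    echo "$dataset_mb"
  else
    echo "$plan_mb"
  fi
}

# Succeeds when FILE exists and is no larger than CAP_MB.
# Assumption: 1 MB = 1024 * 1024 bytes.
file_fits_cap() {
  local file="$1" cap_mb="$2" size_bytes
  [ -f "$file" ] || return 1
  size_bytes=$(( $(wc -c < "$file") ))   # arithmetic strips any padding from wc
  [ "$size_bytes" -le $((cap_mb * 1024 * 1024)) ]
}
```

For a free-tier account uploading to a dataset with `max_file_size_mb` at 500, `effective_cap_mb 500 200` prints `200`, and `file_fits_cap interview.mp3 200` tells you whether the upload would avoid a 413.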

Default accepted extensions: mp3, wav, m4a, ogg, flac. Datasets can override this list in their settings.

Transcription minutes consumed by uploads count against your plan's monthly quota. See /pricing for current plan limits.

After upload

A successful upload returns an item_id as soon as the audio is stored and the item is queued. The stages below trace the full path; everything after the enqueue runs asynchronously in the background:

Storage. For URL uploads, the server fetches the audio (SSRF-checked) and stores it on S3-compatible object storage. Multipart uploads go straight to storage.
Pipeline enqueue. An items row is created with status queued and a pipeline run is added to pipeline_queue. Datasets with auto_transcribe = false stop here.
Transcription. WhisperX (large-v3 with diarisation) runs on a dedicated GPU pool. Wall-clock time is roughly 1–3× real-time depending on queue depth.
Speaker identification, entity extraction, search indexing. The item then runs through the same downstream stages as podcast episodes — speaker labels, named-entity extraction (people, organisations, places, products), and indexing into the search backends.

Poll the item to see status and grab the transcript once it's ready:

bash

GET /api/items/{id}

curl "https://www.audioscrape.com/api/items/84217" \
     -H "Authorization: Bearer YOUR_API_KEY"

The item moves from queued through processing into the published state. Once published, the transcript and segments are available via the standard transcript and search endpoints, and the item appears in dataset listings under GET /api/datasets/{id}/items.
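A polling loop around that request might look like the sketch below. Several things here are assumptions rather than documented behaviour: the item JSON is assumed to expose its state in a top-level `status` field, `jq` is assumed to be installed, `AUDIOSCRAPE_API_KEY` is a hypothetical environment variable, and the function names are invented for illustration.

```bash
# Hypothetical polling helper.

# Prints the current status of an item.
# Assumption: the response JSON has a top-level "status" field.
fetch_item_status() {
  local body
  body=$(curl -fsS "https://www.audioscrape.com/api/items/$1" \
           -H "Authorization: Bearer ${AUDIOSCRAPE_API_KEY}") || return 1
  printf '%s' "$body" | jq -r '.status'
}

# Polls until the item is published or the attempt budget is used up.
# Usage: wait_for_published ITEM_ID [MAX_ATTEMPTS] [INTERVAL_SECONDS]
# Returns 0 = published, 1 = timed out, 2 = request failed, 3 = unexpected status.
wait_for_published() {
  local item_id="$1" max_attempts="${2:-60}" interval="${3:-30}"
  local attempt=1 status
  while [ "$attempt" -le "$max_attempts" ]; do
    status=$(fetch_item_status "$item_id") || return 2
    case "$status" in
      published) return 0 ;;
      queued|processing) ;;   # still in progress, keep waiting
      *) echo "unexpected status: $status" >&2; return 3 ;;
    esac
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  return 1
}
```

Since transcription takes roughly 1–3× real-time, a 30-second interval is a reasonable default for most recordings; shorter intervals add requests without returning results sooner.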

Response fields

Field	Type	Description
`success`	boolean	Always `true` on a 200 response.
`item_id`	integer	ID of the newly-created item. Use it with `GET /api/items/{id}`.
`message`	string	Human-readable status note (e.g. `"queued for transcription"`).

Error responses

Status	Meaning
`400`	Invalid request body or URL (bad JSON, non-https scheme, unparseable URL).
`401`	Missing or invalid API key.
`403`	Caller is not a member of the dataset's parent library, or transcription quota exhausted.
`404`	Dataset not found (also returned when caller lacks access — we don't disclose existence).
`413`	File exceeds the per-upload size cap on this plan.
`415`	Audio format not in the dataset's allowed-formats list.
`422`	Audio URL unreachable, returned non-2xx, or resolved to a forbidden internal address.
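In a script, the status code can be captured with curl's `-w '%{http_code}'` option and mapped to a next step. The sketch below follows the table above; the function names, the hint wording, and the `AUDIOSCRAPE_API_KEY` environment variable are hypothetical.

```bash
# Hypothetical error-handling helpers for the upload endpoint.

# Prints a short hint for an HTTP status. Returns 0 on success, 1 otherwise.
explain_upload_status() {
  case "$1" in
    200) echo "ok"; return 0 ;;
    400) echo "fix the request body or URL" ;;
    401) echo "check the API key" ;;
    403) echo "no library membership, or transcription quota exhausted" ;;
    404) echo "dataset not found or not accessible" ;;
    413) echo "file exceeds the per-upload size cap" ;;
    415) echo "audio format not allowed for this dataset" ;;
    422) echo "audio URL unreachable or forbidden" ;;
    *)   echo "unexpected status $1" ;;
  esac
  return 1
}

# Usage: upload_url DATASET_ID JSON_BODY [RESPONSE_FILE]
# Sends a JSON URL upload, saves the response body, and explains the status.
upload_url() {
  local dataset_id="$1" json_body="$2" out="${3:-/dev/null}" status
  status=$(curl -sS -o "$out" -w '%{http_code}' \
    -X POST "https://www.audioscrape.com/api/datasets/${dataset_id}/items" \
    -H "Authorization: Bearer ${AUDIOSCRAPE_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$json_body") || return 1
  explain_upload_status "$status"
}
```

Only 401 (after fixing the key) and 403 caused by quota (after the quota resets) are worth retrying unchanged; the other codes call for a different request.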
