Search API
The Search API provides full-text and semantic search across millions of podcast transcripts. Search by keywords, filter by podcasts or speakers, and get relevant segments with timestamps.
Machine-readable spec: For the always-current OpenAPI definition, see the interactive Swagger UI or download /api/openapi.json.
Endpoint
POST https://www.audioscrape.com/api/search
Request Parameters
query
required
string
The search query. Supports boolean operators (AND, OR, NOT), exact phrases ("quoted"), exclusions (-term), and wildcards (term*).
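The operators above can be combined into one query string. The following is a minimal sketch, not part of the API: `build_query` and its parameter names are hypothetical, and the way the service resolves precedence between mixed operators is not documented here, so the helper joins positive terms with a single operator only.

```python
def build_query(terms=(), phrases=(), exclude=(), prefixes=(), operator="AND"):
    """Assemble a query string from the documented operators (hypothetical helper)."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    # Positive clauses: plain terms, "quoted" exact phrases, and term* wildcards.
    positive = list(terms)
    positive += [f'"{phrase}"' for phrase in phrases]
    positive += [f"{prefix}*" for prefix in prefixes]
    query = f" {operator} ".join(positive)
    # Exclusions use the -term form and are appended after the positive clauses.
    excluded = " ".join(f"-{term}" for term in exclude)
    return " ".join(part for part in (query, excluded) if part)


print(build_query(
    terms=["ethics"],
    phrases=["artificial intelligence"],
    prefixes=["regulat"],
    exclude=["crypto"],
))
# ethics AND "artificial intelligence" AND regulat* -crypto
```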
search_type
string
Type of search to perform:
- "text": Keyword matching with fuzzy search
- "semantic": AI-powered conceptual search
- "hybrid": Combined text + semantic (default)
filters
object
Filter search results:
- podcasts: Array of podcast IDs
- speakers: Array of speaker names
- date_range.start: ISO 8601 date (e.g., "2024-01-01")
- date_range.end: ISO 8601 date
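A filters object can be assembled and checked on the client before sending. This is a sketch under assumptions: `build_filters` is a hypothetical helper, the podcast ID format is borrowed from the sample response, and the check that the start date does not follow the end date is a local safeguard rather than documented server behavior.

```python
from datetime import date


def build_filters(podcasts=None, speakers=None, start=None, end=None):
    """Build the `filters` object, validating dates locally (hypothetical helper)."""
    filters = {}
    if podcasts:
        filters["podcasts"] = list(podcasts)
    if speakers:
        filters["speakers"] = list(speakers)

    date_range = {}
    # date.fromisoformat raises ValueError for strings that are not valid ISO dates.
    if start is not None:
        date_range["start"] = date.fromisoformat(start).isoformat()
    if end is not None:
        date_range["end"] = date.fromisoformat(end).isoformat()
    # YYYY-MM-DD strings sort chronologically, so a string comparison is enough.
    if len(date_range) == 2 and date_range["start"] > date_range["end"]:
        raise ValueError("date_range.start must not be after date_range.end")
    if date_range:
        filters["date_range"] = date_range
    return filters


print(build_filters(podcasts=["pod_123"], speakers=["Dr. Jane Smith"], start="2024-01-01"))
```

Empty or omitted arguments are left out of the object entirely, so the request body only carries the filters that were actually set.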
pagination
object
Pagination options:
- limit: Results per page (default: 20, max: 100)
- offset: Starting position (default: 0)
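Walking through a full result set means advancing `offset` by `limit` until the response reports no further page. The sketch below assumes a caller-supplied `fetch_page(limit, offset)` function (hypothetical) that returns the parsed JSON response, and relies on the `meta.pagination.has_next` field from the response format. A stub replaces the HTTP call so the example runs offline.

```python
def iter_results(fetch_page, limit=20, max_results=None):
    """Yield results page by page until has_next is false (hypothetical helper)."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    offset = 0
    yielded = 0
    while True:
        page = fetch_page(limit, offset)
        if not page["results"]:
            return  # defensive stop if a page comes back empty
        for result in page["results"]:
            yield result
            yielded += 1
            if max_results is not None and yielded >= max_results:
                return
        if not page["meta"]["pagination"]["has_next"]:
            return
        offset += limit


# Stand-in for a real HTTP call so the sketch runs offline: 45 fake hits.
FAKE_HITS = [{"id": f"seg_{i}"} for i in range(45)]


def fake_fetch(limit, offset):
    return {
        "meta": {"pagination": {"has_next": offset + limit < len(FAKE_HITS)}},
        "results": FAKE_HITS[offset:offset + limit],
    }


ids = [hit["id"] for hit in iter_results(fake_fetch, limit=20)]
print(len(ids))  # 45, collected over three pages
```

With real requests, each page counts against the rate limit, so `max_results` is worth setting when only the top hits are needed.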
sort
object
Sorting options:
- field: "relevance" or "date"
- order: "asc" or "desc"
options
object
Additional options:
- include_highlights: Highlight matching text (default: true)
- include_context: Include surrounding segments
- include_aggregations: Include faceted counts
Code Examples
cURL
curl -X POST "https://www.audioscrape.com/api/search" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "artificial intelligence ethics",
    "search_type": "hybrid",
    "filters": {
      "date_range": {
        "start": "2024-01-01"
      }
    },
    "pagination": {
      "limit": 10
    },
    "options": {
      "include_highlights": true
    }
  }'
Python
import requests

response = requests.post(
    "https://www.audioscrape.com/api/search",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "query": "artificial intelligence ethics",
        "search_type": "hybrid",
        "filters": {
            "date_range": {"start": "2024-01-01"}
        },
        "pagination": {"limit": 10},
        "options": {"include_highlights": True},
    },
)
# Raise requests.HTTPError on 4xx/5xx responses instead of reading "results" from an error body.
response.raise_for_status()
data = response.json()

# Print the episode title, a 100-character excerpt, and a link for each hit.
for result in data["results"]:
    print(f"{result['episode']['title']}")
    print(f"  {result['segment']['text'][:100]}...")
    print(f"  {result['urls']['segment']}")
JavaScript / Node.js
const response = await fetch("https://www.audioscrape.com/api/search", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    query: "artificial intelligence ethics",
    search_type: "hybrid",
    filters: {
      date_range: { start: "2024-01-01" }
    },
    pagination: { limit: 10 },
    options: { include_highlights: true }
  })
});
const data = await response.json();
data.results.forEach(result => {
  console.log(result.episode.title);
  console.log(`  ${result.segment.text.slice(0, 100)}...`);
  console.log(`  ${result.urls.segment}`);
});
Response Format
{
  "meta": {
    "query": "artificial intelligence ethics",
    "search_type": "hybrid",
    "total_results": 1247,
    "execution_time_ms": 45,
    "pagination": {
      "limit": 10,
      "offset": 0,
      "total_pages": 125,
      "has_next": true,
      "has_previous": false
    }
  },
  "results": [
    {
      "id": "seg_abc123",
      "relevance_score": 0.95,
      "segment": {
        "text": "The ethics of artificial intelligence is one of the most...",
        "highlighted_text": "The ethics of <em>artificial intelligence</em> is one...",
        "start_timestamp": "00:15:32",
        "end_timestamp": "00:16:45",
        "duration_seconds": 73,
        "speaker": {
          "name": "Dr. Jane Smith",
          "slug": "dr-jane-smith"
        },
        "entities": [
          {"name": "Artificial Intelligence", "entity_type": "technology", "slug": "artificial-intelligence"}
        ]
      },
      "episode": {
        "id": "ep_xyz789",
        "title": "The Future of AI Ethics",
        "slug": "the-future-of-ai-ethics",
        "description": "A deep dive into...",
        "publish_date": "2024-06-15",
        "duration_seconds": 3600
      },
      "podcast": {
        "id": "pod_123",
        "title": "Tech Talk Daily",
        "slug": "tech-talk-daily",
        "image_url": "https://...",
        "publisher": "Tech Media Inc",
        "category": "Technology"
      },
      "urls": {
        "segment": "https://www.audioscrape.com/episode/tech-talk-daily/the-future-of-ai-ethics#t=932",
        "episode": "https://www.audioscrape.com/episode/tech-talk-daily/the-future-of-ai-ethics",
        "audio": "https://www.audioscrape.com/media/ep_xyz789/audio.mp3"
      }
    }
  ],
  "aggregations": {
    "podcasts": {
      "buckets": [
        {"key": "Tech Talk Daily", "doc_count": 45},
        {"key": "AI Weekly", "doc_count": 32}
      ]
    },
    "speakers": {
      "buckets": [
        {"key": "Dr. Jane Smith", "doc_count": 28}
      ]
    },
    "time_distribution": {
      "buckets": [
        {"key": "2024-06", "doc_count": 156}
      ]
    }
  }
}
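Each result nests its data under `segment`, `episode`, `podcast`, and `urls`, with timestamps given as HH:MM:SS strings. The sketch below flattens one result into a summary and converts timestamps to seconds. `timestamp_to_seconds` and `summarize` are hypothetical helpers, and the input is a trimmed copy of the sample response above.

```python
def timestamp_to_seconds(timestamp):
    """Convert an HH:MM:SS string such as "00:15:32" to whole seconds."""
    hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def summarize(result):
    """Flatten one search result into the fields a listing typically needs."""
    segment = result["segment"]
    start = timestamp_to_seconds(segment["start_timestamp"])
    end = timestamp_to_seconds(segment["end_timestamp"])
    return {
        "podcast": result["podcast"]["title"],
        "episode": result["episode"]["title"],
        "speaker": segment["speaker"]["name"],
        "start_seconds": start,
        "length_seconds": end - start,
        "link": result["urls"]["segment"],
    }


# Trimmed copy of the sample result shown in the response format.
sample = {
    "segment": {
        "start_timestamp": "00:15:32",
        "end_timestamp": "00:16:45",
        "speaker": {"name": "Dr. Jane Smith"},
    },
    "episode": {"title": "The Future of AI Ethics"},
    "podcast": {"title": "Tech Talk Daily"},
    "urls": {
        "segment": "https://www.audioscrape.com/episode/tech-talk-daily/the-future-of-ai-ethics#t=932"
    },
}

summary = summarize(sample)
print(summary["start_seconds"], summary["length_seconds"])  # 932 73
```

In the sample, the computed start offset agrees with the `#t=932` fragment of the segment URL, and the computed length agrees with `duration_seconds`.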
Error Responses
Invalid request parameters
{"error": "Invalid query", "message": "Query cannot be empty"}
Missing or invalid API key
{"error": "Unauthorized", "message": "Invalid API key"}
Rate limit exceeded
{"error": "Rate limit exceeded", "retry_after": 60}
Server-side error
{"error": "Internal error", "message": "Please try again later"}
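Every error body above carries an `error` field, and the rate limit error adds `retry_after`. A client can turn these bodies into exceptions and wait before retrying, as in this sketch. Assumptions: `SearchAPIError`, `raise_for_error`, and `call_with_retry` are hypothetical names, `retry_after` is taken to be in seconds, and `send` is any caller-supplied function that returns the parsed JSON body.

```python
import time


class SearchAPIError(Exception):
    """Raised when a response body contains an `error` field (hypothetical)."""

    def __init__(self, error, message=None, retry_after=None):
        super().__init__(message or error)
        self.error = error
        self.message = message
        self.retry_after = retry_after


def raise_for_error(body):
    """Return the body unchanged, or raise if it is an error body."""
    if "error" in body:
        raise SearchAPIError(body["error"], body.get("message"), body.get("retry_after"))
    return body


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() and retry only errors that include retry_after."""
    for attempt in range(1, max_attempts + 1):
        try:
            return raise_for_error(send())
        except SearchAPIError as exc:
            # Errors without retry_after (bad query, bad key) are not retried.
            if exc.retry_after is None or attempt == max_attempts:
                raise
            sleep(exc.retry_after)  # assumed to be seconds


# Offline demonstration: one rate limit error, then a successful body.
bodies = iter([
    {"error": "Rate limit exceeded", "retry_after": 60},
    {"meta": {}, "results": []},
])
waits = []
result = call_with_retry(lambda: next(bodies), sleep=waits.append)
print(waits, result["results"])  # [60] []
```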
Rate Limits
The Search API has the following rate limits:
- 30 requests per minute per IP address
- Burst allowance: 5 additional requests
Responses include the rate limit headers X-RateLimit-Remaining and X-RateLimit-Reset.
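A client can stay under the per-minute limit by pacing its own requests. The following is a sketch of a sliding-window limiter, one possible client-side approach rather than how the service enforces its limit. `SlidingWindowLimiter` is a hypothetical class, and the burst allowance is deliberately ignored because its exact semantics are not described here. The response headers remain the authoritative source for the remaining quota.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls in any window_seconds span (hypothetical)."""

    def __init__(self, max_requests=30, window_seconds=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of requests inside the current window

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0.0 if none)."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def acquire(self):
        """Block until a request slot is free, then record the request."""
        while (delay := self.wait_time()) > 0:
            self._sleep(delay)
        self._sent.append(self._clock())


limiter = SlidingWindowLimiter()  # 30 requests per 60 seconds
limiter.acquire()                 # call this before each request
print(limiter.wait_time())        # 0.0 while under the limit
```

The clock and sleep functions are injectable so the limiter can be exercised in tests without waiting in real time.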
Need Higher Limits?
Contact us at [email protected] for custom enterprise rate limits.