AI Audio to Text Transcriber
Pricing
Pay per event
Transcribe audio files to text using OpenAI Whisper. Accepts public audio URLs (MP3, MP4, M4A, WAV, WEBM, OGG, FLAC) and returns full transcripts with language, duration, and timed segments. BYO OpenAI key required.
Developer
BowTiedRaccoon
Maintained by Community
Transcribe audio files to text using OpenAI Whisper. Supply a list of public audio file URLs and your OpenAI API key — the actor downloads each file, sends it to the Whisper API, and returns a verbatim transcript alongside language detection, duration, and timed segments.
What it does
- Accepts a list of public audio file URLs (MP3, MP4, M4A, WAV, WEBM, OGG, FLAC)
- Downloads each file to temporary storage (max 25 MB per file — OpenAI limit)
- Transcribes via OpenAI Whisper (`whisper-1`) with `verbose_json` output
- Returns the full text transcript, detected language, audio duration, and segment-level timestamps
- Processes up to 3 files concurrently for faster batch runs
- Saves one dataset record per file, including error records for files that fail
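The "up to 3 files concurrently" and "one record per file, including errors" behaviour listed above can be pictured with a small sketch. This is not the actor's source code: `fake_transcribe` is a hypothetical stand-in for the download and Whisper call, and the record is reduced to a few of the documented fields.

```python
import asyncio

MAX_CONCURRENCY = 3  # mirrors the concurrency limit described in this README


async def fake_transcribe(url: str) -> str:
    """Hypothetical stand-in for downloading a file and calling Whisper."""
    await asyncio.sleep(0.01)
    if url.endswith(".txt"):
        raise ValueError("Unsupported file type")
    return f"transcript of {url}"


async def process_one(url: str, sem: asyncio.Semaphore) -> dict:
    """Return exactly one record per URL, whether the work succeeds or fails."""
    async with sem:  # at most MAX_CONCURRENCY coroutines are inside this block at once
        try:
            text = await fake_transcribe(url)
            return {"sourceUrl": url, "transcript": text,
                    "status": "success", "errorMsg": None}
        except Exception as exc:
            # Failures become error records instead of being dropped.
            return {"sourceUrl": url, "transcript": None,
                    "status": "error", "errorMsg": str(exc)}


async def process_all(urls: list[str]) -> list[dict]:
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    # gather() preserves input order, so records line up with the URL list.
    return await asyncio.gather(*(process_one(u, sem) for u in urls))


if __name__ == "__main__":
    demo = ["https://example.com/a.mp3", "https://example.com/notes.txt"]
    for record in asyncio.run(process_all(demo)):
        print(record["sourceUrl"], record["status"])
```

A semaphore is used here because it caps in-flight work without splitting the URL list into fixed batches, so a slow file does not hold up the files queued behind it.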
Use cases
- Podcast indexing and search
- Meeting recording notes
- Compliance and call-center transcription
- Generating training data for NLP models
- Subtitles and captions for video content
- Multilingual content analysis
Input
| Field | Type | Required | Description |
|---|---|---|---|
| `audioUrls` | Array | Yes | Public audio file URLs to transcribe |
| `openaiApiKey` | String | Yes | Your OpenAI API key (`sk-...`). Not stored. |
| `language` | String | No | ISO 639-1 hint (e.g. `en`, `es`, `ja`). Omit for auto-detect. |
| `maxItems` | Integer | No | Maximum files to transcribe per run. Default: 15. |
Supported audio formats: MP3, MP4, M4A, WAV, WEBM, OGG, FLAC

Max file size: 25 MB (OpenAI Whisper hard limit)
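Before starting a run, it can help to screen URLs locally against the format list and size limit above. The sketch below is a client-side convenience, not part of the actor. It assumes "25 MB" means 25 × 1024 × 1024 bytes, and the extension check is only a heuristic because a URL without a file extension may still serve valid audio. The helper names are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".m4a", ".wav", ".webm", ".ogg", ".flac"}
MAX_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" interpreted as 25 MiB


def has_supported_extension(url: str) -> bool:
    """True if the URL path (query string ignored) ends in a listed extension."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return suffix in SUPPORTED_EXTENSIONS


def within_size_limit(size_bytes: int) -> bool:
    """True if a file of this size fits under the assumed 25 MB limit."""
    return size_bytes <= MAX_BYTES
```

The size could come, for example, from a `Content-Length` response header, where the server provides one.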
Example input
{
  "audioUrls": [
    "https://example.com/podcast-episode-1.mp3",
    "https://example.com/meeting-recording.wav"
  ],
  "openaiApiKey": "sk-...",
  "language": "en",
  "maxItems": 10
}
Output
One dataset record per audio file.
| Field | Type | Description |
|---|---|---|
| `sourceUrl` | String | Original audio file URL |
| `transcript` | String | Full verbatim transcription text |
| `language` | String | Detected language name (e.g. `english`, `spanish`) |
| `durationSeconds` | Number | Audio duration in seconds |
| `segments` | String | JSON-encoded array of timed segments `[{start, end, text}]`; parse it before use |
| `model` | String | Whisper model used (`whisper-1`) |
| `transcribedAt` | String | ISO 8601 timestamp |
| `status` | String | `success` or `error` |
| `errorMsg` | String | Error description on failure, `null` on success |
Example output record
{
  "sourceUrl": "https://example.com/podcast-ep1.mp3",
  "transcript": "Welcome to today's episode. Today we're discussing the future of AI...",
  "language": "english",
  "durationSeconds": 1823.4,
  "segments": "[{\"start\":0.0,\"end\":3.2,\"text\":\"Welcome to today's episode.\"}]",
  "model": "whisper-1",
  "transcribedAt": "2026-05-26T12:00:00Z",
  "status": "success",
  "errorMsg": null
}
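Because `segments` arrives as a JSON-encoded string, it has to be parsed before the timestamps can be used. As an illustration tied to the subtitles use case, the sketch below turns one output record into SRT caption text. It assumes `start` and `end` are in seconds, as the example record suggests. The function names are hypothetical and not part of the actor.

```python
import json


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def record_to_srt(record: dict) -> str:
    """Build SRT text from a dataset record's `segments` field."""
    segments = json.loads(record["segments"])  # the field is a string, not a list
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{time_range}\n{seg['text'].strip()}")
    return "\n\n".join(blocks) + "\n"
```

Applied to the example record above, this produces a single caption running from `00:00:00,000` to `00:00:03,200`.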
Requirements
- OpenAI API key: Bring your own key, created at https://platform.openai.com/api-keys. Whisper pricing is approximately $0.006 per minute of audio (billed by OpenAI to your account).
- Public audio URLs: Files must be publicly accessible without authentication.
Pricing
This actor charges $0.10 per start + $0.001 per file processed (including error records). OpenAI Whisper API costs are separate and billed directly to your OpenAI account.
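For a rough budget, the two cost components can be added up as follows. This is only an estimate: the Whisper rate is the approximate figure quoted under Requirements and may change, and the sketch assumes every file is actually transcribed. The function name is hypothetical.

```python
ACTOR_START_FEE = 0.10      # USD per run, from the pricing above
ACTOR_PER_FILE_FEE = 0.001  # USD per file processed, error records included
WHISPER_PER_MINUTE = 0.006  # USD, approximate; check OpenAI's current pricing


def estimate_cost(num_files: int, total_audio_minutes: float) -> dict:
    """Estimate actor fees and OpenAI fees for one run, in USD."""
    actor = ACTOR_START_FEE + ACTOR_PER_FILE_FEE * num_files
    openai = WHISPER_PER_MINUTE * total_audio_minutes
    return {
        "actor_usd": round(actor, 4),
        "openai_usd": round(openai, 4),
        "total_usd": round(actor + openai, 4),
    }
```

For example, 10 files totalling 300 minutes of audio come to about $0.11 in actor fees plus about $1.80 in OpenAI fees, roughly $1.91 overall.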
Error handling
Files that fail to download or transcribe are not dropped — the actor saves an error record to the dataset with status: "error" and a descriptive errorMsg. This ensures your dataset always has one row per input URL for easy reconciliation.
Common errors:
- `HTTP 401`: Invalid API key
- `HTTP 429`: OpenAI rate limit exceeded (retry with fewer files or lower concurrency)
- `File exceeds 25 MB limit`: Source file too large for Whisper API
- `Download timed out`: URL not reachable within 60 seconds
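Since every input URL yields a record, reconciliation after a run can be a few list comprehensions over the dataset items. The sketch below assumes the strings listed above appear as substrings of `errorMsg`, and it treats rate limits and download timeouts as worth retrying. Both are assumptions, and the helper names are hypothetical.

```python
# Assumption: these messages indicate transient failures that may succeed on retry.
RETRYABLE_MARKERS = ("HTTP 429", "Download timed out")


def split_records(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate dataset records into (successful, failed)."""
    ok = [r for r in records if r.get("status") == "success"]
    failed = [r for r in records if r.get("status") == "error"]
    return ok, failed


def urls_to_retry(records: list[dict]) -> list[str]:
    """Source URLs of error records whose message looks transient."""
    return [
        r["sourceUrl"]
        for r in records
        if r.get("status") == "error"
        and any(marker in (r.get("errorMsg") or "") for marker in RETRYABLE_MARKERS)
    ]


def missing_urls(input_urls: list[str], records: list[dict]) -> list[str]:
    """Input URLs with no record at all, e.g. those cut off by maxItems."""
    seen = {r["sourceUrl"] for r in records}
    return [u for u in input_urls if u not in seen]
```

An invalid key (`HTTP 401`) or an oversized file is left out of the retry list on purpose, since resubmitting the same input would presumably fail again.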