Audio and Video Transcript (OpenAI Whisper)
Pricing
$4.99/month + usage
This Actor transcribes audio or video files from publicly accessible URLs using OpenAI's Whisper API. To use it, you need to provide your own OpenAI API key. It supports multiple languages and exposes several parameters for controlling the transcription process.
Rating: 1.8 (2)
Developer: Vít Tuhý
Maintained by Community
Actor stats
Bookmarked: 5
Total users: 88
Monthly active users: 1
Last modified: 10 days ago
Audio and Video Transcript
This Apify Actor transcribes audio or video files from publicly accessible URLs using OpenAI's Whisper API. Bring your own OpenAI API key. The Actor downloads each file, sends it to Whisper, and pushes the resulting transcript to the Actor's Dataset.
🚀 Features
- Automatic language detection or manual language specification (96 languages, mapped to ISO-639-1 codes for the API).
- Process multiple audio/video URLs in a single run.
- Output formats: text, json, verbose_json, srt, vtt.
- Optional segment- or word-level timestamps (with verbose_json).
- Pre-flight check enforces OpenAI's 25 MB file-size limit and aborts the download early if exceeded.
- API key is never written to logs.
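The pre-flight size check mentioned above could be approximated as follows. This is a minimal sketch of one possible approach (a HEAD request that reads `Content-Length`), not the Actor's actual implementation, which is not shown here. The names `within_limit` and `preflight` are hypothetical, and treating 25 MB as 25 × 1024 × 1024 bytes is an assumption.

```python
import urllib.request

# Assumption: "25 MB" is read as 25 * 1024 * 1024 bytes; the Actor may use another constant.
MAX_BYTES = 25 * 1024 * 1024


def within_limit(content_length, max_bytes=MAX_BYTES):
    """Return True or False when the size is known, or None when it cannot be determined."""
    if content_length is None:
        return None
    try:
        size = int(content_length)
    except ValueError:
        return None
    if size < 0:
        return None
    return size <= max_bytes


def preflight(url, timeout=10.0):
    """Send a HEAD request and check the advertised size before downloading anything."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return within_limit(response.headers.get("Content-Length"))
```

A server may omit `Content-Length`, in which case `preflight` returns `None` and the size would have to be enforced while streaming the download instead.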
🔧 Input Configuration
| Parameter | Description | Required |
|---|---|---|
| url | Array of objects with a url field pointing to a publicly accessible audio/video file (≤25 MB). | ✅ |
| language | Audio language, or Auto-detect. | ❌ |
| response_format | text (default), json, verbose_json, srt, or vtt. | ❌ |
| timestamp_granularity | none (default), segment, or word. Requires response_format: "verbose_json". | ❌ |
| temperature | Sampling temperature as a string between "0.0" and "1.0". Default "0.0". | ❌ |
| prompt | Optional text guiding style or correcting names/spellings. Must match the audio language. | ❌ |
| openai_api_key | Your OpenAI API key. Marked secret in the Apify platform. | ✅ |
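The constraints in the table can be checked locally before starting a run. The sketch below is illustrative only: `validate_input` is a hypothetical helper, not part of the Actor, and it encodes just the rules stated in the table (required fields, allowed values, the `verbose_json` requirement for timestamps, and the string-typed temperature range).

```python
RESPONSE_FORMATS = {"text", "json", "verbose_json", "srt", "vtt"}
GRANULARITIES = {"none", "segment", "word"}


def validate_input(run_input):
    """Return a list of problems; an empty list means the input passes these checks."""
    problems = []

    # url: non-empty array of objects that each carry a url field.
    urls = run_input.get("url")
    if not isinstance(urls, list) or not urls:
        problems.append("url must be a non-empty array")
    elif not all(isinstance(u, dict) and u.get("url") for u in urls):
        problems.append("each url entry must be an object with a url field")

    if not run_input.get("openai_api_key"):
        problems.append("openai_api_key is required")

    response_format = run_input.get("response_format", "text")
    if response_format not in RESPONSE_FORMATS:
        problems.append(f"unknown response_format: {response_format}")

    # Timestamps other than "none" are only valid with verbose_json.
    granularity = run_input.get("timestamp_granularity", "none")
    if granularity not in GRANULARITIES:
        problems.append(f"unknown timestamp_granularity: {granularity}")
    elif granularity != "none" and response_format != "verbose_json":
        problems.append("timestamp_granularity requires response_format verbose_json")

    # temperature is passed as a string such as "0.0", not as a number.
    temperature = run_input.get("temperature", "0.0")
    try:
        in_range = 0.0 <= float(temperature) <= 1.0
    except (TypeError, ValueError):
        in_range = False
    if not isinstance(temperature, str) or not in_range:
        problems.append('temperature must be a string between "0.0" and "1.0"')

    return problems
```

For example, requesting `"timestamp_granularity": "word"` together with `"response_format": "text"` yields a single problem describing the `verbose_json` requirement.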
📥 Example Input
{
  "url": [
    { "url": "https://github.com/vittuhy/samples/raw/refs/heads/main/sample-audio.mp3" }
  ],
  "language": "English",
  "response_format": "verbose_json",
  "timestamp_granularity": "word",
  "temperature": "0.0",
  "prompt": "",
  "openai_api_key": "YOUR_OPENAI_API_KEY"
}
📤 Output
Each processed URL pushes one record to the Actor's Dataset:
{
  "url": "https://...mp3",
  "language": "en",
  "response_format": "verbose_json",
  "timestamp_granularity": "word",
  "transcript": { /* OpenAI Whisper response */ }
}
If a file fails to download or transcribe, the record includes an error field instead of transcript.
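Because a record carries either `transcript` or `error`, downstream code usually needs to separate the two. The sketch below shows one way to do that. The helper names are hypothetical, and it assumes that `transcript` is an object with a `text` field for the json and verbose_json formats and a plain string for text, srt, and vtt; the exact stored shape is not documented here.

```python
def split_records(records):
    """Separate successful records from those that carry an error field."""
    ok, failed = [], []
    for record in records:
        (failed if "error" in record else ok).append(record)
    return ok, failed


def transcript_text(record):
    """Return the transcript as a string, whatever the response format was."""
    transcript = record.get("transcript")
    if isinstance(transcript, dict):
        # Assumed shape for json / verbose_json: an object with a "text" field.
        return transcript.get("text", "")
    # Assumed shape for text / srt / vtt: the raw string (or nothing on failure).
    return transcript or ""
```

Failed records can then be logged or retried by URL without touching the successful transcripts.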
Datasets are accessible via the Apify Console, the Apify API, and integrations (Zapier, Make, Google Sheets, etc.).
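For access through the Apify API, a dataset's items can be fetched over HTTPS. The following is a rough sketch using only the standard library. The function names are hypothetical, the dataset ID and token are placeholders you supply, and it assumes the public dataset items endpoint with a Bearer token; check the Apify API reference before relying on it.

```python
import json
import urllib.parse
import urllib.request


def dataset_items_url(dataset_id, fmt="json"):
    """Build the URL for a dataset's items (assumed Apify API v2 endpoint)."""
    query = urllib.parse.urlencode({"format": fmt})
    safe_id = urllib.parse.quote(dataset_id, safe="")
    return f"https://api.apify.com/v2/datasets/{safe_id}/items?{query}"


def fetch_items(dataset_id, token):
    """Download all items of a dataset as a list of records."""
    request = urllib.request.Request(
        dataset_items_url(dataset_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

The dataset ID for a given run is shown in the Apify Console alongside the run's storage.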