Audio and Video Transcript (OpenAI Whisper)

Pricing

$4.99/month + usage

This Actor transcribes audio or video files from publicly accessible URLs using OpenAI's Whisper API. You must provide your own OpenAI API key. It supports multiple languages and exposes parameters (output format, timestamps, temperature, prompt) for controlling the transcription.

Rating: 1.8 (2)
Developer: Vít Tuhý
Maintained by Community
Actor stats: 5 bookmarked, 88 total users, 1 monthly active user
Last modified: 10 days ago

Audio and Video Transcript

This Apify Actor transcribes audio or video files from publicly accessible URLs using OpenAI's Whisper API. Bring your own OpenAI API key. The Actor downloads each file, sends it to Whisper, and pushes the resulting transcript to the Actor's Dataset.
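The "sends it to Whisper" step can be pictured as a call to OpenAI's transcription endpoint. The sketch below is not the Actor's source code. It assumes the official `openai` Python package and the `whisper-1` model; the helper names `build_whisper_kwargs` and `transcribe_file` and the three-entry language table are illustrative only.

```python
from typing import Any

# Illustrative subset; the Actor is described as mapping 96 languages.
LANGUAGE_CODES = {"English": "en", "German": "de", "Czech": "cs"}


def build_whisper_kwargs(actor_input: dict[str, Any]) -> dict[str, Any]:
    """Translate Actor-style input fields into transcription request arguments."""
    kwargs: dict[str, Any] = {
        "model": "whisper-1",
        "response_format": actor_input.get("response_format", "text"),
        # The Actor takes temperature as a string; the API expects a number.
        "temperature": float(actor_input.get("temperature", "0.0")),
    }
    # Leaving "language" out lets Whisper detect the language itself.
    language = actor_input.get("language", "Auto-detect")
    if language != "Auto-detect":
        kwargs["language"] = LANGUAGE_CODES[language]
    # "none" means no timestamp request is sent at all.
    granularity = actor_input.get("timestamp_granularity", "none")
    if granularity != "none":
        kwargs["timestamp_granularities"] = [granularity]
    if actor_input.get("prompt"):
        kwargs["prompt"] = actor_input["prompt"]
    return kwargs


def transcribe_file(path: str, api_key: str, actor_input: dict[str, Any]):
    """Send one local file to the transcription endpoint (requires `pip install openai`)."""
    from openai import OpenAI  # lazy import keeps build_whisper_kwargs dependency-free

    client = OpenAI(api_key=api_key)
    with open(path, "rb") as audio:
        return client.audio.transcriptions.create(
            file=audio, **build_whisper_kwargs(actor_input)
        )
```

Keeping the parameter mapping in its own pure function makes it easy to check what would be sent without spending API credits.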


🚀 Features

  • Automatic language detection or manual language specification (96 languages, mapped to ISO-639-1 codes for the API).
  • Process multiple audio/video URLs in a single run.
  • Output formats: text, json, verbose_json, srt, vtt.
  • Optional segment- or word-level timestamps (with verbose_json).
  • Pre-flight check enforces OpenAI's 25 MB file-size limit and aborts the download early if exceeded.
  • API key is never written to logs.
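
The pre-flight size check could work roughly as follows. This is a minimal sketch, not the Actor's implementation: it assumes "25 MB" means 25 × 1024 × 1024 bytes, that the server may or may not send a `Content-Length` header, and that the download arrives as a stream of byte chunks. `FileTooLargeError`, `check_declared_size`, and `read_capped` are hypothetical names.

```python
from typing import Iterable, Optional

MAX_BYTES = 25 * 1024 * 1024  # assumed interpretation of the 25 MB limit


class FileTooLargeError(Exception):
    """Raised when a file is, or turns out to be, larger than the limit."""


def check_declared_size(content_length: Optional[str], limit: int = MAX_BYTES) -> None:
    """Reject up front when the server declares a size above the limit."""
    if content_length is not None and content_length.isdigit():
        if int(content_length) > limit:
            raise FileTooLargeError(f"declared size {content_length} exceeds {limit} bytes")


def read_capped(chunks: Iterable[bytes], limit: int = MAX_BYTES) -> bytes:
    """Accumulate streamed chunks, aborting as soon as the running total exceeds the limit."""
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        if len(buffer) > limit:
            raise FileTooLargeError(f"download exceeded {limit} bytes")
    return bytes(buffer)
```

The second function matters because a missing or wrong `Content-Length` header would otherwise let an oversized file through the first check.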

🔧 Input Configuration

| Parameter | Description | Required |
|---|---|---|
| url | Array of objects with a url field pointing to a publicly accessible audio/video file (≤25 MB). | |
| language | Audio language, or Auto-detect. | |
| response_format | text (default), json, verbose_json, srt, or vtt. | |
| timestamp_granularity | none (default), segment, or word. Requires response_format: "verbose_json". | |
| temperature | Sampling temperature as a string between "0.0" and "1.0". Default "0.0". | |
| prompt | Optional text guiding style or correcting names/spellings. Must match the audio language. | |
| openai_api_key | Your OpenAI API key. Marked secret in the Apify platform. | |
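
The constraints in the table can be checked before starting a run. The sketch below is a hypothetical client-side validator, not part of the Actor. It assumes `url` and `openai_api_key` must be present and non-empty, and it treats `"none"` as the only granularity allowed without `verbose_json`.

```python
from typing import Any

RESPONSE_FORMATS = {"text", "json", "verbose_json", "srt", "vtt"}
GRANULARITIES = {"none", "segment", "word"}


def validate_input(actor_input: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems: list[str] = []

    urls = actor_input.get("url")
    if not isinstance(urls, list) or not urls or not all(
        isinstance(item, dict) and item.get("url") for item in urls
    ):
        problems.append("url must be a non-empty array of objects with a url field")

    response_format = actor_input.get("response_format", "text")
    if response_format not in RESPONSE_FORMATS:
        problems.append(f"unknown response_format: {response_format!r}")

    granularity = actor_input.get("timestamp_granularity", "none")
    if granularity not in GRANULARITIES:
        problems.append(f"unknown timestamp_granularity: {granularity!r}")
    elif granularity != "none" and response_format != "verbose_json":
        problems.append("timestamp_granularity requires response_format verbose_json")

    # Temperature is passed as a string, so parse it before range-checking.
    try:
        temperature = float(actor_input.get("temperature", "0.0"))
    except (TypeError, ValueError):
        problems.append("temperature must be a numeric string")
    else:
        if not 0.0 <= temperature <= 1.0:
            problems.append("temperature must be between 0.0 and 1.0")

    if not actor_input.get("openai_api_key"):
        problems.append("openai_api_key is missing")

    return problems
```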

📥 Example Input

{
  "url": [
    { "url": "https://github.com/vittuhy/samples/raw/refs/heads/main/sample-audio.mp3" }
  ],
  "language": "English",
  "response_format": "verbose_json",
  "timestamp_granularity": "word",
  "temperature": "0.0",
  "prompt": "",
  "openai_api_key": "YOUR_OPENAI_API_KEY"
}
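
An input like this can also be submitted programmatically. The following sketch assumes the `apify-client` Python package, an `APIFY_TOKEN` environment variable, and a version of the client in which `call` returns a dictionary containing `defaultDatasetId`. The Actor ID is not stated here, so `actor_id` is a placeholder you would copy from the Actor's page; `build_run_input` and `run_transcription` are illustrative names.

```python
import os
from typing import Any


def build_run_input(urls: list[str], openai_api_key: str, **options: Any) -> dict[str, Any]:
    """Wrap plain URL strings in the array-of-objects shape the Actor expects."""
    run_input: dict[str, Any] = {
        "url": [{"url": u} for u in urls],
        "openai_api_key": openai_api_key,
    }
    run_input.update(options)  # e.g. language, response_format, temperature
    return run_input


def run_transcription(actor_id: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Start the Actor, wait for it to finish, and return its dataset records."""
    from apify_client import ApifyClient  # requires `pip install apify-client`

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A call might look like `run_transcription("<actor-id>", build_run_input([...], os.environ["OPENAI_API_KEY"], language="English"))`, with `<actor-id>` replaced by the real identifier.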

📤 Output

Each processed URL pushes one record to the Actor's Dataset:

{
  "url": "https://...mp3",
  "language": "en",
  "response_format": "verbose_json",
  "timestamp_granularity": "word",
  "transcript": { /* OpenAI Whisper response */ }
}

If a file fails to download or transcribe, the record includes an error field instead of transcript.
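
Downstream code therefore needs to handle both record shapes. The sketch below shows one way to do it. It assumes that `transcript` is a plain string for the text, srt, and vtt formats and an object with a `text` field for json and verbose_json, and that word timestamps arrive in a `words` array with `word`, `start`, and `end` keys, as in OpenAI's verbose_json responses. The function names are illustrative.

```python
from typing import Any

Record = dict[str, Any]


def split_records(records: list[Record]) -> tuple[list[Record], list[Record]]:
    """Separate successful records from those carrying an error field."""
    succeeded = [r for r in records if "error" not in r]
    failed = [r for r in records if "error" in r]
    return succeeded, failed


def transcript_text(record: Record) -> str:
    """Return the transcript as a string regardless of response format."""
    transcript = record["transcript"]
    if isinstance(transcript, str):  # assumed shape for text, srt, vtt
        return transcript
    return transcript.get("text", "")


def word_timings(record: Record) -> list[tuple[str, float, float]]:
    """Extract (word, start, end) triples when word-level timestamps are present."""
    transcript = record["transcript"]
    if not isinstance(transcript, dict):
        return []
    return [(w["word"], w["start"], w["end"]) for w in transcript.get("words", [])]
```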

Datasets are accessible via the Apify Console, the Apify API, and integrations (Zapier, Make, Google Sheets, etc.).