Pricing

from $20.00 / 1,000 transcribed minutes

Video & Audio Transcriber — Word-Level + SRT/VTT

Transcribes any public video or audio URL (mp4, mov, mp3, wav, m4a, webm) into text with word-level and segment timestamps, plus downloadable SRT, VTT, and TXT files. Auto-detects language. Main use: generate subtitle files from a video.

Rating

5.0

(2)

Developer

Dami's Studio

Last modified

a month ago

Video & Audio Transcriber

Give it a public video or audio URL and it returns accurate text with segment and word-level timestamps, plus ready-to-use SRT, VTT, and TXT files. It detects the spoken language automatically. Built for people who need captions, searchable transcripts, or source text to repurpose into clips, articles, or show notes.

How it works

The actor downloads your media, extracts the audio track with ffmpeg, and sends it to OpenAI's Whisper API using your own API key. The timestamps and subtitle files come straight from the model's segment and word data, so timing lines up with the actual speech.
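To illustrate how segment data can become a subtitle file, here is a minimal sketch that turns a list of segments into SRT text. It assumes each segment is a dict with `start` and `end` in seconds and a `text` string; the function names are hypothetical and this is not the actor's actual code.

```python
def format_srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Build SRT text from segments shaped like {"start", "end", "text"} (assumed shape)."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_srt_time(seg["start"])
        end = format_srt_time(seg["end"])
        # Each cue: sequence number, time range, text, then a blank separator line.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 2.5, "text": " Hello there."},
        {"start": 2.5, "end": 5.0, "text": " Welcome to the show."},
    ]
    print(segments_to_srt(demo))
```

VTT differs mainly in its `WEBVTT` header line and in using a period instead of a comma before the milliseconds.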

Input

| Field | Required | Notes |
| --- | --- | --- |
| `mediaUrl` | yes | Public URL to a video or audio file (mp4, mov, mp3, wav, m4a, webm, and similar). |
| `language` | no | ISO code of the spoken language, or `auto` to detect it. Defaults to `auto`. |
| `wordTimestamps` | no | Return per-word start/end times. Useful for karaoke-style captions. On by default. |
| `outputFormats` | no | Which files to generate: any of `srt`, `vtt`, `txt`. Defaults to `srt` and `vtt`. |
| `openaiApiKey` | yes | Your OpenAI (Whisper) key. Kept private and used only for this run. |

Two advanced fields are also available: `model` (defaults to `whisper-1`) and `baseUrl`, for pointing at an OpenAI-compatible endpoint.

Output

Each run produces one dataset record. It includes the detected `language`, the full `text`, `segments` with start/end times, and `words` when word timestamps are enabled, along with `wordCount`, `segmentCount`, and `durationSeconds`. Each requested subtitle file is saved to the key-value store and referenced by `srtKey`/`srtUrl`, `vttKey`/`vttUrl`, and `txtKey`/`txtUrl`.
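As a rough sketch of consuming that record, the helpers below collect whichever subtitle URLs are present and derive a speaking rate from `wordCount` and `durationSeconds`. The field names come from the description above; the helper names and the sample values are hypothetical.

```python
SUBTITLE_FORMATS = ("srt", "vtt", "txt")


def subtitle_links(record: dict) -> dict:
    """Return {format: url} for each subtitle file the record references."""
    links = {}
    for fmt in SUBTITLE_FORMATS:
        url = record.get(f"{fmt}Url")
        if url:  # formats that were not requested are simply absent
            links[fmt] = url
    return links


def words_per_minute(record: dict) -> float:
    """Average speaking rate, or 0.0 when the duration is missing or zero."""
    duration = record.get("durationSeconds") or 0
    if duration <= 0:
        return 0.0
    return record.get("wordCount", 0) / (duration / 60)


if __name__ == "__main__":
    # Illustrative record only; real URLs and values will differ.
    sample = {
        "language": "en",
        "wordCount": 300,
        "durationSeconds": 120,
        "srtUrl": "https://example.com/transcript.srt",
        "vttUrl": "https://example.com/transcript.vtt",
    }
    print(subtitle_links(sample))
    print(words_per_minute(sample))
```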

Example

{
  "mediaUrl": "https://example.com/podcast.mp3",
  "language": "auto",
  "wordTimestamps": true,
  "outputFormats": ["srt", "vtt", "txt"],
  "openaiApiKey": "sk-..."
}
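For running the actor from code, one option is the `apify-client` Python package. The sketch below builds the same input as the example above and starts a run. The actor ID is not given on this page, so it is left as a parameter; reading the token from an `APIFY_TOKEN` environment variable and the helper names are assumptions as well.

```python
import os

ALLOWED_FORMATS = {"srt", "vtt", "txt"}


def build_run_input(media_url, openai_api_key, language="auto",
                    word_timestamps=True, output_formats=("srt", "vtt")):
    """Assemble the input object, using the defaults listed in the Input table."""
    unknown = set(output_formats) - ALLOWED_FORMATS
    if unknown:
        raise ValueError(f"Unsupported output formats: {sorted(unknown)}")
    return {
        "mediaUrl": media_url,
        "language": language,
        "wordTimestamps": word_timestamps,
        "outputFormats": list(output_formats),
        "openaiApiKey": openai_api_key,
    }


def run_transcriber(actor_id, run_input):
    """Start a run and return its single dataset record, or None if empty.

    actor_id is a placeholder supplied by the caller (copy it from the Store page).
    """
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed env variable
    run = client.actor(actor_id).call(run_input=run_input)
    items = client.dataset(run["defaultDatasetId"]).list_items().items
    return items[0] if items else None
```

Calling `run_transcriber("<username>/<actor-name>", build_run_input(url, key))` would then return the record described under Output.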

Pricing

$0.04 per minute of audio, pay per result, no subscription. You bring your own OpenAI key, so Whisper usage is billed by OpenAI separately.
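A quick way to budget a run is to multiply the audio length by the per-minute rate. This sketch assumes the $0.04 rate quoted in this section and proportional billing for partial minutes, which the page does not confirm; the listing header quotes a "from" price of $20.00 per 1,000 minutes, so the rate is left as a parameter. OpenAI's own Whisper charges are not included.

```python
def estimate_actor_cost(duration_seconds: float, rate_per_minute: float = 0.04) -> float:
    """Estimate the actor fee in USD for a file of the given length.

    Assumes fractional minutes are billed proportionally (unconfirmed).
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return round(duration_seconds / 60 * rate_per_minute, 2)


if __name__ == "__main__":
    # A one-hour podcast at the default rate.
    print(estimate_actor_cost(3600))
    # The same file at the listing's "from" rate of $20.00 per 1,000 minutes.
    print(estimate_actor_cost(3600, rate_per_minute=20.00 / 1000))
```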

Notes

The `mediaUrl` has to be directly downloadable. Pages that require login or stream behind a player won't work, so point it at the raw file. Long files take longer and cost more since billing is per minute of audio.
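Before starting a run, a cheap pre-check can catch URLs that point at a player page rather than a raw file. The heuristic below only inspects the URL's path extension against the formats named in the Input table. It is an assumption-laden sketch rather than the actor's own validation, and a direct link without a file extension would be rejected by it even though the actor might accept it.

```python
import posixpath
from urllib.parse import urlparse

# Extensions listed on this page; the actor also accepts "similar" formats.
MEDIA_EXTENSIONS = {".mp4", ".mov", ".mp3", ".wav", ".m4a", ".webm"}


def looks_like_direct_media_url(url: str) -> bool:
    """Heuristic: http(s) URL whose path ends in a known media extension."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    # Query strings and fragments are ignored; only the path is checked.
    _, ext = posixpath.splitext(parsed.path)
    return ext.lower() in MEDIA_EXTENSIONS


if __name__ == "__main__":
    for candidate in (
        "https://example.com/podcast.mp3",
        "https://example.com/watch?v=abc",
    ):
        print(candidate, looks_like_direct_media_url(candidate))
```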

Audio & Video to Text

donjuan_mime/audio-video-to-text

Transcribes video and audio files into plain text and subtitle formats (TXT, SRT, VTT, TSV, JSON) using OpenAI's Whisper model. Supports preloaded tiny, base, and small models.

Donjuan

YouTube Transcript Scraper — Batch + SRT/VTT Export

vanity_arias/youtube-transcript-scraper-batch

Extract YouTube video transcripts in bulk — paste video URLs, IDs, or Shorts links and get clean text, timestamped segments, and ready-to-use SRT/VTT subtitle files. No API key, failed videos never charged.

Nvikelo Nyathi

$0.15/min REAL YouTube Transcriber & Subtitles (JSON/SRT/VTT)

practicaltools/apify-youtube-transcribe

Download and transcribe YouTube videos into text and subtitle files – quickly, locally, and without external APIs. This Apify actor uses Faster-Whisper to generate transcripts and captions. It saves results in TXT, JSON, SRT, and VTT formats, plus provides a summary in the Dataset.

Practical Tools

5.0

Mp4 To Mp3

talkbot/mp4-to-mp3

Video To Sound Convertor

Ali Hashemi

Subtitle Translator — SRT & VTT

dami_studio/subtitle-translator

Translate subtitles into many languages at once. Paste an SRT/VTT file (or give a video URL to auto-transcribe), pick target languages, and get clean translated SRT + VTT back — timings preserved. For localization, accessibility, and multi-language publishing.

Dami's Studio

Video & Audio Transcriber · Whisper Speech-to-Text

memo23/video-audio-transcriber

Transcribe any video or audio URL to text with Whisper running inside the Actor — no API key. TikTok, YouTube, Instagram, Facebook, X, Rumble, podcast RSS feeds & direct files. Full text, timestamped segments, SRT + VTT subtitles, 99+ languages auto-detected. One flat rate for video and audio.

Muhamed Didovic

Transcribe Video to Text & Audio to Text — 99+ Languages

sian.agency/INCREDIBLY-FAST-audio-transcriber

Transcribe video to text and audio to text in bulk on Apify. 99+ languages, word-level timestamps, speaker diarization, SRT/VTT export. Try free.

SIÁN OÜ

133

5.0

Video & Audio to Translated Subtitles

lumaxys/professional-subtitle-formatter

Turn audio and video into accurate, professionally formatted subtitles. Automatically transcribe, translate into multiple languages, and export ready-to-use SRT, VTT, TXT, and JSON files. No API keys required.

François Fernandez

Large Video to Transcript

esteemed_chimta/large-video-to-transcript

Convert large video and audio files into speaker-labeled transcript bundles with TXT, JSON, SRT, VTT, and quality reports.

Will Pulier

Dailymotion Transcript Scraper — Subtitles to TXT, SRT, VTT

scrapersdelight/dailymotion-transcript-scraper

Extract any public Dailymotion video's subtitle transcript — no login, no ASR. By video URL/ID or a search query: full text, timestamped segments & SRT/VTT, plus title, owner and duration, from Dailymotion's own subtitle tracks. $2 per 1,000 videos.