Pricing

Pay per event

Audio & Video to Text

Developed by

Donjuan

Transcribes video and audio files into plain text and subtitle formats (TXT, SRT, VTT, TSV, JSON) using OpenAI's Whisper model. Supports preloaded tiny, base, and small models.

Last modified

3 days ago

Social media

Developer tools

Open source

🎬 Video and Audio to Text Transcription

🧠 Overview

This script is designed for the Apify platform and uses OpenAI Whisper to transcribe audio or video files available at a direct URL (e.g., a hosted MP4 file) into plain text and subtitle formats (TXT, SRT, VTT, TSV, JSON).

📥 Input

Parameters

model: (string) — Whisper model to use. Available options:
- tiny ✅ (pre-installed)
- base ✅ (pre-installed)
- small ✅ (pre-installed)
- medium (requires download)
- large (requires download)
- turbo (requires download)

✅ Note: Models tiny, base, and small are already downloaded in the Docker image for faster and offline-ready processing.

source_url: (string) — Direct URL to the video/audio file (e.g., an MP4 file hosted online).
⚠️ YouTube links are not supported directly. You must download the video first.

Example Input

{
  "model": "tiny",
  "source_url": "https://raw.githubusercontent.com/donjuanMime/audio_to_text/main/video.mp4"
}
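
The constraints above (six model names, a direct file URL, no YouTube links) can be checked before starting a run. The following is a minimal sketch, not part of the Actor itself: the function name `check_input` and the list of YouTube hostnames are assumptions made for illustration.

```python
from urllib.parse import urlparse

# Model names as listed in the Parameters section.
PREINSTALLED = {"tiny", "base", "small"}
DOWNLOADABLE = {"medium", "large", "turbo"}
# Assumed set of hostnames to treat as YouTube links (not from the Actor).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def check_input(actor_input: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks usable."""
    problems = []
    model = actor_input.get("model")
    if model not in PREINSTALLED | DOWNLOADABLE:
        problems.append(f"unknown model: {model!r}")
    url = urlparse(str(actor_input.get("source_url", "")))
    if url.scheme not in ("http", "https") or not url.netloc:
        problems.append("source_url must be an http(s) URL")
    elif url.hostname in YOUTUBE_HOSTS:
        problems.append("YouTube links are not supported; download the file first")
    return problems
```

Calling `check_input` on the example input above returns an empty list. A model outside the documented set or a `youtu.be` link each add one entry to the returned list.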

📤 Output

The output is a JSON array with one object, which includes multiple transcription formats:

json: Full Whisper output with segments, tokens, and metadata, serialized as a JSON string.
srt: SubRip subtitle format.
tsv: Tab-separated values (start, end, text), with start and end given in milliseconds.
txt: Plain text transcription.
vtt: WebVTT subtitle format.

Example Output (excerpt)

[
  {
    "json": "{ ... Whisper segment data ... }",
    "srt": "1\n00:00:00,000 --> 00:00:01,120\nWhat's your favorite drink?\n...",
    "tsv": "start\tend\ttext\n0\t1120\tWhat's your favorite drink?\n...",
    "txt": "What's your favorite drink?\nMy favorite drink is apple juice...\n",
    "vtt": "WEBVTT\n\n00:00.000 --> 00:01.120\nWhat's your favorite drink?\n..."
  }
]
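
Because every format arrives as a string inside one object, a consumer usually has to parse or save the fields itself. The sketch below shows one way to do that with the standard library. It assumes the `tsv` times are integer milliseconds, as the excerpt suggests (`1120` next to `00:00:01,120` in the SRT). The helper names `parse_tsv` and `save_formats` are hypothetical.

```python
import csv
import io
from pathlib import Path


def parse_tsv(tsv_text: str) -> list[dict]:
    """Turn the "tsv" field into segments with start/end converted to seconds."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return [
        {
            "start": int(row["start"]) / 1000,  # assumed to be milliseconds
            "end": int(row["end"]) / 1000,
            "text": row["text"],
        }
        for row in reader
    ]


def save_formats(item: dict, out_dir: str, stem: str = "transcript") -> list[Path]:
    """Write each format in the output object to <out_dir>/<stem>.<format>."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for fmt in ("json", "srt", "tsv", "txt", "vtt"):
        if fmt in item:
            path = target / f"{stem}.{fmt}"
            path.write_text(item[fmt], encoding="utf-8")
            written.append(path)
    return written
```

`parse_tsv` yields one dictionary per subtitle segment, which is convenient for searching or re-timing. `save_formats` produces files that subtitle players and editors can open directly.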

🛠️ How to Use

1. Go to your Apify dashboard and create a new Actor or task.
2. Paste this script into the Actor’s source.
3. Provide the input in the required JSON format (see above).
4. Run the Actor. It downloads the media file, processes it with Whisper, and returns the transcription in multiple formats.
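
The page does not show how the Actor calls Whisper internally. As a rough local equivalent of the download-and-transcribe flow, the sketch below fetches the file and runs the open-source `whisper` command-line tool with `--output_format all`. It assumes the `openai-whisper` package and ffmpeg are installed. The function names, the `work` directory, and the `media.mp4` file name are illustrative choices, not the Actor's own.

```python
import subprocess
import urllib.request
from pathlib import Path

FORMATS = ("json", "srt", "tsv", "txt", "vtt")


def build_whisper_command(media_path: str, model: str, out_dir: str) -> list[str]:
    """Build the argument list for the whisper CLI without running it."""
    return [
        "whisper", media_path,
        "--model", model,
        "--output_format", "all",  # write every supported format
        "--output_dir", out_dir,
    ]


def transcribe(source_url: str, model: str = "tiny", work_dir: str = "work") -> dict:
    """Download the media file, run whisper on it, and collect the results."""
    work = Path(work_dir)
    work.mkdir(parents=True, exist_ok=True)
    media = work / "media.mp4"  # assumed name; the real file type may differ
    urllib.request.urlretrieve(source_url, str(media))
    subprocess.run(build_whisper_command(str(media), model, str(work)), check=True)
    # whisper names its outputs after the input file's base name.
    return {
        fmt: (work / f"media.{fmt}").read_text(encoding="utf-8")
        for fmt in FORMATS
    }
```

Keeping the command construction separate from execution makes the invocation easy to inspect or log before any download or model loading takes place.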

⚠️ Disclaimer

This script is provided "as is", without warranties of any kind. Use it at your own risk. Ensure compliance with:

YouTube’s Terms of Service (if downloading/transcribing from YouTube).
Local and international copyright laws.
