Pricing

from $0.01 / youtube transcript extraction

YouTube Transcript API - AI Training Data

Extract YouTube video transcripts optimized for AI and machine learning workflows. Features chunking for LLM context limits, SRT/VTT formats, and music symbol removal. Perfect for building training datasets, content analysis, and subtitle generation.

Developer

Tan Analytics

Last modified

4 months ago

YouTube Transcript Extractor - AI Training Data

Extract YouTube video transcripts optimized for AI and machine learning workflows.

Why This Actor?

Feature	Free Tools	This Actor
AI chunking	❌	✅ Split by character limit (2000 chars ≈ 500 tokens)
Token counting	❌	✅ Estimated tokens
Clean transcripts	❌	✅ Remove ♪ [music]
SRT/VTT formats	❌	✅ All included
Video metadata	❌	✅ Title, author, thumbnail
Affordable	Limited	$0.01 per video

Use Cases

AI Training Data

Build high-quality training datasets from YouTube videos. Set the chunk size so that chunked transcripts fit your LLM's context window.

Content Analysis

Analyze video content. Get word counts, token estimates, and structured metadata.

Subtitle Generation

Export transcripts in SRT or VTT format for video editing, captions, or accessibility.

Academic Research

Extract lectures, interviews, and documentaries. Clean transcripts ready for analysis.

Features

🎯 AI-Optimized Output

Smart chunking - Split transcripts to fit your LLM's context window
Token estimation - Get an approximate token count (~1.33 tokens per word)
Clean mode - Remove music symbols (♪), [applause], [laughter] for cleaner training data
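
The cleaning and chunking steps listed above can be sketched roughly as follows. This is a minimal illustration, not the actor's own code: the regular expression, the word-boundary splitting strategy, and the function names `clean_transcript` and `chunk_text` are all assumptions.

```python
import re

# Assumption: annotations are anything in square brackets, e.g. [music],
# [applause], [laughter]. The actor's real cleaning rules may differ.
_ANNOTATION = re.compile(r"\[[^\]]*\]")


def clean_transcript(text: str) -> str:
    """Drop music symbols and bracketed annotations, then collapse whitespace."""
    text = text.replace("♪", " ")
    text = _ANNOTATION.sub(" ", text)
    return " ".join(text.split())


def chunk_text(text: str, chunk_size: int = 2000) -> list[str]:
    """Split on word boundaries so each chunk is at most chunk_size characters.

    chunk_size == 0 disables chunking, mirroring the documented input option.
    A single word longer than chunk_size is kept whole in its own chunk.
    """
    if chunk_size <= 0:
        return [text] if text else []
    chunks: list[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) > chunk_size and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Splitting on word boundaries keeps chunks readable at the cost of slightly uneven sizes; a sentence-aware splitter would be a reasonable alternative.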

📄 Multiple Formats

Plain text - Raw transcript
SRT subtitles - For video editors
VTT subtitles - Web-compatible
Timestamps - Optional [MM:SS] markers
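
The two subtitle formats differ mainly in their timestamp separator (a comma in SRT, a dot in VTT) and in VTT's `WEBVTT` header. A minimal sketch of both, assuming segments arrive as `(start_seconds, end_seconds, text)` tuples; that tuple shape and the function names are illustrative and not part of the actor's API:

```python
def format_timestamp(seconds: float, fmt: str = "srt") -> str:
    """Render seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    sep = "," if fmt == "srt" else "."
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{ms:03d}"


def to_srt(segments) -> str:
    """Build numbered SRT cues separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}")
    return "\n\n".join(blocks)


def to_vtt(segments) -> str:
    """Build a WebVTT document: header line, then unnumbered cues."""
    cues = []
    for start, end, text in segments:
        timing = (
            f"{format_timestamp(start, 'vtt')} --> {format_timestamp(end, 'vtt')}"
        )
        cues.append(f"{timing}\n{text}")
    return "WEBVTT\n\n" + "\n\n".join(cues)
```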

📊 Metadata Enrichment

Video title and author
Thumbnail URL
Duration (formatted and raw)
Word count and character count
Detected language

🔒 Reliability

Automatic proxy fallback (Direct → Datacenter → Residential)
YouTube Shorts support
Multi-language transcripts
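
The proxy fallback order above follows a common pattern: try each route in turn and stop at the first success. The sketch below shows that pattern only. The fetcher callables, the `TranscriptFetchError` class, and the function name are hypothetical, and the actor's actual retry logic is not documented here.

```python
from typing import Callable, Sequence


class TranscriptFetchError(Exception):
    """Raised when every route in the chain has failed."""


def fetch_with_fallback(
    video_id: str,
    routes: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, fetcher) pair in order; return (route_name, transcript).

    Any exception from a fetcher moves on to the next route. If all routes
    fail, the collected error messages are raised together.
    """
    errors = []
    for name, fetch in routes:
        try:
            return name, fetch(video_id)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise TranscriptFetchError("; ".join(errors))
```

A caller would pass the routes in the documented order, for example `[("direct", fetch_direct), ("datacenter", fetch_dc), ("residential", fetch_res)]`, where each fetcher is supplied by the caller.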

Pricing

$0.01 per transcript extraction

Videos	Cost
10	$0.10
100	$1.00
1,000	$10.00

No monthly commitment. Pay only for what you use.

Quick Start

Input

{
  "videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "language": "en",
  "chunkSize": 2000,
  "cleanTranscript": true,
  "outputFormat": "text"
}

Output

{
  "videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "videoId": "dQw4w9WgXcQ",
  "transcript": "♪ We're no strangers to love ♪",
  "transcriptClean": "We're no strangers to love",
  "chunks": [
    {
      "id": 0,
      "text": "We're no strangers to love...",
      "start": 1.36,
      "end": 110.0,
      "wordCount": 230
    }
  ],
  "metadata": {
    "title": "Rick Astley - Never Gonna Give You Up",
    "author": "Rick Astley",
    "thumbnailUrl": "https://img.youtube.com/vi/dQw4w9WgXcQ/maxresdefault.jpg",
    "duration": 211.32,
    "durationFormatted": "03:31",
    "wordCount": 367,
    "estimatedTokens": 488,
    "language": "en"
  },
  "transcriptSRT": "1\n00:00:01,360 --> 00:00:03,040\n♪ We're no strangers to love ♪",
  "transcriptVTT": "WEBVTT\n\n00:00:01.360 --> 00:00:03.040\n♪ We're no strangers to love ♪"
}
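
For training-data pipelines, one way to consume an output item like the one above is to flatten its chunks into JSON Lines, one record per chunk. This is a sketch that assumes the field names shown in the sample output; the record layout and the function name `chunks_to_jsonl` are illustrative choices.

```python
import json


def chunks_to_jsonl(item: dict) -> str:
    """Turn one output item into JSON Lines, one record per chunk."""
    meta = item.get("metadata", {})
    lines = []
    for chunk in item.get("chunks", []):
        record = {
            "video_id": item["videoId"],
            "chunk_id": chunk["id"],
            "title": meta.get("title"),
            "language": meta.get("language"),
            "start": chunk["start"],
            "end": chunk["end"],
            "text": chunk["text"],
        }
        # ensure_ascii=False keeps non-English transcripts human-readable.
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)
```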

Input Parameters

Parameter	Type	Default	Description
`videoUrl`	string	required	YouTube video URL
`language`	string	"en"	Preferred transcript language
`chunkSize`	integer	2000	Max chars per chunk (0 = off)
`cleanTranscript`	boolean	false	Remove music symbols (♪) and bracketed annotations such as [music], [applause], [laughter]
`includeMetadata`	boolean	true	Include video metadata
`outputFormat`	string	"text"	Format: text, srt, or vtt
`includeTimestamps`	boolean	true	Add [MM:SS] timestamps
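
When building inputs programmatically, it can help to apply the defaults from the table above and reject obviously invalid values before starting a run. The helper below is a client-side sketch: `build_input` is a hypothetical name, and the validation rules are inferred from the table rather than taken from the actor's schema.

```python
# Defaults copied from the parameter table; videoUrl is required.
DEFAULTS = {
    "language": "en",
    "chunkSize": 2000,
    "cleanTranscript": False,
    "includeMetadata": True,
    "outputFormat": "text",
    "includeTimestamps": True,
}
FORMATS = {"text", "srt", "vtt"}


def build_input(video_url: str, **overrides) -> dict:
    """Merge overrides onto the documented defaults and validate the result."""
    if not video_url:
        raise ValueError("videoUrl is required")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    run_input = {"videoUrl": video_url, **DEFAULTS, **overrides}
    if run_input["outputFormat"] not in FORMATS:
        raise ValueError("outputFormat must be text, srt, or vtt")
    size = run_input["chunkSize"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(size, bool) or not isinstance(size, int) or size < 0:
        raise ValueError("chunkSize must be a non-negative integer (0 = off)")
    return run_input
```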

Supported URLs

https://www.youtube.com/watch?v=VIDEO_ID
https://youtu.be/VIDEO_ID
https://www.youtube.com/shorts/VIDEO_ID
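
The three URL shapes above all carry the same video ID in different positions. A minimal sketch of extracting it with the standard library, covering only the forms listed here; `extract_video_id` is an illustrative helper, not part of the actor:

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID for watch, youtu.be, and shorts URLs, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    parts = [p for p in parsed.path.split("/") if p]

    if host == "youtu.be":
        # https://youtu.be/VIDEO_ID
        return parts[0] if parts else None
    if host == "youtube.com":
        if parts == ["watch"]:
            # https://www.youtube.com/watch?v=VIDEO_ID
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        if len(parts) == 2 and parts[0] == "shorts":
            # https://www.youtube.com/shorts/VIDEO_ID
            return parts[1]
    return None
```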

FAQ

Q: What if a video has no transcript?
A: The actor returns an error for that video.

Q: Can I extract transcripts in other languages?
A: Yes. Set `language` to the ISO language code (e.g., "es" for Spanish).

Q: What's the maximum chunk size?
A: The default is 2000 characters (~500 tokens). Set `chunkSize` to 0 to disable chunking.

Q: How accurate is token estimation?
A: It is a rough estimate based on ~1.33 tokens per word, so actual counts vary by tokenizer.
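
The word-based estimate described above amounts to a single multiplication. A minimal sketch, assuming words are split on whitespace and the result is rounded to the nearest integer (the rounding rule is an assumption; the ratio comes from the FAQ):

```python
def estimate_tokens(text: str, tokens_per_word: float = 1.33) -> int:
    """Approximate token count as whitespace-separated words times a ratio."""
    return round(len(text.split()) * tokens_per_word)
```

With this rule, the 367-word transcript in the sample output gives 488 tokens, which matches its `estimatedTokens` value.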

Support

Open an issue on GitHub, or contact us about enterprise pricing for large volumes.
