Pricing

$3.00 / 1,000 videos

YouTube Transcript Extractor & Caption Downloader

Extract YouTube video transcripts with timestamps, multi-language fallback, and token-efficient JSON output. Built for AI pipelines, content analysis, and accessibility.

Developer

Vnx0

Last modified

a month ago

YouTube Transcript Extractor — Download SRT Subtitles & Timestamped Captions

Extract YouTube video transcripts with precise timestamps, multi-language support, automatic proxy fallback, and downloadable SRT subtitle files. Built for AI pipelines, content analysis, accessibility workflows, and video editing — no API key required, zero browser automation, and error-resilient execution that never crashes on missing captions.

Try it live in Apify Console, or call it via the API to integrate it into your data stack.

Features

SRT subtitle download — Generate ready-to-use .srt files compatible with VLC, Premiere Pro, DaVinci Resolve, and YouTube Studio
Timestamped JSON output — Every caption segment with start time (seconds) and duration for precise alignment
Multi-language support — Any ISO 639-1 code (en, fr, de, ja, es, pt, ar, hi) with automatic regional variant fallback
Auto proxy fallback — Bypasses YouTube IP blocks automatically using Apify Proxy with Chrome TLS impersonation
100% error resilience — Never crashes. Missing captions, disabled transcripts, and unavailable videos all produce clean error rows
LLM-optimized — Minimal, token-efficient JSON with no bloat. Ready for GPT, Claude, Llama, LangChain, and LlamaIndex
Zero browser — Pure HTTP extraction. No Playwright, no Puppeteer, no headless Chrome. Fast and cheap

Why Extract YouTube Transcripts?

Video content is exploding, but text is what machines can process. A YouTube transcript extractor turns hours of spoken video into structured, searchable text data in seconds. Common use cases include:

AI & LLM training data — Feed clean, timestamped transcripts into GPT, Claude, or Llama for fine-tuning or RAG pipelines
Content repurposing — Convert YouTube videos into blog posts, newsletter content, or social media threads without manual transcription
SEO content optimization — Index video transcripts on your site to rank for spoken keywords and capture voice-search traffic
Accessibility compliance — Generate captions and SRT subtitles for hearing-impaired audiences or multilingual localization
Academic research — Analyze lecture series, conference talks, or documentary content at scale with structured text output
Media monitoring — Track competitor YouTube channels by extracting and analyzing transcript text for brand mentions and topic trends

How to Extract YouTube Transcripts

Step 1: Get a YouTube Video URL

Copy the URL of any public YouTube video. This actor supports all standard URL formats:

https://www.youtube.com/watch?v=dQw4w9WgXcQ
https://youtu.be/dQw4w9WgXcQ
https://www.youtube.com/embed/dQw4w9WgXcQ
https://www.youtube.com/shorts/abc123def45
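If you want to normalize URLs before sending them to the actor, the formats above can be reduced to a bare video ID. The following is a minimal sketch, not the actor's own parser: `extract_video_id` is a hypothetical helper, and the ID pattern (11 characters from letters, digits, `_` and `-`) is an assumption based on the examples in this README.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: a video ID is 11 characters of letters, digits, "_" or "-".
_ID_RE = re.compile(r"^[A-Za-z0-9_-]{11}$")


def extract_video_id(url_or_id: str) -> Optional[str]:
    """Return the video ID for watch, youtu.be, embed, and shorts URLs, else None."""
    candidate = url_or_id.strip()
    if _ID_RE.match(candidate):
        return candidate  # already a bare ID

    parsed = urlparse(candidate)
    host = (parsed.hostname or "").lower()
    parts = [p for p in parsed.path.split("/") if p]

    if host == "youtu.be" and parts:
        candidate = parts[0]
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        if parts[:1] == ["watch"]:
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        elif len(parts) >= 2 and parts[0] in ("embed", "shorts"):
            candidate = parts[1]
        else:
            return None
    else:
        return None

    return candidate if _ID_RE.match(candidate) else None
```

Returning `None` for anything unrecognized lets a batch job skip bad rows instead of submitting them.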

Step 2: Configure the Input

Only one field is required: the video URL. The optional `language` setting selects the transcript language.

Parameter	Type	Default	Description
`youtubeUrl`	string	—	URL or ID of the YouTube video (watch, shorts, embed, or youtu.be formats)
`language`	string	`"en"`	ISO 639-1 language code (e.g. `"en"`, `"fr"`, `"de"`, `"ja"`, `"es"`, `"pt"`)
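As an illustration, the input object can be assembled like this. It is a small sketch: `build_input` is a hypothetical helper, and only the two field names and the `"en"` default come from the table above.

```python
import json


def build_input(youtube_url: str, language: str = "en") -> dict:
    """Build the actor input; youtubeUrl is required, language defaults to "en"."""
    if not youtube_url or not youtube_url.strip():
        raise ValueError("youtubeUrl is required")
    return {"youtubeUrl": youtube_url.strip(), "language": language}


if __name__ == "__main__":
    # Print the JSON you would paste into the input editor or send via API.
    print(json.dumps(build_input("https://youtu.be/dQw4w9WgXcQ", "fr"), indent=2))
```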

Step 3: Run the Actor

Start the run in Apify Console or via API. The actor fetches the transcript, segments it by caption timing, generates an SRT file, and pushes structured JSON to the dataset. Each run produces:

Transcript segments — One row per caption snippet with segment index, start time, duration, and text
Downloadable SRT file — SubRip subtitle format stored in the key-value store
Summary row — Total duration and download link
Run statistics — Segment count, detected language, video ID, and duration stored as JSON
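A run can also be started from Python. The sketch below assumes the `apify-client` package, whose `ApifyClient.actor(...).call(run_input=...)` and `ApifyClient.dataset(...).iterate_items()` methods are used here; the actor ID and API token are placeholders you must fill in, and `run_transcript_actor` is a hypothetical helper name. The client is passed in as an argument so the function can be exercised without network access.

```python
from typing import Any, Dict, List

# Placeholder: replace with the actor's real "<username>/<actor-name>" ID.
ACTOR_ID = "<username>/<actor-name>"


def run_transcript_actor(
    client: Any, actor_id: str, youtube_url: str, language: str = "en"
) -> List[Dict[str, Any]]:
    """Start one run, wait for it to finish, and return all dataset rows."""
    run_input = {"youtubeUrl": youtube_url, "language": language}
    run = client.actor(actor_id).call(run_input=run_input)
    # Assumption: the run object exposes the default dataset ID under this key.
    dataset_id = run["defaultDatasetId"]
    return list(client.dataset(dataset_id).iterate_items())


# Usage (requires `pip install apify-client` and a valid token):
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_API_TOKEN>")
#   rows = run_transcript_actor(client, ACTOR_ID, "https://youtu.be/dQw4w9WgXcQ")
```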

Output Format

The actor pushes transcript segments and a summary row to the dataset, plus stores an SRT subtitle file in the key-value store.

Transcript Segment Row

{
  "segmentIndex": 1,
  "startTime": 5.51,
  "duration": 3.24,
  "text": "Hey..",
  "srtFile": "https://api.apify.com/v2/key-value-stores/{storeId}/records/subtitles.srt"
}

Summary Row

{
  "text": "YouTube transcript for video dQw4w9WgXcQ: 61 segments over 177.96s",
  "duration": 177.96,
  "srtFile": "https://api.apify.com/v2/key-value-stores/{storeId}/records/subtitles.srt"
}
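Because segment rows and the summary row share one dataset, downstream code usually needs to separate them. Here is one way to do it, assuming (from the two samples above) that only segment rows carry `segmentIndex` and that the summary row is the remaining row with a `duration`. The shape of error rows is not documented here, so `split_rows` (a hypothetical helper) simply leaves them out.

```python
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Any]


def split_rows(items: List[Row]) -> Tuple[List[Row], Optional[Row]]:
    """Return (segment rows ordered by segmentIndex, summary row or None)."""
    segments = sorted(
        (row for row in items if "segmentIndex" in row),
        key=lambda row: row["segmentIndex"],
    )
    # Assumption: the summary row has no segmentIndex but does have a duration.
    summary = next(
        (row for row in items if "segmentIndex" not in row and "duration" in row),
        None,
    )
    return segments, summary
```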

SRT Subtitle File (key-value store)

SRT format with sequential numbering and HH:MM:SS,mmm timestamps:

1
00:00:05,509 --> 00:00:08,749
Hey..

2
00:00:08,830 --> 00:00:10,589
What happened?
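If you need to rebuild an SRT file from the JSON segments yourself (for example, after filtering rows), the conversion is mechanical. This is a sketch under the assumption that each cue ends at `startTime + duration`; it is not the actor's internal code, and `srt_timestamp` and `to_srt` are hypothetical names.

```python
from typing import Any, Dict, List


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, rounding to the nearest millisecond."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: List[Dict[str, Any]]) -> str:
    """Render segments as SRT cues numbered from 1, separated by blank lines."""
    blocks = []
    for number, seg in enumerate(segments, start=1):
        start = seg["startTime"]
        end = start + seg["duration"]  # assumption: cue end = start + duration
        blocks.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{seg['text']}\n"
        )
    return "\n".join(blocks)
```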

Run Statistics (key-value store)

{
  "videoId": "dQw4w9WgXcQ",
  "segmentCount": 61,
  "totalDuration": 177.96,
  "language": "en",
  "srtFile": "https://api.apify.com/v2/key-value-stores/{storeId}/records/subtitles.srt"
}

Field Reference

Field	Type	Appears In	Description
`segmentIndex`	integer	Segment row	Sequential segment number starting at 1
`startTime`	number	Segment row	Start timestamp in seconds from video start
`duration`	number	Segment row, Summary row	Segment duration in seconds (segment row) or total transcript duration in seconds (summary row)
`text`	string	Segment row, Summary row	Caption text for this segment (segment row) or a one-line description of the transcript (summary row)
`srtFile`	string	Segment row, Summary row, Stats JSON	Download URL for the SRT subtitle file
`videoId`	string	Stats JSON	11-character YouTube video identifier
`segmentCount`	integer	Stats JSON	Total number of transcript segments
`totalDuration`	number	Stats JSON	Total transcript duration in seconds
`language`	string	Stats JSON	Language code of the extracted transcript

Use Cases for AI & LLM Pipelines

This actor is designed with AI consumption as a first-class use case:

Token-efficient fields — No raw HTML, debug hashes, or redundant metadata. Every field serves a purpose, keeping your token budget low.
Structured timestamps — startTime and duration in seconds let you align transcript text with video frames for multimodal AI applications.
Deterministic output — Same video, same language, same JSON structure every time. No stochastic HTML parsing — the YouTube Transcript API provides consistent, typed data.
Language fallback chain — Request "en" and get "en-GB" if that is the only English track the video has. For any requested language, the actor looks for an exact match first, then regional variants, then falls back to the first available transcript, so a request never fails silently.
Error-resilient execution — Missing captions, disabled transcripts, and unavailable videos all produce clean error rows instead of crashing the run. Your pipeline never hangs.
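The language fallback chain described above can be sketched as a three-step lookup. This is an illustration of the documented behavior (exact match, then regional variant, then first available), not the actor's source; `pick_language` is a hypothetical function, and matching variants on the part before the first `-` is an assumption.

```python
from typing import List, Optional


def pick_language(requested: str, available: List[str]) -> Optional[str]:
    """Choose a transcript language: exact match, regional variant, first available."""
    if not available:
        return None  # no caption tracks at all

    wanted = requested.lower()

    # 1. Exact match, case-insensitive.
    for code in available:
        if code.lower() == wanted:
            return code

    # 2. Regional variant sharing the base code, e.g. "en" matches "en-GB".
    base = wanted.split("-")[0]
    for code in available:
        if code.lower().split("-")[0] == base:
            return code

    # 3. Last resort: the first available transcript.
    return available[0]
```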

Why Choose This Actor?

Feature	This Actor	Typical YouTube Transcript Actors
Error handling	Never crashes — pushes error row	Throws exception, run fails
Language fallback	Auto-finds regional variants	Strict language match or crash
Output structure	Minimal, token-efficient fields	Bloated with metadata
LLM-ready	Structured timestamps in seconds	Inconsistent formatting
Browser automation	None — HTTP-only, fast & cheap	Often uses Playwright/Puppeteer
API key required	No	Sometimes requires YouTube Data API key

Pricing

This actor uses pay-per-event pricing at $3 per 1,000 videos processed ($0.003 per video). There are no monthly commitments, no API key costs, and no hidden infrastructure fees. You pay only for successful transcript extractions.
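To budget a batch, multiply the number of videos by the per-video rate. The snippet below is a simple estimate that uses only the $3 per 1,000 videos figure stated above; `estimate_cost_usd` is a hypothetical helper, and it ignores any platform usage outside the per-video event.

```python
PRICE_PER_1000_VIDEOS_USD = 3.00  # from the pricing stated in this README


def estimate_cost_usd(successful_videos: int) -> float:
    """Estimated charge in USD for a number of successfully extracted videos."""
    if successful_videos < 0:
        raise ValueError("successful_videos must be non-negative")
    return successful_videos * PRICE_PER_1000_VIDEOS_USD / 1000


if __name__ == "__main__":
    for count in (1, 2_500, 100_000):
        print(f"{count:>7} videos: ${estimate_cost_usd(count):.3f}")
```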

Price Comparison

Actor	Price per 1,000 videos
This actor	$3.00
starvibe/youtube-video-transcript	$5.00
pintostudio/youtube-transcript-scraper	$10.00

Frequently Asked Questions

How do I download SRT subtitles from YouTube?

Run this actor with any YouTube video URL. It automatically generates a downloadable .srt subtitle file with precise HH:MM:SS,mmm timestamps. The SRT file is stored in the key-value store and a direct download link is included in every dataset row. Compatible with VLC, Premiere Pro, DaVinci Resolve, and YouTube Studio.

Does this actor work for any YouTube video?

It works for any public YouTube video that has captions or subtitles enabled. Videos with no caption track at all, whether manual or auto-generated, return a clean error row. Private, unlisted, and age-restricted videos are not supported.

What languages are supported?

All languages that YouTube provides captions for. The actor accepts any ISO 639-1 language code ("en", "fr", "de", "ja", "es", "pt", etc.). If the exact code isn't available, it automatically falls back to regional variants (e.g. "en" → "en-GB") and then to any available transcript as a last resort.

How do I extract YouTube transcripts for AI and LLM pipelines?

The actor outputs clean, typed JSON with segmentIndex, startTime, duration, and text fields — no raw HTML, no debug hashes, no redundant metadata. Call the actor via the Apify API from Python or JavaScript and pipe the dataset directly into LangChain, LlamaIndex, CrewAI, or your custom RAG pipeline. Each field is designed to minimize token consumption.
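Before embedding or prompting, caption segments are usually merged into larger chunks that keep their time range. The following sketch shows one simple approach (greedy packing up to a character budget). `chunk_segments`, the `endTime` key, and the 1,000-character default are illustrative choices, not part of the actor's output.

```python
from typing import Any, Dict, List


def _merge(segs: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Join consecutive segments into one chunk with a start and end time."""
    last = segs[-1]
    return {
        "startTime": segs[0]["startTime"],
        "endTime": last["startTime"] + last["duration"],
        "text": " ".join(s["text"].strip() for s in segs),
    }


def chunk_segments(
    segments: List[Dict[str, Any]], max_chars: int = 1000
) -> List[Dict[str, Any]]:
    """Greedily pack segments into chunks of at most about max_chars characters."""
    chunks: List[Dict[str, Any]] = []
    current: List[Dict[str, Any]] = []
    length = 0
    for seg in segments:
        size = len(seg["text"].strip()) + 1  # +1 for the joining space
        if current and length + size > max_chars:
            chunks.append(_merge(current))
            current, length = [], 0
        current.append(seg)
        length += size
    if current:
        chunks.append(_merge(current))
    return chunks
```

Each chunk keeps `startTime` and `endTime` in seconds, so a retrieved passage can be linked back to the matching position in the video.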

How is this different from pintostudio/youtube-transcript-scraper or starvibe/youtube-video-transcript?

Those are excellent actors with large user bases. This actor differentiates on four dimensions: (1) SRT file generation — automatically generates downloadable SubRip subtitle files, not just JSON; (2) error resilience — never crashes on missing transcripts, pushing an error row instead; (3) language intelligence — automatically finds regional and available transcript variants instead of hard-failing on a strict language code; (4) proxy auto-fallback — automatically retries with Apify Proxy and Chrome TLS impersonation when YouTube blocks direct requests.

Is there a free tier or trial?

You can try the actor directly in Apify Console with the default input URL. Usage is billed at the published pay-per-event rate of $3 per 1,000 videos. The Apify free plan includes $5 of prepaid usage credit.

Technical Specifications

Spec	Value
Runtime	Python 3.14
Memory	1,024 MB
Average run time	1-15 seconds
Max timeout	3,600 seconds
Browser	None (pure HTTP)
TLS fingerprint	Chrome 131 impersonation
Proxy fallback	Apify Proxy AUTO pool
Output formats	JSON dataset + SRT subtitles

Support and Feedback

Found a bug or have a feature request? Reach out via the Apify Store Issues tab.
