YouTube Transcript Extractor & Caption Downloader

Pricing

$3.00 / 1,000 videos

Extract YouTube video transcripts with timestamps, multi-language fallback, and token-efficient JSON output. Built for AI pipelines, content analysis, and accessibility.

Developer: Vnx0 (maintained by Community)

Actor stats: 0 bookmarks, 2 total users, 1 monthly active user, last modified 3 days ago

YouTube Transcript Extractor — Download SRT Subtitles & Timestamped Captions

Extract YouTube video transcripts with precise timestamps, multi-language support, automatic proxy fallback, and downloadable SRT subtitle files. Built for AI pipelines, content analysis, accessibility workflows, and video editing — no API key required, zero browser automation, and error-resilient execution that never crashes on missing captions.

Try it live on Apify Console or call it via API for seamless integration into your data stack.

Features

  • SRT subtitle download — Generate ready-to-use .srt files compatible with VLC, Premiere Pro, DaVinci Resolve, and YouTube Studio
  • Timestamped JSON output — Every caption segment with start time (seconds) and duration for precise alignment
  • Multi-language support — Any ISO 639-1 code (en, fr, de, ja, es, pt, ar, hi) with automatic regional variant fallback
  • Auto proxy fallback — Bypasses YouTube IP blocks automatically using Apify Proxy with Chrome TLS impersonation
  • 100% error resilience — Never crashes. Missing captions, disabled transcripts, and unavailable videos all produce clean error rows
  • LLM-optimized — Minimal, token-efficient JSON with no bloat. Ready for GPT, Claude, Llama, LangChain, and LlamaIndex
  • Zero browser — Pure HTTP extraction. No Playwright, no Puppeteer, no headless Chrome. Fast and cheap

Why Extract YouTube Transcripts?

Video content is exploding, but text is what machines can process. A YouTube transcript extractor turns hours of spoken video into structured, searchable text data in seconds. Common use cases include:

  • AI & LLM training data — Feed clean, timestamped transcripts into GPT, Claude, or Llama for fine-tuning or RAG pipelines
  • Content repurposing — Convert YouTube videos into blog posts, newsletter content, or social media threads without manual transcription
  • SEO content optimization — Index video transcripts on your site to rank for spoken keywords and capture voice-search traffic
  • Accessibility compliance — Generate captions and SRT subtitles for hearing-impaired audiences or multilingual localization
  • Academic research — Analyze lecture series, conference talks, or documentary content at scale with structured text output
  • Media monitoring — Track competitor YouTube channels by extracting and analyzing transcript text for brand mentions and topic trends

How to Extract YouTube Transcripts

Step 1: Get a YouTube Video URL

Copy the URL of any public YouTube video. This actor supports all standard URL formats:

  • https://www.youtube.com/watch?v=dQw4w9WgXcQ
  • https://youtu.be/dQw4w9WgXcQ
  • https://www.youtube.com/embed/dQw4w9WgXcQ
  • https://www.youtube.com/shorts/abc123def45
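
If you want to validate or deduplicate URLs before sending them to the actor, the four formats above can be normalized to a bare video ID on the client side. The following is a minimal sketch using only the Python standard library; the function name `extract_video_id` is hypothetical and this is not the actor's own parsing code.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# YouTube video IDs are 11 characters drawn from letters, digits, "-" and "_".
_VIDEO_ID = re.compile(r"^[A-Za-z0-9_-]{11}$")


def extract_video_id(url_or_id: str) -> Optional[str]:
    """Return the 11-character video ID, or None if none can be found."""
    candidate = url_or_id.strip()
    if _VIDEO_ID.match(candidate):
        return candidate  # already a bare ID

    parsed = urlparse(candidate)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    parts = [p for p in parsed.path.split("/") if p]

    if host == "youtu.be" and parts:
        candidate = parts[0]                                   # youtu.be/<id>
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        if parts[:1] == ["watch"]:
            candidate = parse_qs(parsed.query).get("v", [""])[0]  # watch?v=<id>
        elif len(parts) >= 2 and parts[0] in ("embed", "shorts"):
            candidate = parts[1]                               # embed/<id>, shorts/<id>
        else:
            return None
    else:
        return None
    return candidate if _VIDEO_ID.match(candidate) else None
```

Anything that does not resolve to an 11-character ID returns `None`, so malformed input can be filtered out before a run is started.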

Step 2: Configure the Input

Only one field is required: the video URL. The optional language field selects the preferred caption language:

| Parameter | Type | Default | Description |
|---|---|---|---|
| youtubeUrl | string | (none, required) | URL or ID of the YouTube video (watch, short, embed, or youtu.be formats) |
| language | string | "en" | ISO 639-1 language code (e.g. "en", "fr", "de", "ja", "es", "pt") |

Step 3: Run the Actor

Start the run in Apify Console or via API. The actor fetches the transcript, segments it by caption timing, generates an SRT file, and pushes structured JSON to the dataset. Each run produces:

  1. Transcript segments — One row per caption snippet with segment index, start time, duration, and text
  2. Downloadable SRT file — SubRip subtitle format stored in the key-value store
  3. Summary row — Total duration and download link
  4. Run statistics — Segment count, detected language, video ID, and duration stored as JSON
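
For API usage, a run can be started from Python with the official `apify-client` package. The sketch below assumes the package is installed and that an `APIFY_TOKEN` environment variable holds your API token. `ACTOR_ID` is a placeholder, since this page does not state the actor's ID, and the helper names are hypothetical.

```python
import os

# Placeholder: replace with the actor's real ID from its Apify Store page.
ACTOR_ID = "<username>/<actor-name>"


def build_run_input(youtube_url: str, language: str = "en") -> dict:
    """Build the actor input from the two documented fields."""
    if not youtube_url:
        raise ValueError("youtubeUrl is required")
    return {"youtubeUrl": youtube_url, "language": language}


def fetch_transcript_rows(youtube_url: str, language: str = "en") -> list:
    """Run the actor and return every row it pushed to the default dataset."""
    # Imported lazily so build_run_input works without the client installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(ACTOR_ID).call(
        run_input=build_run_input(youtube_url, language)
    )
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Calling `fetch_transcript_rows("https://youtu.be/dQw4w9WgXcQ")` would return the segment rows and the summary row as plain dictionaries.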

Output Format

The actor pushes transcript segments and a summary row to the dataset, plus stores an SRT subtitle file in the key-value store.

Transcript Segment Row

{
  "segmentIndex": 1,
  "startTime": 5.51,
  "duration": 3.24,
  "text": "Hey..",
  "srtFile": "https://api.apify.com/v2/key-value-stores/{storeId}/records/subtitles.srt"
}

Summary Row

{
  "text": "YouTube transcript for video dQw4w9WgXcQ: 61 segments over 177.96s",
  "duration": 177.96,
  "srtFile": "https://api.apify.com/v2/key-value-stores/{storeId}/records/subtitles.srt"
}
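
Because segment rows and the summary row share one dataset, a consumer usually has to separate them first. One way to do that, assuming (as in the examples above) that only segment rows carry a `segmentIndex` field, is sketched here. The helper names are hypothetical, and the shape of error rows is not documented on this page, so they simply end up in the second list.

```python
def split_rows(rows):
    """Separate segment rows from everything else (summary or error rows)."""
    segments = [row for row in rows if "segmentIndex" in row]
    others = [row for row in rows if "segmentIndex" not in row]
    segments.sort(key=lambda row: row["segmentIndex"])  # restore caption order
    return segments, others


def full_text(segments):
    """Join the caption text of all segments into one plain-text string."""
    return " ".join(segment["text"].strip() for segment in segments)


rows = [
    {"text": "YouTube transcript for video dQw4w9WgXcQ: 61 segments over 177.96s",
     "duration": 177.96},
    {"segmentIndex": 2, "startTime": 8.83, "duration": 1.76, "text": "What happened?"},
    {"segmentIndex": 1, "startTime": 5.51, "duration": 3.24, "text": "Hey.."},
]
segments, others = split_rows(rows)
print(full_text(segments))  # prints "Hey.. What happened?"
```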

SRT Subtitle File (key-value store)

SRT format with sequential numbering and HH:MM:SS,mmm timestamps:

1
00:00:05,509 --> 00:00:08,749
Hey..

2
00:00:08,830 --> 00:00:10,589
What happened?
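
The mapping from a segment's `startTime` and `duration` (in seconds) to SRT timestamps is plain arithmetic: the end time is start plus duration, and each value is split into hours, minutes, seconds, and milliseconds. The sketch below rebuilds SRT text locally from segment rows. It illustrates the format and is not the actor's own generator; the function names are hypothetical.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)  # work in whole milliseconds
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render segment rows as SRT cues separated by blank lines."""
    cues = []
    for number, seg in enumerate(segments, start=1):
        start = seg["startTime"]
        end = start + seg["duration"]
        cues.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{seg['text']}\n"
        )
    return "\n".join(cues)


print(srt_timestamp(5.509))  # prints "00:00:05,509"
```

Note that the JSON example shows `startTime` as 5.51 while the SRT file shows 00:00:05,509, so the JSON values appear to be rounded. An SRT rebuilt from the dataset may therefore differ from the stored file by a few milliseconds.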

Run Statistics (key-value store)

{
  "videoId": "dQw4w9WgXcQ",
  "segmentCount": 61,
  "totalDuration": 177.96,
  "language": "en",
  "srtFile": "https://api.apify.com/v2/key-value-stores/{storeId}/records/subtitles.srt"
}

Field Reference

| Field | Type | Appears In | Description |
|---|---|---|---|
| segmentIndex | integer | Segment row | Sequential segment number starting at 1 |
| startTime | number | Segment row | Start timestamp in seconds from video start |
| duration | number | Segment row | Segment duration in seconds |
| text | string | Segment row | Caption text for this segment |
| srtFile | string | Both rows | Download URL for the SRT subtitle file |
| videoId | string | Stats JSON | 11-character YouTube video identifier |
| segmentCount | integer | Stats JSON | Total number of transcript segments |
| totalDuration | number | Summary + Stats | Total transcript duration in seconds |

Use Cases for AI & LLM Pipelines

This actor is designed with AI consumption as a first-class use case:

  • Token-efficient fields: No raw HTML, debug hashes, or redundant metadata. Every field serves a purpose, keeping your token budget low.
  • Structured timestamps: startTime and duration in seconds let you align transcript text with video frames for multimodal AI applications.
  • Deterministic output: The same video and language produce the same JSON structure every time. There is no stochastic HTML parsing; the YouTube Transcript API provides consistent, typed data.
  • Language fallback chain: Request "en" and get "en-GB" if that is what the video has. For any requested language, the actor first looks for regional variants, then falls back to the first available transcript, so a request never fails silently.
  • Error-resilient execution: Missing captions, disabled transcripts, and unavailable videos all produce clean error rows instead of crashing the run, so your pipeline never hangs.
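
For RAG pipelines, the per-segment timestamps make it straightforward to group captions into time-bounded chunks before embedding. The following is a minimal sketch of one such grouping strategy; the 60-second window and the function names are illustrative assumptions, not something the actor prescribes.

```python
def _finish(chunk, chunk_start):
    """Collapse a list of segments into one chunk with start, end, and text."""
    last = chunk[-1]
    return {
        "start": chunk_start,
        "end": last["startTime"] + last["duration"],
        "text": " ".join(seg["text"].strip() for seg in chunk),
    }


def chunk_segments(segments, max_seconds=60.0):
    """Group consecutive segments into chunks spanning at most max_seconds."""
    chunks, current, chunk_start = [], [], None
    for seg in segments:
        if chunk_start is None:
            chunk_start = seg["startTime"]
        end = seg["startTime"] + seg["duration"]
        # Close the current chunk once adding this segment would exceed the window.
        if current and end - chunk_start > max_seconds:
            chunks.append(_finish(current, chunk_start))
            current, chunk_start = [], seg["startTime"]
        current.append(seg)
    if current:
        chunks.append(_finish(current, chunk_start))
    return chunks
```

Each chunk keeps its `start` and `end` in seconds, so a retrieved passage can be linked back to the exact position in the video.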

Why Choose This Actor?

| Feature | This Actor | Typical YouTube Transcript Actors |
|---|---|---|
| Error handling | Never crashes; pushes an error row | Throws exception, run fails |
| Language fallback | Auto-finds regional variants | Strict language match or crash |
| Output structure | Minimal, token-efficient fields | Bloated with metadata |
| LLM-ready | Structured timestamps in seconds | Inconsistent formatting |
| Browser automation | None (HTTP-only, fast and cheap) | Often uses Playwright/Puppeteer |
| API key required | No | Sometimes requires YouTube Data API key |

Pricing

This actor uses pay-per-event pricing at $3 per 1,000 videos processed ($0.003 per video). There are no monthly commitments, no API key costs, and no hidden infrastructure fees. You only pay for successful transcript extractions.

Price Comparison

| Actor | Price per 1,000 videos |
|---|---|
| This actor | $3.00 |
| starvibe/youtube-video-transcript | $5.00 |
| pintostudio/youtube-transcript-scraper | $10.00 |
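
To budget a batch, the per-1,000 rates above convert to a total with a single multiplication. The helper below is a small sketch with a hypothetical name; it assumes every video in the batch is billed, whereas per the pricing note only successful extractions are charged.

```python
PRICE_PER_1000_USD = 3.00  # this actor's published rate


def estimated_cost(videos: int, price_per_1000: float = PRICE_PER_1000_USD) -> float:
    """Estimated cost in USD for a number of billed videos."""
    return round(videos * price_per_1000 / 1000, 4)


print(estimated_cost(2500))        # prints 7.5
print(estimated_cost(2500, 10.0))  # prints 25.0
```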

Frequently Asked Questions

How do I download SRT subtitles from YouTube?

Run this actor with any YouTube video URL. It automatically generates a downloadable .srt subtitle file with precise HH:MM:SS,mmm timestamps. The SRT file is stored in the key-value store and a direct download link is included in every dataset row. Compatible with VLC, Premiere Pro, DaVinci Resolve, and YouTube Studio.

Does this actor work for any YouTube video?

It works for any public YouTube video that has captions or subtitles enabled. Videos without any caption track (including auto-generated captions) return a clean error row. Private, unlisted, and age-restricted videos are not supported.

What languages are supported?

All languages that YouTube provides captions for. The actor accepts any ISO 639-1 language code ("en", "fr", "de", "ja", "es", "pt", etc.). If the exact code isn't available, it automatically falls back to regional variants (e.g. "en" → "en-GB") and then to any available transcript as a last resort.
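
The fallback order described here (exact match, then regional variant, then any available transcript) can be illustrated with a short selection function. This is a sketch of the described behavior under the assumption that available tracks are identified by codes such as "en" or "en-GB"; it is not the actor's source code, and the function name is hypothetical.

```python
def pick_transcript_language(requested, available):
    """Choose a caption track: exact match, then regional variant, then first available."""
    if not available:
        return None  # no caption tracks at all

    wanted = requested.lower()
    by_lower = {code.lower(): code for code in available}
    if wanted in by_lower:
        return by_lower[wanted]  # 1. exact match

    base = wanted.split("-")[0]
    for code in available:
        if code.lower().split("-")[0] == base:
            return code  # 2. regional variant, e.g. "en" -> "en-GB"

    return available[0]  # 3. last resort: first available transcript


print(pick_transcript_language("en", ["en-GB", "fr"]))  # prints "en-GB"
```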

How do I extract YouTube transcripts for AI and LLM pipelines?

The actor outputs clean, typed JSON with segmentIndex, startTime, duration, and text fields — no raw HTML, no debug hashes, no redundant metadata. Call the actor via the Apify API from Python or JavaScript and pipe the dataset directly into LangChain, LlamaIndex, CrewAI, or your custom RAG pipeline. Each field is designed to minimize token consumption.

How is this different from pintostudio/youtube-transcript-scraper or starvibe/youtube-video-transcript?

Those are excellent actors with large user bases. This actor differentiates on four dimensions: (1) SRT file generation — automatically generates downloadable SubRip subtitle files, not just JSON; (2) error resilience — never crashes on missing transcripts, pushing an error row instead; (3) language intelligence — automatically finds regional and available transcript variants instead of hard-failing on a strict language code; (4) proxy auto-fallback — automatically retries with Apify Proxy and Chrome TLS impersonation when YouTube blocks direct requests.

Is there a free tier or trial?

You can try the actor directly in Apify Console with the default input URL. Usage is billed per 1,000 videos at the published pay-per-event rate. The Apify free plan includes $5 of prepaid usage credit.

Technical Specifications

| Spec | Value |
|---|---|
| Runtime | Python 3.14 |
| Memory | 1,024 MB |
| Average run time | 1-15 seconds |
| Max timeout | 3,600 seconds |
| Browser | None (pure HTTP) |
| TLS fingerprint | Chrome 131 impersonation |
| Proxy fallback | Apify Proxy AUTO pool |
| Output formats | JSON dataset + SRT subtitles |

Support and Feedback

Found a bug or have a feature request? Reach out via the Apify Store Issues tab.