YouTube Transcript Scraper
Pricing: $50.00 / 1,000 results
Extract YouTube video transcripts in multiple languages. Get time-stamped or text-only subtitles.
Rating: 5.0 (2)
Developer: Futurize Rush
Actor stats: 2 bookmarked, 45 total users, 3 monthly active users, last modified 2 days ago
Extract transcripts and subtitles from YouTube videos in multiple languages. Get time-stamped or text-only output with support for auto-generated captions and automatic translation.
What does YouTube Transcript Scraper do?
YouTube Transcript Scraper extracts subtitle and caption data from YouTube videos. It retrieves transcripts in all available languages, including auto-generated and manually uploaded subtitles.
Use it to:
- Extract video transcripts for content analysis, SEO, or accessibility purposes
- Get subtitles in multiple languages from a single video
- Process large batches of videos efficiently
- Obtain time-stamped segments or plain text output
Input
| Parameter | Type | Default | Description |
|---|---|---|---|
| video_ids | Array | Required | YouTube video IDs or full URLs |
| languages | Array | All available | Specific language codes to fetch (e.g., en, zh-TW, es) |
| text_only | Boolean | false | Return plain text without timestamps |
| include_generated | Boolean | true | Include auto-generated (machine-created) transcripts |
| include_translation | Boolean | true | Auto-translate when requested language is unavailable |
| fetch_all | Boolean | false | Fetch all available languages regardless of language setting |
Example Input
{"video_ids": ["dQw4w9WgXcQ","https://www.youtube.com/watch?v=jNQXAC9IVRw"],"languages": ["en", "es"],"text_only": false,"include_generated": true}
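Because video_ids accepts both bare IDs and full URLs, it can help to normalize a list before submitting it. The following is a minimal sketch, not the actor's own parsing logic: the function name `normalize_video_id`, the set of URL shapes it recognizes, and the assumption that IDs consist of 11 letters, digits, hyphens, or underscores are all illustrative.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: a video ID is 11 characters drawn from letters, digits, "-" and "_".
_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def normalize_video_id(value: str) -> Optional[str]:
    """Return the 11-character ID for a bare ID or a common YouTube URL, else None."""
    value = value.strip()
    if _ID_RE.fullmatch(value):
        return value

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    candidate = None

    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            # Standard watch URLs carry the ID in the "v" query parameter.
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            # Paths such as /embed/<id>, /shorts/<id>, /live/<id>.
            parts = [p for p in parsed.path.split("/") if p]
            if len(parts) >= 2 and parts[0] in {"embed", "shorts", "live"}:
                candidate = parts[1]

    if candidate and _ID_RE.fullmatch(candidate):
        return candidate
    return None
```

Applied to the example input above, both entries resolve to their 11-character IDs, while a malformed value such as `INVALID_ID` yields `None` and can be filtered out before the run.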
Supported Language Codes
YouTube language codes are based on ISO 639-1 two-letter codes, optionally followed by a region suffix (for example zh-TW or pt-BR). Common examples:
| Code | Language | Code | Language |
|---|---|---|---|
| en | English | ja | Japanese |
| es | Spanish | ko | Korean |
| fr | French | zh-CN | Chinese (Simplified) |
| de | German | zh-TW | Chinese (Traditional) |
| pt-BR | Portuguese (Brazil) | ru | Russian |
Full language code reference: ISO 639-1 codes
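A quick format check on the languages array can catch typos such as "english" before a run starts. The sketch below is an assumption-laden illustration: the helper name `normalize_language_code` and the accepted pattern (a 2 to 3 letter language code plus an optional short suffix) are not taken from the actor, and passing the check does not mean a transcript exists in that language.

```python
import re
from typing import Optional

# Assumed shape: "en", "zh-TW", "pt-BR", or a numeric region such as "es-419".
_LANG_RE = re.compile(r"[A-Za-z]{2,3}(-[A-Za-z0-9]{2,4})?")


def normalize_language_code(code: str) -> Optional[str]:
    """Return a canonically cased code (e.g. 'zh-TW'), or None if the shape is wrong."""
    code = code.strip()
    if not _LANG_RE.fullmatch(code):
        return None
    primary, _, suffix = code.partition("-")
    primary = primary.lower()
    if not suffix:
        return primary
    # Two-letter alphabetic suffixes are treated as regions and upper-cased.
    if len(suffix) == 2 and suffix.isalpha():
        suffix = suffix.upper()
    return f"{primary}-{suffix}"
```

For example, `normalize_language_code("zh-tw")` gives `"zh-TW"`, and `normalize_language_code("english")` gives `None`.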
Output
Each video produces one dataset item with the following fields:
| Field | Type | Description |
|---|---|---|
| video_id | String | 11-character YouTube video ID |
| video_url | String | Full YouTube URL |
| transcripts | Object | Transcript data keyed by language code |
| languages_count | Number | Number of successfully extracted languages |
| successful_languages | Array | Language codes that were extracted |
| failed_languages | Array | Language codes that failed |
| text_only | Boolean | Whether text-only mode was used |
| metadata | Object | Extraction timestamp and proxy status |
Time-Stamped Output Example
{"video_id": "dQw4w9WgXcQ","video_url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ","transcripts": {"en": {"language": "en","language_name": "English","is_generated": false,"is_translatable": true,"transcript": [{"index": 0,"text": "Never gonna give you up","start": "00:00.000","end": "00:03.120","start_seconds": 0.0,"end_seconds": 3.12,"duration": 3.12,"word_count": 5,"char_count": 23}],"stats": {"total_segments": 87,"total_words": 435,"total_characters": 2340,"average_segment_duration": 3.2,"total_duration": 278.4}}},"languages_count": 1,"successful_languages": ["en"],"failed_languages": [],"text_only": false,"metadata": {"fetched_at": "2026-02-27T12:34:56.789+00:00","proxy_used": true}}
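The start_seconds and end_seconds fields make it straightforward to turn time-stamped segments into a subtitle file. Here is a minimal sketch that renders segments as SubRip (SRT) text; the helper names `srt_time` and `segments_to_srt` are hypothetical, and the code assumes every segment carries the text, start_seconds, and end_seconds fields shown in the example above.

```python
from typing import Dict, List


def srt_time(seconds: float) -> str:
    """Format a second count as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: List[Dict]) -> str:
    """Render transcript segments as numbered SRT cues separated by blank lines."""
    cues = []
    for number, seg in enumerate(segments, start=1):
        start = srt_time(seg["start_seconds"])
        end = srt_time(seg["end_seconds"])
        cues.append(f"{number}\n{start} --> {end}\n{seg['text']}")
    return "\n\n".join(cues) + "\n"
```

A typical call would pass `item["transcripts"]["en"]["transcript"]` from a dataset item fetched with text_only set to false.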
Text-Only Output Example
{"video_id": "dQw4w9WgXcQ","video_url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ","transcripts": {"en": {"language": "en","language_name": "English","is_generated": false,"is_translatable": true,"transcript": ["Never gonna give you up","Never gonna let you down","Never gonna run around and desert you"],"stats": {"total_segments": 87,"total_words": 435,"total_characters": 2340,"average_segment_duration": 0.0,"total_duration": 0.0}}},"languages_count": 1,"successful_languages": ["en"],"failed_languages": [],"text_only": true}
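Since the transcript array holds objects in time-stamped mode and plain strings in text-only mode, downstream code that only needs the text can handle both shapes in one place. The following sketch assumes the item layout shown in the two examples above; the function name `transcript_text` is illustrative.

```python
from typing import Dict


def transcript_text(item: Dict, language: str) -> str:
    """Join one language's transcript into a single string, for either output mode."""
    entries = item.get("transcripts", {}).get(language, {}).get("transcript", [])
    # Text-only mode yields strings; time-stamped mode yields dicts with a "text" key.
    lines = [e if isinstance(e, str) else e.get("text", "") for e in entries]
    return " ".join(line.strip() for line in lines if line.strip())
```

If the requested language is missing from the item, the function returns an empty string rather than raising, which keeps batch processing simple.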
Error Output Example
When a video has no subtitles or is unavailable:
{"video_id": "INVALID_ID","video_url": "https://www.youtube.com/watch?v=INVALID_ID","error": "Video unavailable","code": "VIDEO_UNAVAILABLE"}
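Error items share the dataset with successful ones, so a run's results usually need to be split before further processing. A small sketch, assuming that the presence of an error key is what distinguishes a failed item (as in the example above); the function name `split_results` is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def split_results(items: Iterable[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Separate dataset items into (successful, failed) based on the 'error' key."""
    ok: List[Dict] = []
    failed: List[Dict] = []
    for item in items:
        (failed if "error" in item else ok).append(item)
    return ok, failed
```

The failed list can then be grouped by the code field, for instance to retry only certain failure types.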
Use Cases
- Content Analysis - Extract transcripts for sentiment analysis, topic modeling, or keyword extraction
- Multi-Language Research - Compare transcripts across languages for translation quality analysis
- Accessibility - Create text versions of video content for hearing-impaired users
- SEO and Marketing - Repurpose video content into blog posts, articles, or social media
- Education - Extract lecture transcripts for study notes or course materials
- Data Collection - Build datasets for natural language processing or machine learning
Tips
- Enable Include Auto-Generated Transcripts (include_generated) for videos without manual subtitles (most videos have auto-generated English captions)
- Use Fetch All Available Languages (fetch_all) when you need comprehensive multilingual coverage
- For large batches (50+ videos), the scraper automatically manages rate limiting and proxy rotation
- Videos without any subtitles will return an error result but will still be included in the output
- Run statistics are saved in the Key-Value Store under the SUMMARY key
Frequently Asked Questions
Q: Why do some videos return errors? A: Videos may have subtitles disabled by the uploader, be private, age-restricted, or region-blocked. The scraper reports the specific reason for each failure.
Q: What is the difference between auto-generated and manual transcripts? A: Manual transcripts are uploaded by the video creator and tend to be more accurate. Auto-generated transcripts are created by YouTube's speech recognition and may contain errors, especially for specialized terminology.
Q: Can I get transcripts for live streams? A: Only if the live stream has been completed and YouTube has processed the auto-generated or manual captions.
Q: How many videos can I process in one run? A: There is no hard limit, but larger batches take longer due to rate limiting. The scraper handles batches of any size with built-in delays to avoid being blocked.
YouTube Transcript Scraper is built for reliability with automatic retry, proxy rotation, and intelligent rate limiting. Compatible with Claude Code, Gemini, Codex, OpenClaw and other AI coding agents.