YouTube Transcript Scraper

Pricing

$50.00 / 1,000 results

Extract YouTube video transcripts in multiple languages. Get time-stamped or text-only subtitles.

Rating: 5.0 (2)

Developer: Futurize Rush (maintained by Community)

Actor stats

  • Bookmarked: 2
  • Total users: 45
  • Monthly active users: 3
  • Last modified: 2 days ago

Extract transcripts and subtitles from YouTube videos in multiple languages. Get time-stamped or text-only output with support for auto-generated captions and automatic translation.

What does YouTube Transcript Scraper do?

YouTube Transcript Scraper extracts subtitle and caption data from YouTube videos. It retrieves transcripts in all available languages, including auto-generated and manually uploaded subtitles.

Use it to:

  • Extract video transcripts for content analysis, SEO, or accessibility purposes
  • Get subtitles in multiple languages from a single video
  • Process large batches of videos efficiently
  • Obtain time-stamped segments or plain text output

Input

| Parameter | Type | Default | Description |
|---|---|---|---|
| video_ids | Array | Required | YouTube video IDs or full URLs |
| languages | Array | All available | Specific language codes to fetch (e.g., en, zh-TW, es) |
| text_only | Boolean | false | Return plain text without timestamps |
| include_generated | Boolean | true | Include auto-generated (machine-created) transcripts |
| include_translation | Boolean | true | Auto-translate when requested language is unavailable |
| fetch_all | Boolean | false | Fetch all available languages regardless of language setting |

Example Input

{
  "video_ids": [
    "dQw4w9WgXcQ",
    "https://www.youtube.com/watch?v=jNQXAC9IVRw"
  ],
  "languages": ["en", "es"],
  "text_only": false,
  "include_generated": true
}
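
Because video_ids accepts both bare IDs and full URLs, the same video can appear twice in a batch under different spellings. The following is a minimal sketch, not part of the scraper itself, of normalizing entries to 11-character IDs and building an input object with the parameter names from the table above. The helper names `normalize_video_id` and `build_run_input` are hypothetical, and the set of URL shapes handled (watch, youtu.be, embed, shorts, live) is an assumption about what callers are likely to pass.

```python
import re
from urllib.parse import urlparse, parse_qs

# YouTube video IDs are 11 characters drawn from letters, digits, "-" and "_".
_VIDEO_ID = re.compile(r"^[A-Za-z0-9_-]{11}$")


def normalize_video_id(value: str) -> str:
    """Return the 11-character video ID for a bare ID or a common YouTube URL."""
    value = value.strip()
    if _VIDEO_ID.match(value):
        return value
    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    candidate = ""
    if host == "youtu.be":
        # Short links carry the ID as the first path component.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        else:
            # Assumed path shapes: /embed/<id>, /shorts/<id>, /live/<id>
            parts = [p for p in parsed.path.split("/") if p]
            if len(parts) >= 2 and parts[0] in {"embed", "shorts", "live"}:
                candidate = parts[1]
    if not _VIDEO_ID.match(candidate):
        raise ValueError(f"Cannot extract a video ID from {value!r}")
    return candidate


def build_run_input(videos, languages=None, text_only=False):
    """Build an input dict using the documented parameter names."""
    seen, video_ids = set(), []
    for video in videos:
        video_id = normalize_video_id(video)
        if video_id not in seen:  # drop duplicates, keep first-seen order
            seen.add(video_id)
            video_ids.append(video_id)
    run_input = {"video_ids": video_ids, "text_only": text_only}
    if languages:
        run_input["languages"] = list(languages)
    return run_input
```

The resulting dictionary can be serialized with `json.dumps` and supplied as the run input. Parameters left out of the dictionary fall back to the defaults listed in the Input table.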

Supported Language Codes

YouTube language codes are based on two-letter ISO 639-1 codes, optionally followed by a region subtag (e.g., zh-TW, pt-BR). Common examples:

| Code | Language | Code | Language |
|---|---|---|---|
| en | English | ja | Japanese |
| es | Spanish | ko | Korean |
| fr | French | zh-CN | Chinese (Simplified) |
| de | German | zh-TW | Chinese (Traditional) |
| pt-BR | Portuguese (Brazil) | ru | Russian |

Full language code reference: ISO 639-1 codes

Output

Each video produces one dataset item with the following fields:

| Field | Type | Description |
|---|---|---|
| video_id | String | 11-character YouTube video ID |
| video_url | String | Full YouTube URL |
| transcripts | Object | Transcript data keyed by language code |
| languages_count | Number | Number of successfully extracted languages |
| successful_languages | Array | Language codes that were extracted |
| failed_languages | Array | Language codes that failed |
| text_only | Boolean | Whether text-only mode was used |
| metadata | Object | Extraction timestamp and proxy status |

Time-Stamped Output Example

{
  "video_id": "dQw4w9WgXcQ",
  "video_url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "transcripts": {
    "en": {
      "language": "en",
      "language_name": "English",
      "is_generated": false,
      "is_translatable": true,
      "transcript": [
        {
          "index": 0,
          "text": "Never gonna give you up",
          "start": "00:00.000",
          "end": "00:03.120",
          "start_seconds": 0.0,
          "end_seconds": 3.12,
          "duration": 3.12,
          "word_count": 5,
          "char_count": 23
        }
      ],
      "stats": {
        "total_segments": 87,
        "total_words": 435,
        "total_characters": 2340,
        "average_segment_duration": 3.2,
        "total_duration": 278.4
      }
    }
  },
  "languages_count": 1,
  "successful_languages": ["en"],
  "failed_languages": [],
  "text_only": false,
  "metadata": {
    "fetched_at": "2026-02-27T12:34:56.789+00:00",
    "proxy_used": true
  }
}
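
The start_seconds and end_seconds fields make it straightforward to convert time-stamped segments into a subtitle format. The following is a minimal sketch of rendering segments as SubRip (SRT) cues, assuming each segment carries the text, start_seconds, and end_seconds fields shown in the example above. The function names are hypothetical and the scraper does not produce SRT itself.

```python
def to_srt_timestamp(seconds: float) -> str:
    """Format a number of seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render time-stamped transcript segments as numbered SRT cues."""
    cues = []
    for number, seg in enumerate(segments, start=1):
        start = to_srt_timestamp(seg["start_seconds"])
        end = to_srt_timestamp(seg["end_seconds"])
        # Each cue is: sequence number, time range, text, then a blank line.
        cues.append(f"{number}\n{start} --> {end}\n{seg['text']}\n")
    return "\n".join(cues)
```

For a dataset item named `item`, the English segments would be passed as `segments_to_srt(item["transcripts"]["en"]["transcript"])`. This only applies when text_only is false, since text-only output has no timing fields.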

Text-Only Output Example

{
  "video_id": "dQw4w9WgXcQ",
  "video_url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "transcripts": {
    "en": {
      "language": "en",
      "language_name": "English",
      "is_generated": false,
      "is_translatable": true,
      "transcript": [
        "Never gonna give you up",
        "Never gonna let you down",
        "Never gonna run around and desert you"
      ],
      "stats": {
        "total_segments": 87,
        "total_words": 435,
        "total_characters": 2340,
        "average_segment_duration": 0.0,
        "total_duration": 0.0
      }
    }
  },
  "languages_count": 1,
  "successful_languages": ["en"],
  "failed_languages": [],
  "text_only": true
}
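
The transcript array holds objects in time-stamped mode and plain strings in text-only mode, so downstream code that only needs the text has to handle both shapes. The following is a minimal sketch of one way to do that, assuming items are structured as in the two examples above. The function name `transcript_to_text` is hypothetical.

```python
def transcript_to_text(item: dict, language: str, separator: str = " ") -> str:
    """Join one language's transcript into a single string.

    Works for both output modes: text-only entries are plain strings,
    time-stamped entries are objects with a "text" field.
    """
    entry = item["transcripts"][language]
    lines = []
    for segment in entry["transcript"]:
        text = segment if isinstance(segment, str) else segment["text"]
        lines.append(text.strip())
    # Skip empty segments so the separator is never doubled.
    return separator.join(line for line in lines if line)
```

Passing `separator="\n"` keeps one caption segment per line, which may be preferable for lyrics or dialogue.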

Error Output Example

When a video has no subtitles or is unavailable:

{
  "video_id": "INVALID_ID",
  "video_url": "https://www.youtube.com/watch?v=INVALID_ID",
  "error": "Video unavailable",
  "code": "VIDEO_UNAVAILABLE"
}
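
Since error items are stored in the same dataset as successful ones, a consumer usually needs to separate the two before processing. The following is a minimal sketch that treats any item with an error field as a failure, which is an assumption based on the example above. Only the VIDEO_UNAVAILABLE code appears in this document, so the "UNKNOWN" fallback and both function names are hypothetical.

```python
from collections import Counter


def split_results(items):
    """Separate successful dataset items from error items."""
    succeeded, failed = [], []
    for item in items:
        # Assumption: error items are identified by the presence of "error".
        (failed if "error" in item else succeeded).append(item)
    return succeeded, failed


def summarize_failures(failed):
    """Count error items by their "code" field."""
    return dict(Counter(item.get("code", "UNKNOWN") for item in failed))
```

Successful items can still report partial failures through failed_languages, so a stricter check might also inspect that field for each item.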

Use Cases

  • Content Analysis - Extract transcripts for sentiment analysis, topic modeling, or keyword extraction
  • Multi-Language Research - Compare transcripts across languages for translation quality analysis
  • Accessibility - Create text versions of video content for hearing-impaired users
  • SEO and Marketing - Repurpose video content into blog posts, articles, or social media
  • Education - Extract lecture transcripts for study notes or course materials
  • Data Collection - Build datasets for natural language processing or machine learning

Tips

  • Enable Include Auto-Generated Transcripts (include_generated) for videos without manual subtitles (most videos have auto-generated English captions)
  • Use Fetch All Available Languages (fetch_all) when you need comprehensive multilingual coverage
  • For large batches (50+ videos), the scraper automatically manages rate limiting and proxy rotation
  • Videos without any subtitles will return an error result but will still be included in the output
  • Run statistics are saved in the Key-Value Store under the SUMMARY key

Frequently Asked Questions

Q: Why do some videos return errors?
A: Videos may have subtitles disabled by the uploader, be private, age-restricted, or region-blocked. The scraper reports the specific reason for each failure.

Q: What is the difference between auto-generated and manual transcripts?
A: Manual transcripts are uploaded by the video creator and tend to be more accurate. Auto-generated transcripts are created by YouTube's speech recognition and may contain errors, especially for specialized terminology.

Q: Can I get transcripts for live streams?
A: Only if the live stream has been completed and YouTube has processed the auto-generated or manual captions.

Q: How many videos can I process in one run?
A: There is no hard limit, but larger batches take longer due to rate limiting. The scraper handles batches of any size with built-in delays to avoid being blocked.


YouTube Transcript Scraper is built for reliability with automatic retry, proxy rotation, and intelligent rate limiting. Compatible with Claude Code, Gemini, Codex, OpenClaw and other AI coding agents.