Youtube Transcript Extractor

Pricing

$5.00/month + usage

Transform YouTube videos into searchable text transcripts. This scraper extracts transcripts from any YouTube video or Short that has captions available, without API restrictions. It integrates with n8n, Make, and other tools, and offers fast extraction with easy copy/download of the results.

Developer

ius iyb

Maintained by Community

YouTube Transcript Extractor (Python)

A Python-based Apify actor that extracts transcripts/subtitles from YouTube videos.

Features

  • Extract transcripts from any YouTube video with available captions
  • Batch processing of multiple videos
  • Language selection with intelligent fallback (manual captions preferred over auto-generated)
  • Built-in proxy support via Apify proxy
  • Retry logic with exponential backoff
  • Reports whether captions are auto-generated or manually created
  • Returns structured data with timestamps, compatible with the JS actor output format

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| urls | array | required | List of YouTube video URLs to extract transcripts from |
| includeTimestamps | boolean | true | Include formatted timestamps (HH:MM:SS) for each segment |
| includeFullText | boolean | true | Include the complete transcript as a single text block |
| preferredLanguage | string | "en" | Preferred transcript language code (e.g., en, es, fr). Falls back to any available language if not found. |
| useProxy | boolean | true | Use Apify proxy for requests (recommended to avoid rate limiting) |
| proxyCountry | string | null | Country code for proxy (e.g., US, GB). Leave empty for automatic selection. |
| maxRetries | integer | 3 | Maximum retry attempts per video (1-10) |
| requestDelayMs | integer | 1000 | Delay between processing videos in milliseconds (0-10000) |

Example Input

{
  "urls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://youtu.be/jNQXAC9IVRw"
  ],
  "includeTimestamps": true,
  "includeFullText": true,
  "preferredLanguage": "en",
  "useProxy": true,
  "proxyCountry": "US",
  "maxRetries": 3,
  "requestDelayMs": 1500
}
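
The input above can also be assembled in code. The following is a minimal sketch of a helper that builds the input object and enforces the ranges from the Input table. The function name `build_input` and its parameter names are hypothetical, and rejecting an empty `urls` list is an assumption based on the field being marked required.

```python
from typing import Any, Optional


def build_input(
    urls: list[str],
    preferred_language: str = "en",
    use_proxy: bool = True,
    proxy_country: Optional[str] = None,
    max_retries: int = 3,
    request_delay_ms: int = 1000,
) -> dict[str, Any]:
    """Build an actor input dict (hypothetical helper, not part of the actor)."""
    # Assumption: "required" means at least one URL must be given.
    if not urls:
        raise ValueError("urls is required and must not be empty")
    # Ranges taken from the Input table.
    if not 1 <= max_retries <= 10:
        raise ValueError("maxRetries must be between 1 and 10")
    if not 0 <= request_delay_ms <= 10000:
        raise ValueError("requestDelayMs must be between 0 and 10000")
    return {
        "urls": list(urls),
        "includeTimestamps": True,
        "includeFullText": True,
        "preferredLanguage": preferred_language,
        "useProxy": use_proxy,
        "proxyCountry": proxy_country,
        "maxRetries": max_retries,
        "requestDelayMs": request_delay_ms,
    }
```

Validating locally like this is only a convenience for catching out-of-range values before starting a run.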

Output

Each video produces a result object in the default dataset.

Successful Extraction

{
  "success": true,
  "videoId": "dQw4w9WgXcQ",
  "videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "extractedAt": "2026-01-30T10:30:00.000000+00:00",
  "extractionMethod": "youtube_transcript_api",
  "transcriptLanguage": "en",
  "isAutoGenerated": false,
  "transcript": {
    "segments": [
      {
        "text": "We're no strangers to love",
        "startTime": 18,
        "duration": 3,
        "index": 0,
        "startTimeFormatted": "00:18"
      }
    ],
    "totalSegments": 42,
    "totalDuration": 213,
    "totalDurationFormatted": "03:33",
    "fullText": "We're no strangers to love..."
  }
}
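
To illustrate the formatted timestamp fields, here is a minimal sketch that reproduces the values shown above ("00:18" for 18 seconds, "03:33" for 213 seconds). The function name `format_timestamp` is hypothetical, and switching to HH:MM:SS once a value reaches one hour is an assumption inferred from the example and the Input table, not confirmed actor behavior.

```python
def format_timestamp(seconds: float) -> str:
    """Format a number of seconds as MM:SS, or HH:MM:SS from one hour up."""
    total = int(seconds)  # drop any fractional part
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        # Assumption: the hours field appears only when it is non-zero.
        return f"{hours:02d}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"
```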

Failed Extraction

{
  "success": false,
  "videoId": "invalidId",
  "videoUrl": "https://www.youtube.com/watch?v=invalidId",
  "error": "No transcripts available for this video",
  "extractedAt": "2026-01-30T10:30:00.000000+00:00"
}
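
Because the dataset mixes both result shapes, consumers need to check `success` before reading `transcript`. The sketch below shows one way to do that. The helper name `transcript_text` is hypothetical, and the fallback to joining segment texts assumes that `fullText` is omitted when `includeFullText` is false, which the description above does not state explicitly.

```python
from typing import Any


def transcript_text(item: dict[str, Any]) -> str:
    """Return the transcript text of one dataset item, or raise on failure."""
    if not item.get("success"):
        # Failed items carry an "error" message instead of a transcript.
        raise ValueError(f"{item.get('videoId')}: {item.get('error', 'unknown error')}")
    transcript = item["transcript"]
    full_text = transcript.get("fullText")
    if full_text is not None:
        return full_text
    # Assumption: without fullText, the segment texts can be joined instead.
    return " ".join(segment["text"] for segment in transcript["segments"])
```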

Supported URL Formats

  • Standard: https://www.youtube.com/watch?v=VIDEO_ID
  • Short: https://youtu.be/VIDEO_ID
  • Embed: https://www.youtube.com/embed/VIDEO_ID
  • Legacy: https://www.youtube.com/v/VIDEO_ID
  • Shorts: https://www.youtube.com/shorts/VIDEO_ID
  • Live: https://www.youtube.com/live/VIDEO_ID
  • Plain video ID: VIDEO_ID (11 characters)
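
The formats listed above can be parsed along the following lines. This is a minimal sketch, not the actor's actual parser: the function name `extract_video_id` is hypothetical, and the ID character set (letters, digits, `_`, and `-`) as well as accepting `youtube.com` without the `www.` prefix are assumptions.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are exactly 11 characters from this alphabet.
_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")
_PATH_PREFIXES = ("/embed/", "/v/", "/shorts/", "/live/")


def extract_video_id(value: str) -> Optional[str]:
    """Return the video ID from a URL or plain ID, or None if unrecognized."""
    value = value.strip()
    if _ID_RE.fullmatch(value):
        return value  # plain 11-character video ID
    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    candidate = None
    if host == "youtu.be":
        # Short link: the ID is the first path component.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "www.youtube.com"):
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            for prefix in _PATH_PREFIXES:
                if parsed.path.startswith(prefix):
                    candidate = parsed.path[len(prefix):].split("/")[0]
                    break
    if candidate and _ID_RE.fullmatch(candidate):
        return candidate
    return None
```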

Language Selection

The actor selects transcripts in this priority order:

  1. Manually created transcript in the preferred language
  2. Auto-generated transcript in the preferred language
  3. First available transcript in any language

The output includes transcriptLanguage (the actual language code used) and isAutoGenerated (whether the transcript was auto-generated by YouTube).
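
The priority order can be sketched as a small selection function. The `TranscriptInfo` type and `pick_transcript` function are hypothetical stand-ins for whatever the actor uses internally. The sketch only illustrates the three-step fallback described above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TranscriptInfo:
    """Hypothetical descriptor of one available transcript."""
    language_code: str
    is_generated: bool


def pick_transcript(
    available: Sequence[TranscriptInfo], preferred: str = "en"
) -> Optional[TranscriptInfo]:
    """Pick a transcript: manual preferred, then auto-generated, then any."""
    # Steps 1 and 2: preferred language, manual before auto-generated.
    for want_generated in (False, True):
        for info in available:
            if info.language_code == preferred and info.is_generated is want_generated:
                return info
    # Step 3: first available transcript in any language.
    return available[0] if available else None
```

The returned descriptor maps directly onto the `transcriptLanguage` and `isAutoGenerated` output fields.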

Run Info

After each run, a RUN_INFO record is saved to the default key-value store with statistics:

{
  "completedAt": "2026-01-30T10:35:00.000000+00:00",
  "statistics": {
    "total": 5,
    "successful": 4,
    "failed": 1,
    "totalSegments": 523,
    "successRate": "80.0%"
  },
  "failedUrls": [
    { "url": "https://www.youtube.com/watch?v=private123", "error": "..." }
  ],
  "input": {
    "urlCount": 5,
    "useProxy": true,
    "preferredLanguage": "en"
  }
}
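
The `statistics` block can be recomputed from the dataset items, which is useful for cross-checking a run. The following is a minimal sketch. The function name `summarize` is hypothetical, and reporting "0.0%" for an empty run is an assumption.

```python
from typing import Any


def summarize(items: list[dict[str, Any]]) -> dict[str, Any]:
    """Recompute RUN_INFO-style statistics from result objects."""
    total = len(items)
    succeeded = [item for item in items if item.get("success")]
    total_segments = sum(item["transcript"]["totalSegments"] for item in succeeded)
    # Assumption: an empty run reports a 0.0% success rate.
    rate = (len(succeeded) * 100 / total) if total else 0.0
    return {
        "total": total,
        "successful": len(succeeded),
        "failed": total - len(succeeded),
        "totalSegments": total_segments,
        "successRate": f"{rate:.1f}%",
    }
```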

Notes

  • Proxy recommended: YouTube may block requests without a proxy. Enable useProxy for best results. Residential proxies are attempted first, falling back to datacenter proxies.
  • Rate limiting: The actor includes configurable delays between requests to avoid throttling.
  • Private/age-restricted videos: Videos requiring login cannot be accessed and will return an error.
  • No captions: Some videos don't have captions available and will return an error.
  • Retries: Failed extractions are retried with exponential backoff (1s, 2s, 4s, ... up to 10s).
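
The retry schedule in the last note can be sketched as follows. This is an illustration under stated assumptions rather than the actor's implementation: `backoff_delay` and `with_retries` are hypothetical names, `maxRetries` is interpreted as the total number of attempts, and every exception is treated as retryable.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 10.0) -> float:
    """Delay before the next try: 1s, 2s, 4s, ... capped at 10s."""
    return min(base * (2 ** attempt), cap)


def with_retries(
    func: Callable[[], T],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying failures with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            # Assumption: any exception is retryable; re-raise on the last try.
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.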