Extract YouTube Transcripts in Seconds (No API Key Needed)

Pricing: from $1.00 / 1,000 results

Scrape YouTube video transcripts reliably using a fallback chain of sources, from custom transcript URL templates and Invidious captions through to optional yt-dlp as a last resort. Outputs transcript pieces for each video with language, source, and error details. No YouTube Data API key required.

Rating: 0.0 (0)

Developer: Inus Grobler

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 11
  • Monthly active users: 2
  • Last modified: 11 days ago

YouTube Transcript Scraper API | Extract Captions & Subtitle Segments (Apify Actor)

Extract YouTube transcripts, captions, and subtitle text segments from video URLs or video IDs. This Apify Actor is built for transcript scraping at scale with resilient fallback sources and Apify Proxy support.

Important Output Behavior

  • Output is always segment-based: each dataset item is one transcript piece.
  • One input video can produce many dataset items.
  • Because each piece counts as a result, this raises result volume and can raise pay-per-result costs on Apify.
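
To make the cost effect concrete, here is a rough estimate, assuming every dataset item is billed as one result at the listed starting price of $1.00 per 1,000 results. The function name and the figure of 40 pieces per video are illustrative assumptions, and actual pricing may differ.

```python
import math


def estimate_cost_usd(num_videos: int, avg_pieces_per_video: float,
                      usd_per_1000_results: float = 1.00) -> float:
    """Rough pay-per-result estimate, treating each transcript piece as one result."""
    total_results = math.ceil(num_videos * avg_pieces_per_video)
    return total_results * usd_per_1000_results / 1000


# Assumed workload: 50 videos at roughly 40 pieces each.
# That is 2,000 dataset items, not 50.
print(estimate_cost_usd(50, 40))  # 2.0
```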

How It Works

Fallback chain (reliability-first):

  1. optional customTranscriptUrlTemplate (legacy/API use)
  2. Invidious captions
  3. YouTube internal player API
  4. YouTube watch-page caption tracks
  5. youtube-transcript-api
  6. youtubetranscript.com
  7. optional yt-dlp fallback (last resort)
  8. automatic retry pass for transient failures
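
The ordering above can be read as a "first source that succeeds wins" loop. The following is a minimal sketch of that pattern, not the Actor's actual code: the function name, the stub fetchers, and the result shape are hypothetical, and the final retry pass is left out for brevity.

```python
from typing import Callable, List, Optional, Tuple

# A source takes a video ID and returns transcript text, or None if it has none.
Source = Callable[[str], Optional[str]]


def fetch_with_fallback(video_id: str, sources: List[Tuple[str, Source]]) -> dict:
    """Try each source in order and return the first transcript found."""
    errors = []
    for name, fetch in sources:
        try:
            text = fetch(video_id)
        except Exception as exc:  # one failing source must not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if text:
            return {"video_id": video_id, "status": "found",
                    "source": name, "text": text, "error": None}
        errors.append(f"{name}: no transcript")
    return {"video_id": video_id, "status": "missing",
            "source": None, "text": None, "error": "; ".join(errors)}


# Hypothetical stubs standing in for real fetchers.
def invidious_stub(video_id: str) -> Optional[str]:
    raise TimeoutError("timed out")


def player_api_stub(video_id: str) -> Optional[str]:
    return None


def watch_page_stub(video_id: str) -> Optional[str]:
    return "hello world"


chain = [("invidious", invidious_stub),
         ("player_api", player_api_stub),
         ("watch_page", watch_page_stub)]
result = fetch_with_fallback("abc123", chain)
print(result["status"], result["source"])
```

Collecting the per-source errors instead of discarding them is what makes a "missing" record useful for debugging later.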

Input (Simplified)

The input form in the Apify UI is generated from .actor/input_schema.json.

Primary fields:

  • videoUrls (required)
  • preferredLanguages
  • timeoutSec
  • maxConcurrency
  • maxChars
  • proxyCountry (optional)
  • proxyPoolSize
  • fetchVideoMeta
  • youtubeCookies (optional)
  • enableYtDlpFallback (optional)
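
As an illustration, an input object using some of these fields might be assembled as below. The field names come from the list above, but the value types and the example values are assumptions; the input schema is the authoritative reference. The helper `build_run_input` is hypothetical.

```python
import json


def build_run_input(video_urls, **options):
    """Assemble an Actor input dict; videoUrls is the only required field."""
    if not video_urls:
        raise ValueError("videoUrls is required and must not be empty")
    run_input = {"videoUrls": list(video_urls)}
    run_input.update(options)
    return run_input


run_input = build_run_input(
    ["https://www.youtube.com/watch?v=VIDEO_ID"],  # placeholder URL
    preferredLanguages=["en"],   # assumed: list of language codes
    timeoutSec=30,               # assumed: a number of seconds
    maxConcurrency=4,            # assumed: an integer
    fetchVideoMeta=True,         # assumed: boolean
    enableYtDlpFallback=False,   # assumed: boolean
)
print(json.dumps(run_input, indent=2))
```

The resulting dictionary is the kind of object that would be passed as the run input when starting the Actor through the API.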

Proxy Behavior

  • On Apify, the Actor defaults to Apify Proxy automatically.
  • It tries RESIDENTIAL first (when available), then falls back to account-default groups.
  • Proxy sessions are pre-created and rotated per video using proxyPoolSize.
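
One plausible reading of "pre-created and rotated per video" is a round-robin over a fixed pool. The sketch below assumes that strategy; the session naming scheme and function names are hypothetical, and the Actor's real rotation logic may differ.

```python
from itertools import cycle


def make_session_ids(pool_size: int) -> list:
    """Pre-create a fixed pool of session identifiers (naming is illustrative)."""
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    return [f"session_{i}" for i in range(pool_size)]


def assign_sessions(video_ids, pool_size: int) -> list:
    """Pair each video with the next session in the pool, wrapping around."""
    sessions = cycle(make_session_ids(pool_size))
    return [(video_id, next(sessions)) for video_id in video_ids]


for video_id, session in assign_sessions(["a", "b", "c", "d"], pool_size=3):
    print(video_id, session)
```

With a pool of three, the fourth video reuses the first session, so a larger proxyPoolSize spreads requests over more sessions.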

Output

Dataset rows are transcript pieces, not one row per video.

Common output fields:

  • video_id, url, title
  • status (found or missing)
  • language, source, transcript_url
  • piece_index, piece_count, piece_start, piece_dur
  • text, word_count, transcript_word_count, error
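
Because rows are pieces, consumers that want one transcript per video need to regroup them. Here is a minimal sketch using the field names listed above; the sample rows are invented for illustration, and joining pieces with a single space is an assumption.

```python
from collections import defaultdict


def join_pieces(items):
    """Group dataset rows by video_id and join piece text in piece_index order."""
    by_video = defaultdict(list)
    for item in items:
        if item.get("status") == "found":  # skip rows for missing transcripts
            by_video[item["video_id"]].append(item)
    return {
        video_id: " ".join(
            piece["text"] for piece in sorted(pieces, key=lambda p: p["piece_index"])
        )
        for video_id, pieces in by_video.items()
    }


# Invented sample rows, deliberately out of order.
rows = [
    {"video_id": "abc", "status": "found", "piece_index": 1, "text": "world"},
    {"video_id": "abc", "status": "found", "piece_index": 0, "text": "hello"},
    {"video_id": "xyz", "status": "missing", "error": "no captions"},
]
print(join_pieces(rows))  # {'abc': 'hello world'}
```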

Key-value store:

  • OUTPUT: run metadata + totals + warnings
  • OUTPUT.meta.source_timings: per-source attempts, success/failure, latency stats (avg_ms, p95_ms, max_ms)
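
For readers unfamiliar with these statistics, the sketch below shows one common way to derive avg_ms, p95_ms, and max_ms from a list of latencies. It uses the nearest-rank percentile method, which is an assumption; the Actor may compute p95 differently.

```python
import math


def latency_stats(latencies_ms):
    """Summarize latencies as mean, nearest-rank 95th percentile, and maximum."""
    if not latencies_ms:
        return {"avg_ms": None, "p95_ms": None, "max_ms": None}
    ordered = sorted(latencies_ms)
    rank = math.ceil(95 * len(ordered) / 100)  # 1-based nearest-rank position
    return {
        "avg_ms": sum(ordered) / len(ordered),
        "p95_ms": ordered[rank - 1],
        "max_ms": ordered[-1],
    }


stats = latency_stats([120, 80, 100, 900, 110])
print(stats)  # {'avg_ms': 262.0, 'p95_ms': 900, 'max_ms': 900}
```

With only a handful of samples, p95 collapses onto the maximum, so these figures are most informative on larger runs.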

Notes

  • Transcript availability depends on caption availability and endpoint accessibility.
  • This Actor does not use the official YouTube Data API.
  • Legacy/advanced fields are still accepted via API calls for backward compatibility.