Youtube Subtitles Scraper

Pricing

$2.00/month + usage

Developed by

El

Maintained by Community

Scrape YouTube captions and subtitles fast. Export transcripts in JSON or CSV for SEO, research, and content analysis.

Last modified

6 days ago

YouTube Captions & Transcripts Scraper (Apify Actor)

Extract clean, timestamped YouTube subtitles at scale. Perfect for creators, researchers, SEO teams, accessibility workflows, and AI/NLP pipelines that need accurate multi‑language transcripts fast.

Quick start

  • Open the Actor on Apify and paste one or many YouTube URLs.
  • Select Language mode: “Get all available” or “Get by code” (e.g., en, es).
  • (Optional) Keep auto‑generated subtitles enabled for broader coverage.
  • Run the Actor. When finished, export the dataset as JSON or CSV.
  • Use the results for summaries, translations, search, or content repurposing.

What it does

  • Bulk‑extracts captions/transcripts from public YouTube videos
  • Multi‑language: fetch all available languages or only the ones you choose
  • Keeps timestamps and ordering for easy quoting and analysis
  • Handles auto‑generated subtitles when official ones aren’t available
  • Built for reliability: retries failed requests and rotates proxies on network errors
  • Outputs structured data you can export as JSON, CSV, or use directly in apps

Who it’s for

  • Content creators and editors repurposing videos into articles, posts, or newsletters
  • Researchers and journalists collecting quotes and evidence from video sources
  • SEO and marketing teams mining topics, entities, and keywords from video content
  • Accessibility and compliance teams generating readable caption text
  • AI & data teams feeding high‑quality text into summarization, RAG, or LLM training

Use cases

  • Summarize long videos into notes, briefs, or articles
  • Create highlight reels and pull timestamped quotes for social posts
  • Translate and localize scripts to reach global audiences
  • Index channel archives for topic discovery and search
  • Build clean training datasets for speech and language models

Simple workflow

  1. Add one or many YouTube URLs (bulk friendly)
  2. Choose language preference:
    • Get all available languages, or
    • Specify language codes (e.g., en, es, de)
  3. Run and download results from the dataset as JSON/CSV

That’s it — no coding required.


Input options (plain English)

  • Video URLs: paste one or more links. You can bulk‑edit the list.
  • Language mode:
    • Get by code — provide one or more language codes.
    • Get all available — collect every transcript language offered by the video.
  • Language codes (when using “Get by code”): for example en, es, de.
  • Fetch auto‑generated subtitles: include YouTube’s auto captions when official ones aren’t present.
  • Max retries: how many times to retry a transcript fetch if a request temporarily fails.
  • Proxy settings: uses Apify Proxy under the hood. On network/proxy errors, the actor rotates to a fresh proxy to keep your run moving.
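
If you start runs programmatically rather than through the form, the options above map naturally onto a single input object. The sketch below builds one as a plain Python dictionary. The key names (video_urls, language_mode, language_codes, fetch_auto_generated, max_retries) and the helper build_run_input are assumptions made for illustration; check the Actor's Input tab for the real schema.

```python
import json


def build_run_input(urls, language_codes=None, fetch_auto_generated=True, max_retries=3):
    """Assemble a hypothetical run input; key names are illustrative, not the Actor's real schema."""
    if not urls:
        raise ValueError("Provide at least one YouTube URL")
    run_input = {
        "video_urls": list(urls),
        # No codes given means "Get all available"; otherwise "Get by code".
        "language_mode": "by_code" if language_codes else "all",
        "fetch_auto_generated": fetch_auto_generated,
        "max_retries": max_retries,
    }
    if language_codes:
        run_input["language_codes"] = list(language_codes)
    return run_input


example_input = build_run_input(
    ["https://www.youtube.com/watch?v=VIDEO_ID"],  # placeholder URL
    language_codes=["en", "es"],
)
print(json.dumps(example_input, indent=2))
```

Leaving language_codes empty switches this sketch to the "all languages" mode, which mirrors the two language modes described above.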

Notes

  • You can add/remove URLs one‑by‑one or paste a whole list.
  • URLs are deduplicated while preserving order.
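
To preview what order‑preserving deduplication does to a list before you paste it, a minimal sketch follows. It compares URLs as exact strings after trimming whitespace; whether the Actor also treats different URL forms of the same video as duplicates is not stated here, so take this as an approximation.

```python
def dedupe_preserving_order(urls):
    """Drop repeated URLs, keeping the first occurrence of each."""
    cleaned = (url.strip() for url in urls)
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(url for url in cleaned if url))


pasted = [
    "https://www.youtube.com/watch?v=AAA",
    "https://www.youtube.com/watch?v=BBB",
    "https://www.youtube.com/watch?v=AAA",  # repeat, dropped
    "  https://www.youtube.com/watch?v=BBB  ",  # repeat after trimming, dropped
]
unique_urls = dedupe_preserving_order(pasted)
print(unique_urls)
```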

Output at a glance

For each video, the actor saves one dataset item with:

  • video_id and original url
  • supported_languages: a list of language codes found
  • transcripts: a mapping of language code → transcript entries (with timestamps)

You can export the dataset to JSON/CSV directly in Apify.
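
For orientation, here is a made‑up dataset item with the fields listed above, plus a small helper that flattens it into one row per transcript entry. The top‑level keys come from the list above; the per‑entry keys ("start", "text") and all sample values are assumptions, so inspect a real item before relying on them.

```python
import csv
import io

# Illustrative item only; entry keys "start" and "text" are assumed.
item = {
    "video_id": "VIDEO_ID",
    "url": "https://www.youtube.com/watch?v=VIDEO_ID",
    "supported_languages": ["en", "es"],
    "transcripts": {
        "en": [
            {"start": 0.0, "text": "Hello and welcome."},
            {"start": 2.5, "text": "Let's get started."},
        ],
        "es": [{"start": 0.0, "text": "Hola y bienvenidos."}],
    },
}


def flatten_item(item):
    """Yield one flat row per transcript entry, suitable for CSV."""
    for language, entries in item["transcripts"].items():
        for entry in entries:
            yield {
                "video_id": item["video_id"],
                "language": language,
                "start": entry["start"],
                "text": entry["text"],
            }


rows = list(flatten_item(item))
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["video_id", "language", "start", "text"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Flattening like this is handy when the nested language mapping is awkward to work with in a spreadsheet.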


Why this tool

  • Scale: process dozens or thousands of videos in one run
  • Accuracy: preserves timestamps and structure for downstream analysis
  • Coverage: fetch every available language or narrow to what you need
  • Reliability: on‑error proxy rotation and retries reduce flaky failures
  • Ready for AI: clean text that plugs straight into summarization, search, and RAG
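
The retry‑and‑rotate behaviour mentioned under Reliability can be pictured with the generic sketch below. It is not the Actor's actual code: fetch_with_rotation, TransientFetchError, and the proxy names are hypothetical, and the fetch function is a fake that fails twice before succeeding.

```python
import itertools


class TransientFetchError(Exception):
    """Stand-in for a network or proxy failure worth retrying."""


def fetch_with_rotation(fetch, video_id, proxies, max_retries=3):
    """Call fetch(video_id, proxy), moving to the next proxy after each transient failure."""
    if not proxies:
        raise ValueError("At least one proxy is required")
    proxy_cycle = itertools.cycle(proxies)
    last_error = None
    for _ in range(max_retries + 1):  # first attempt plus max_retries retries
        proxy = next(proxy_cycle)
        try:
            return fetch(video_id, proxy)
        except TransientFetchError as error:
            last_error = error  # the next loop iteration picks a different proxy
    raise last_error


attempts = []


def flaky_fetch(video_id, proxy):
    """Fake fetcher: the first two calls fail, the third succeeds."""
    attempts.append(proxy)
    if len(attempts) < 3:
        raise TransientFetchError(f"{proxy} failed")
    return f"transcript for {video_id} via {proxy}"


result = fetch_with_rotation(flaky_fetch, "VIDEO_ID", ["proxy-a", "proxy-b", "proxy-c"])
print(result)
```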

Tips & best practices

  • Exploring a channel? Start with “Get all available” languages to see coverage, then narrow to specific codes.
  • Keep auto‑generated subtitles enabled to maximize results on videos without official captions.
  • For lighter outputs, specify just the languages you need.
  • Always respect platform terms, privacy, and content rights.

FAQ

  • Does it work on private or members‑only videos?
    • No. It works with publicly available videos that offer captions/transcripts.
  • Does it bypass paywalls or protections?
    • No. It only collects data that YouTube exposes for public content.
  • Are timestamps preserved?
    • Yes. Each transcript entry includes start times so you can quote precisely.
  • Can I run this in bulk?
    • Yes. Paste a list of URLs and the actor will process them in order.
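
To turn a preserved start time into a quotable citation, something like the following sketch may help. It assumes each transcript entry is a dictionary with "start" (seconds) and "text" keys, which is an assumption about the output shape; the helper names are made up for this example. The t= query parameter is the usual way to link to a moment in a YouTube video.

```python
def format_timestamp(seconds):
    """Render seconds as H:MM:SS, or M:SS for times under one hour."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def quote_with_link(video_id, entry):
    """Build a citation line plus a link that opens the video at the quoted moment."""
    start = int(entry["start"])  # assumed key holding the start time in seconds
    link = f"https://www.youtube.com/watch?v={video_id}&t={start}s"
    return f'"{entry["text"]}" ({format_timestamp(start)}) {link}'


citation = quote_with_link("VIDEO_ID", {"start": 83.4, "text": "This is the key point."})
print(citation)
```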