
Youtube Subtitles Scraper
Pricing
$2.00/month + usage
Scrape YouTube captions and subtitles fast. Export transcripts in JSON or CSV for SEO, research, and content analysis.
Last modified: 6 days ago
YouTube Captions & Transcripts Scraper (Apify Actor)
Extract clean, timestamped YouTube subtitles at scale. Perfect for creators, researchers, SEO teams, accessibility workflows, and AI/NLP pipelines that need accurate multi‑language transcripts fast.
Quick start
- Open the Actor on Apify and paste one or many YouTube URLs.
- Select Language mode: “Get all available” or “Get by code” (e.g., en, es).
- (Optional) Keep auto‑generated subtitles enabled for broader coverage.
- Run the Actor. When finished, export the dataset as JSON or CSV.
- Use the results for summaries, translations, search, or content repurposing.
What it does
- Bulk‑extracts captions/transcripts from public YouTube videos that offer them
- Multi‑language: fetch all available languages or only the ones you choose
- Keeps timestamps and ordering for easy quoting and analysis
- Handles auto‑generated subtitles when official ones aren’t available
- Designed for reliability with smart retries and proxy rotation on errors
- Outputs structured data you can export as JSON, CSV, or use directly in apps
Who it’s for
- Content creators and editors repurposing videos into articles, posts, or newsletters
- Researchers and journalists collecting quotes and evidence from video sources
- SEO and marketing teams mining topics, entities, and keywords from video content
- Accessibility and compliance teams generating readable caption text
- AI & data teams feeding high‑quality text into summarization, RAG, or LLM training
Popular use cases
- Summarize long videos into notes, briefs, or articles
- Create highlight reels and pull timestamped quotes for social posts
- Translate and localize scripts to reach global audiences
- Index channel archives for topic discovery and search
- Build clean training datasets for speech and language models
Simple workflow
- Add one or many YouTube URLs (bulk friendly)
- Choose language preference:
  - Get all available languages, or
  - Specify language codes (e.g., en, es, de)
- Run and download results from the dataset as JSON/CSV
That’s it — no coding required.
Input options (plain English)
- Video URLs: paste one or more links. You can bulk‑edit the list.
- Language mode:
  - Get by code: provide one or more language codes.
  - Get all available: collect every transcript language offered by the video.
- Language codes (when using “Get by code”): for example, en alone, or en, es for two languages.
- Fetch auto‑generated subtitles: include YouTube’s auto captions when official ones aren’t present.
- Max retries: how many times to retry a transcript fetch if a request temporarily fails.
- Proxy settings: uses Apify Proxy under the hood. On network/proxy errors, the actor rotates to a fresh proxy to keep your run moving.
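As a rough illustration of how the options above fit together, the sketch below assembles them into a single run-input dictionary. The listing does not give the Actor's actual input field names, so every key here (`videoUrls`, `languageMode`, `languageCodes`, `fetchAutoGenerated`, `maxRetries`) and the default of 3 retries are hypothetical placeholders; check the Actor's input schema for the real names before using anything like this.

```python
from typing import Any, Dict, List, Optional


def build_run_input(
    urls: List[str],
    language_codes: Optional[List[str]] = None,
    include_auto_generated: bool = True,
    max_retries: int = 3,  # assumed default, not taken from the Actor
) -> Dict[str, Any]:
    """Assemble a run-input dict. All key names are hypothetical placeholders."""
    if not urls:
        raise ValueError("at least one video URL is required")
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")

    run_input: Dict[str, Any] = {
        "videoUrls": list(urls),
        # "Get by code" when codes are given, otherwise "Get all available".
        "languageMode": "by_code" if language_codes else "all",
        "fetchAutoGenerated": include_auto_generated,
        "maxRetries": max_retries,
    }
    if language_codes:
        # Normalize codes such as " EN " to "en".
        run_input["languageCodes"] = [code.strip().lower() for code in language_codes]
    return run_input
```

Passing a list of codes selects the "Get by code" mode; omitting it falls back to "Get all available".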
Notes
- You can add/remove URLs one‑by‑one or paste a whole list.
- URLs are deduplicated while preserving order.
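Order-preserving deduplication, as mentioned in the note above, can be sketched in a few lines of Python. This is only an illustration of the idea, not the Actor's code: it compares exact strings after trimming whitespace, and whether the Actor also treats different URL forms of the same video as duplicates is not stated in the listing.

```python
def dedupe_preserving_order(urls):
    """Drop blank and repeated URLs, keeping the first occurrence of each."""
    cleaned = (url.strip() for url in urls)
    # dict keys keep insertion order (Python 3.7+), so the first occurrence wins.
    return list(dict.fromkeys(url for url in cleaned if url))
```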
Output at a glance
For each video, the actor saves one dataset item with:
- video_id and original url
- supported_languages: a list of language codes found
- transcripts: a mapping of language code → transcript entries (with timestamps)
You can export the dataset to JSON/CSV directly in Apify.
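If you post-process the JSON export yourself, a dataset item shaped like the description above can be flattened into one row per transcript entry. The sketch below assumes each transcript entry is a dictionary with `start` (seconds) and `text` keys; the listing confirms that entries carry start times but not the exact key names, so treat those two names and the sample values as assumptions.

```python
import csv
import io

# Sample item following the fields listed above; entry keys are assumed.
sample_item = {
    "video_id": "VIDEO_ID",
    "url": "https://www.youtube.com/watch?v=VIDEO_ID",
    "supported_languages": ["en", "es"],
    "transcripts": {
        "en": [{"start": 0.0, "text": "Hello"}, {"start": 2.5, "text": "world"}],
        "es": [{"start": 0.0, "text": "Hola"}],
    },
}


def item_to_rows(item):
    """Flatten the language -> entries mapping into one row per entry."""
    rows = []
    for language, entries in item["transcripts"].items():
        for entry in entries:
            rows.append({
                "video_id": item["video_id"],
                "language": language,
                "start": entry["start"],
                "text": entry["text"],
            })
    return rows


def rows_to_csv(rows):
    """Serialize the flattened rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["video_id", "language", "start", "text"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `rows_to_csv(item_to_rows(sample_item))` yields a header line followed by three data rows, one for each transcript entry in the sample.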
Why this tool
- Scale: process dozens or thousands of videos in one run
- Accuracy: preserves timestamps and structure for downstream analysis
- Coverage: fetch every available language or narrow to what you need
- Reliability: on‑error proxy rotation and retries reduce flaky failures
- Ready for AI: clean text that plugs straight into summarization, search, and RAG
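To make the reliability point concrete, here is a generic sketch of retrying a fetch while rotating to a different proxy after each transient failure. It is not the Actor's implementation: `TransientFetchError`, the `fetch(url, proxy)` callable, and the round-robin rotation are all assumptions made for illustration.

```python
import itertools


class TransientFetchError(Exception):
    """Hypothetical error type standing in for a network or proxy failure."""


def fetch_with_retries(fetch, url, proxies, max_retries=3):
    """Call fetch(url, proxy), moving to the next proxy after each transient failure."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    proxy_cycle = itertools.cycle(proxies)
    last_error = None
    # One initial attempt plus up to max_retries retries.
    for _ in range(max_retries + 1):
        proxy = next(proxy_cycle)
        try:
            return fetch(url, proxy)
        except TransientFetchError as error:
            last_error = error  # remember the failure and rotate
    raise last_error
```

Only the hypothetical transient error triggers a retry; any other exception propagates immediately, so permanent failures such as a video without captions are not retried needlessly.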
Tips & best practices
- Exploring a channel? Start with “Get all available” languages to see coverage, then narrow to specific codes.
- Keep auto‑generated subtitles enabled to maximize results on videos without official captions.
- For lighter outputs, specify just the languages you need.
- Always respect platform terms, privacy, and content rights.
FAQ
- Does it work on private or members‑only videos?
  - No. It works with publicly available videos that offer captions/transcripts.
- Does it bypass paywalls or protections?
  - No. It only collects data that YouTube exposes for public content.
- Are timestamps preserved?
  - Yes. Each transcript entry includes start times so you can quote precisely.
- Can I run this in bulk?
  - Yes. Paste a list of URLs and the actor will process them in order.
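Because entries keep their start times, turning one into a quotable line is straightforward. The sketch below assumes the start time is a number of seconds stored under a `start` key, with the caption text under `text`; both key names are assumptions, since the listing does not spell them out.

```python
def format_timestamp(seconds):
    """Render a start time in seconds as HH:MM:SS, dropping fractions."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def quote_with_timestamp(entry):
    """Prefix an entry's text with its formatted start time."""
    return f"[{format_timestamp(entry['start'])}] {entry['text']}"
```

For example, an entry starting at 75.9 seconds with the text "Hi" becomes `[00:01:15] Hi`.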