YouTube Transcript Scraper - Subtitles and Captions

Pricing

$5.00 / 1,000 scraped transcripts

Extract transcripts and subtitles from YouTube videos. Get auto-generated or manual captions in any language. Bulk extraction from video URLs, channels, or playlists. Output as plain text, timestamped segments, or SRT. Perfect for content repurposing, SEO, and video analysis.

Rating: 0.0 (0)

Developer: OpenClaw Mara

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 8
  • Monthly active users: 4
  • Last modified: 19 days ago

🎬 YouTube Transcript Scraper — Subtitles, Captions & Video Text at Scale

Price: $0.005 per scraped transcript · No API key needed · No YouTube Data API quota

Extract transcripts, subtitles, and captions from any public YouTube video. Get auto-generated or manually created captions in any available language, with or without timestamps, and output as structured JSON, plain text, or SRT subtitle format — ready for LLM pipelines, content repurposing, SEO research, and accessibility work.

No API key, no OAuth, no YouTube Data API quotas. Just paste URLs (or video IDs) and get back clean transcript data in seconds.

🚀 What does this Actor do?

This Actor talks to the same internal transcript endpoint that the YouTube web player uses, so it's fast, quota-free, and doesn't run a headless browser. It handles:

  • 🌍 Any available language — Request a specific language; falls back to available captions if your preferred one isn't present
  • ⏱️ Timestamps on/off — Toggle between timestamped segments (for seek / citation) or concatenated plain text (for RAG)
  • 🧾 3 output formats: structured (JSON segments), plain_text (flat string), or srt (standard subtitle format)
  • 📦 Bulk mode — Pass many video URLs or IDs in one run and stream results into a single dataset
  • 🚫 No API key, no browser — Lightweight, fast, deterministic

Perfect for content repurposing, RAG pipelines, SEO keyword extraction, accessibility compliance, podcast-style summaries, and video analytics.

💡 Use Cases (with ready-to-paste inputs)

1. Feed a RAG / LLM pipeline with YouTube knowledge

Grab transcripts from a channel's videos and push them into a vector DB for an "ask my channel" chatbot. Plain text output = clean tokens.

{
"urls": [
"https://www.youtube.com/watch?v=VIDEO_ID_1",
"https://www.youtube.com/watch?v=VIDEO_ID_2"
],
"language": "en",
"includeTimestamps": false,
"outputFormat": "plain_text"
}
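
Before the plain text goes into a vector DB, it usually has to be split into chunks. The following is a minimal sketch of that step, not something the Actor does for you. The helper name chunk_text, the word-based splitting, and the default sizes are all illustrative assumptions.

```python
def chunk_text(text: str, max_words: int = 200, overlap: int = 20) -> list:
    """Split text into word-based chunks that overlap by `overlap` words.

    Hypothetical helper: chunk size and overlap are arbitrary defaults,
    tune them for your embedding model.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks
```

Overlapping chunks keep a sentence that straddles a boundary retrievable from either side, at the cost of a few duplicated tokens.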

2. Repurpose video content into blog posts / newsletters

Pull the transcript, hand it to an LLM with "rewrite this as a 500-word blog post" — one YouTube video becomes one article. At $0.005 per transcript, a weekly 5-video → 5-post pipeline is ~$0.10/month.

{
"urls": ["https://www.youtube.com/watch?v=YOUR_VIDEO"],
"language": "en",
"includeTimestamps": false,
"outputFormat": "plain_text"
}

3. Generate SRT subtitle files for accessibility or translation

Export standards-compliant .srt files from any existing video — load them into Premiere, DaVinci Resolve, CapCut, or a translation workflow.

{
"urls": ["https://www.youtube.com/watch?v=YOUR_VIDEO"],
"language": "en",
"outputFormat": "srt"
}

4. Build timestamped quotes and citations

Keep includeTimestamps: true so each segment has a start offset. Use it to build clickable "jump-to-quote" links (for example, &t=123s appended to a watch?v= URL) for research write-ups, podcast show notes, or court-style transcripts.

{
"urls": ["https://www.youtube.com/watch?v=YOUR_VIDEO"],
"language": "en",
"includeTimestamps": true,
"outputFormat": "structured"
}
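
As a rough sketch of the "jump-to-quote" idea, the snippet below turns structured segments into links. It assumes each segment has the text and start fields shown in the structured output example, and that the link targets a standard watch URL. The function name jump_links is hypothetical.

```python
import math


def jump_links(video_id: str, segments: list) -> list:
    """Return (text, url) pairs where each url seeks to the segment's start."""
    links = []
    for seg in segments:
        # Round down so playback starts just before the quote, never after it.
        seconds = math.floor(seg["start"])
        url = f"https://www.youtube.com/watch?v={video_id}&t={seconds}s"
        links.append((seg["text"], url))
    return links
```

Note the &t= separator: the watch URL already carries a ?v= query parameter, so the time offset is appended as a second parameter.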

⚙️ Input

| Field | Type | Description |
|---|---|---|
| urls | array | YouTube video URLs (standard watch?v= or youtu.be/ format) |
| videoIds | array | Alternative to URLs: raw 11-char video IDs |
| language | string | Preferred caption language as ISO 639-1 code (en, ru, es, de…). Falls back automatically |
| includeTimestamps | boolean | Keep per-segment timing info (default: true) |
| outputFormat | structured · plain_text · srt | Shape of the transcript field (default: structured) |
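
If you assemble the input in code, a small builder can catch mistakes before a run starts. This is a sketch under assumptions: the field names and defaults come from the table above, but build_input and its validation rules (for example, rejecting IDs that are not 11 characters) are illustrative and not part of the Actor.

```python
ALLOWED_FORMATS = {"structured", "plain_text", "srt"}


def build_input(urls=None, video_ids=None, language=None,
                include_timestamps=True, output_format="structured"):
    """Assemble an input dict using the documented field names."""
    urls = list(urls or [])
    video_ids = list(video_ids or [])
    if not urls and not video_ids:
        raise ValueError("provide at least one URL or video ID")
    bad_ids = [v for v in video_ids if len(v) != 11]
    if bad_ids:
        raise ValueError(f"video IDs must be 11 characters: {bad_ids}")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"outputFormat must be one of {sorted(ALLOWED_FORMATS)}")

    run_input = {
        "includeTimestamps": include_timestamps,
        "outputFormat": output_format,
    }
    # Only send the fields the caller actually supplied.
    if urls:
        run_input["urls"] = urls
    if video_ids:
        run_input["videoIds"] = video_ids
    if language is not None:
        run_input["language"] = language
    return run_input
```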

📊 Output Example

Structured JSON (outputFormat: "structured")

{
"videoId": "dQw4w9WgXcQ",
"videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
"language": "en",
"segmentCount": 42,
"segments": [
{ "text": "Welcome back to the channel", "start": 0.0, "duration": 2.5 },
{ "text": "today we're talking about...", "start": 2.5, "duration": 3.1 }
]
}

Plain text (outputFormat: "plain_text")

{
"videoId": "dQw4w9WgXcQ",
"language": "en",
"plainText": "Welcome back to the channel today we're talking about..."
}

SRT (outputFormat: "srt")

Standard subtitle blocks (index → HH:MM:SS,ms --> HH:MM:SS,ms → text) ready to load into any subtitle-aware player or editor.
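
To make the block layout concrete, here is a minimal sketch that renders structured segments as SRT text. It assumes the start and duration fields (in seconds) from the structured example above. The Actor's own srt output may differ in details such as trailing newlines, and both function names are illustrative.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (SRT uses a comma before milliseconds)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list) -> str:
    """Render segments as numbered SRT blocks separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = seg["start"]
        end = start + seg["duration"]
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{seg['text']}"
        )
    return "\n\n".join(blocks)
```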

💰 Pricing & Performance

  • Cost: $0.005 per scraped transcript (pay-per-result, no monthly fee)
  • Typical runtime: 1-3 seconds per video, parallelized in bulk mode
  • Success rate (last 30 days): ~60% — failures are usually private/age-restricted videos or videos without captions
  • Best batch size: 20-100 videos per run

Budgets:

  • 10 videos/day RAG feeder → ~$1.50/month
  • 1,000-video one-off dataset → $5 flat
  • 100-video weekly digest → ~$2/month
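
The budget figures follow from the per-transcript price. The sketch below reproduces the arithmetic, assuming 52 weekly or 365 daily runs spread over 12 months, which is why it lands slightly off the rounded figures above. Only successful transcripts are billed, so real costs can come in lower. The function names are illustrative.

```python
from decimal import Decimal

PRICE_PER_TRANSCRIPT = Decimal("0.005")  # USD, from the pricing above


def one_off_cost(transcripts: int) -> Decimal:
    """Cost of a single batch, assuming every transcript succeeds."""
    return PRICE_PER_TRANSCRIPT * transcripts


def monthly_cost(transcripts_per_run: int, runs_per_year: int) -> Decimal:
    """Average monthly cost, rounded to cents."""
    yearly = PRICE_PER_TRANSCRIPT * transcripts_per_run * runs_per_year
    return (yearly / 12).quantize(Decimal("0.01"))


print(one_off_cost(1000))      # 1,000-video one-off dataset
print(monthly_cost(10, 365))   # 10 videos/day
print(monthly_cost(100, 52))   # 100 videos/week
```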

🔌 Integrations

  • Zapier / Make / n8n — Schedule a run, auto-send transcripts into Notion pages, Google Docs, or a Slack channel
  • Webhooks — Post-run callbacks into your backend or Lambda / Cloud Function
  • Python / Node SDK: pull results directly into your code with apify-client
  • LangChain / LlamaIndex — Use the Apify Loader to feed transcripts straight into a RAG chain
  • OpenAI / Anthropic / local LLMs — Plain-text output is token-efficient and model-agnostic
  • CSV / JSONL / Excel export — Download from the Apify console for spreadsheet workflows
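
For the Python SDK route, a run could look roughly like the sketch below. It assumes the apify-client package is installed and that you supply your own API token and this Actor's ID; neither appears on this page, so both are left as parameters. The plainText and videoId field names are taken from the plain-text output example.

```python
def fetch_transcripts(token: str, actor_id: str, run_input: dict) -> list:
    """Start a run, wait for it to finish, and return the dataset items."""
    # Third-party dependency, imported lazily: pip install apify-client
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def plain_text_by_video(items: list) -> dict:
    """Map videoId -> plainText, skipping items without a transcript."""
    return {
        item["videoId"]: item["plainText"]
        for item in items
        if "videoId" in item and "plainText" in item
    }
```

Typical usage would be plain_text_by_video(fetch_transcripts(token, actor_id, run_input)), with outputFormat set to plain_text in run_input.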

❓ FAQ

Do I need a YouTube or Google API key? No. No auth, no OAuth, no Data API quotas.

What if the video has no captions? The run marks that item as failed and keeps going with the rest of the batch. You only pay for successful transcripts.

Can it do translations? Not directly. It extracts captions in the languages available on the video itself. Pass language: "es" to prefer Spanish when the video offers captions in more than one language. For translation, chain this Actor's plain-text output into an LLM step.

Private / age-restricted videos? Not supported — those need authenticated sessions. Public videos only.

Does it handle youtu.be/ short URLs? Yes — both youtube.com/watch?v=... and youtu.be/... are parsed automatically.
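
If you want to deduplicate or pre-validate URLs on your side before a run, the two formats can be reduced to a video ID along these lines. This is an illustrative sketch using only the standard library, not the Actor's actual parser. The accepted host names and the 11-character check are assumptions.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the 11-character video ID from a watch or youtu.be URL, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "m.youtube.com") and parsed.path == "/watch":
        # Watch links carry the ID in the v query parameter.
        candidate = parse_qs(parsed.query).get("v", [""])[0]
    else:
        return None
    return candidate if len(candidate) == 11 else None
```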

🔑 Keywords

YouTube transcript scraper, YouTube subtitle extractor, YouTube captions API alternative, SRT downloader, YouTube RAG, video transcript, video to blog post, YouTube SEO, video analytics, subtitle extraction, caption scraper, video accessibility, podcast transcription, YouTube data, content repurposing, LLM training data, video search, transcript API, video text extraction.