YouTube Transcript Scraper - JSON, SRT, VTT, RAG avatar

YouTube Transcript Scraper - JSON, SRT, VTT, RAG

Pricing

$7.50 / 1,000 transcripts

Go to Apify Store
YouTube Transcript Scraper - JSON, SRT, VTT, RAG

YouTube Transcript Scraper - JSON, SRT, VTT, RAG

Extract YouTube transcripts & subtitles as JSON, text, SRT, VTT, or RAG chunks - bulk, 100+ languages, timestamps & deep links. Pay per video, no subscription.

Pricing

$7.50 / 1,000 transcripts

Rating

0.0

(0)

Developer

Jaime Martinez

Jaime Martinez

Maintained by Community

Actor stats

0

Bookmarked

2

Total users

1

Monthly active users

a day ago

Last modified

Share

YouTube Transcript API for developers and AI pipelines

YouTube Transcript API for developers and AI pipelines — extract transcripts and subtitles as JSON, text, SRT, VTT, or RAG chunks with timestamps, in bulk, across 100+ languages, with no API key.

YouTube blocks naive scrapers with PoToken and datacenter-IP bans, so free libraries like youtube-transcript-api and yt-dlp break in production — this actor's hosted residential-proxy + uptime layer succeeds where they fail, and you only pay for transcripts actually delivered.

Built for developers and AI builders who need YouTube captions at scale: feed an LLM, build a RAG pipeline, summarize videos, translate, search, or repurpose content.


Why this actor

  • Actually works in 2026. YouTube now blocks naive transcript scrapers (empty responses behind their bot-check). This actor solves YouTube's current proof-of-origin requirement, so it keeps returning real captions where older tools return nothing. Health-monitored and fixed fast when YouTube changes.
  • Bulk & fast. Pass one URL or thousands; concurrent extraction with automatic retries.
  • 🎯 Every format you need. Plain text, timestamped segments, SRT, VTT, and RAG chunks with deep links — in a single run.
  • 🌍 100+ languages, manual and auto-generated captions, with language preference and fallback.
  • 💸 Pay per video. No monthly fee. You're only charged for transcripts we actually deliver.

Use cases

  • RAG / LLM ingestion — turn long videos into clean, chunked, timestamped context for vector databases (LangChain, LlamaIndex).
  • Summaries & notes — feed transcripts to Claude/GPT for summaries, chapters, show notes.
  • Subtitles — export ready-to-use .srt / .vtt files.
  • Search & analytics — index spoken content across a channel.
  • Translation & repurposing — transcribe once, translate anywhere.

Input

FieldTypeDescription
videoUrlsarrayRequired. YouTube video URLs (watch, youtu.be, shorts, embed, live) or raw 11-char video IDs.
languagesarrayPreferred language codes in order, e.g. ["en","es"]. Falls back to English, then first available. Default ["en"].
preferManualbooleanPrefer human-uploaded captions over auto-generated. Default true.
includeSegmentsbooleanInclude timestamped {start, duration, text} segments. Default true.
includePlainTextbooleanInclude the whole transcript as one clean string. Default true.
includeSrtbooleanInclude a SubRip .srt string. Default false.
includeVttbooleanInclude a WebVTT .vtt string. Default false.
ragChunkingbooleanEmit overlapping chunks with timestamps + deep links for embeddings. Default false.
chunkMaxChars / chunkOverlapCharsintegerChunk sizing for RAG. Defaults 1500 / 200.
concurrencyintegerVideos processed in parallel (1–10). Default 5.

Example input

{
"videoUrls": [
"https://www.youtube.com/watch?v=jNQXAC9IVRw",
"https://youtu.be/8S0FDjFBj8o"
],
"languages": ["en"],
"includeSrt": true,
"ragChunking": true
}

Output

One dataset item per video:

{
"ok": true,
"videoId": "jNQXAC9IVRw",
"url": "https://www.youtube.com/watch?v=jNQXAC9IVRw",
"title": "Me at the zoo",
"channelName": "jawed",
"durationSeconds": 19,
"viewCount": 358000000,
"language": "en",
"captionKind": "manual",
"availableLanguages": [{ "languageCode": "en", "kind": "manual", "name": "English" }],
"segmentCount": 6,
"text": "All right, so here we are, in front of the elephants...",
"segments": [{ "start": 1.2, "duration": 2.16, "text": "All right, so here we are, in front of the elephants" }],
"srt": "1\n00:00:01,200 --> 00:00:03,360\nAll right, so here we are...",
"chunks": [{ "index": 0, "text": "...", "startSeconds": 1.2, "endSeconds": 18.5, "deepLink": "https://www.youtube.com/watch?v=jNQXAC9IVRw&t=1s" }]
}

Successful transcripts are written to the default dataset. Videos with no captions, or that are private/unavailable/invalid, are written to a separate skipped dataset (each with the reason) — and are never charged. You only pay for transcripts actually delivered.

Pricing

You're billed only per successfully delivered transcript. Videos with no captions, or that are private/unavailable, are skipped for free — no subscription, no minimums. See the Pricing tab for the current per-transcript rate.

FAQ

Does it work for auto-generated captions? Yes — manual and ASR (auto) captions, in 100+ languages.

Playlists and channels? Pass individual video URLs for now. Playlist/channel expansion is coming.

Is this legal? It accesses publicly available caption data. You are responsible for complying with YouTube's Terms of Service and applicable copyright law in your use of the output.

Why do some videos return no transcript? The video genuinely has captions disabled, is private/age-restricted/region-locked, or is a live stream without a caption track.


Questions or a format you need? Open an issue on the actor — it's actively maintained.