Pricing

$7.50 / 1,000 transcripts

YouTube Transcript Scraper - JSON, SRT, VTT, RAG

Extract transcripts from YouTube videos. Input: video URLs or IDs + language preferences. Output: plain text, timestamped segments, SRT/VTT subtitles, and RAG chunks with deep links. 100+ languages, no API key. $0.0075 per delivered transcript.

Developer

Jaime Martinez

Last modified

8 days ago

YouTube Transcript Scraper & API for developers and AI pipelines

Extract YouTube transcripts and subtitles as JSON, plain text, SRT, VTT, or RAG chunks with timestamps. Runs in bulk across 100+ languages, with no API key.

YouTube gates its caption endpoint behind PoToken bot-checks and datacenter-IP bans, so free libraries like youtube-transcript-api and yt-dlp often return empty transcripts in production. This actor mints a per-video proof-of-origin token, escalates to residential proxies when needed, and is actively maintained. You pay only for transcripts that are actually delivered.

Built for developers and AI builders who need YouTube captions at scale: feed an LLM, build a RAG pipeline, summarize videos, translate, search, or repurpose content.

⚡ Quick start

Paste this into the Actor's Input (JSON view) and hit Start:

{
  "videoUrls": ["https://www.youtube.com/watch?v=jNQXAC9IVRw"],
  "languages": ["en"],
  "includePlainText": true
}

Each result row looks like:

{
  "ok": true,
  "videoId": "jNQXAC9IVRw",
  "url": "https://www.youtube.com/watch?v=jNQXAC9IVRw",
  "title": "Me at the zoo",
  "channelName": "jawed",
  "language": "en",
  "captionKind": "manual",
  "segmentCount": 6,
  "text": "All right, so here we are, in front of the elephants..."
}

Or run it from the API:

POST https://api.apify.com/v2/acts/jamhimself~youtube-transcript-extractor/run-sync-get-dataset-items?token=YOUR_APIFY_TOKEN
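As a rough illustration of the API call above, the following Python sketch builds and sends the same POST request using only the standard library. The function names and the 300-second timeout are assumptions made for this example; the request body is simply the Actor input shown in the quick start.

```python
import json
import urllib.parse
import urllib.request

# Actor path segment copied from the endpoint shown above.
ACTOR_ID = "jamhimself~youtube-transcript-extractor"
API_BASE = "https://api.apify.com/v2/acts"


def build_run_request(token, actor_input):
    """Build (but do not send) the POST request for a synchronous run."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/{ACTOR_ID}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_transcripts(token, actor_input, timeout=300):
    """Send the request and return the dataset items as a list of dicts.

    The timeout value is an assumption for this sketch, not a documented limit.
    """
    request = build_run_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_transcripts("YOUR_APIFY_TOKEN", {"videoUrls": ["https://www.youtube.com/watch?v=jNQXAC9IVRw"], "languages": ["en"]})` should return one item per delivered transcript, in the row shape shown above.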

Turn on includeSrt, includeVtt, or ragChunking to also get subtitle files and embedding-ready chunks in the same run.

Why this actor

✅ Built for YouTube's PoToken era. YouTube now blocks naive transcript scrapers (empty responses behind its bot-check). This actor solves the current proof-of-origin requirement and is actively maintained with a daily canary watchdog — 100% success across our monitored canary runs since launch (2026-06-12).
⚡ Bulk & fast. Pass one URL or thousands; concurrent extraction with automatic retries.
🎯 Every format you need. Plain text, timestamped segments, SRT, VTT, and RAG chunks with deep links — in a single run.
🌍 100+ languages, manual and auto-generated captions, with language preference and fallback.
💸 Pay per video. No monthly fee. You're only charged for transcripts we actually deliver.

Use cases

RAG / LLM ingestion — turn long videos into clean, chunked, timestamped context for vector databases (LangChain, LlamaIndex).
Summaries & notes — feed transcripts to Claude/GPT for summaries, chapters, show notes.
Subtitles — export ready-to-use .srt / .vtt files.
Search & analytics — index spoken content across a channel.
Translation & repurposing — transcribe once, translate anywhere.

Input

Field	Type	Description
`videoUrls`	array	Required. YouTube video URLs (`watch`, `youtu.be`, `shorts`, `embed`, `live`) or raw 11-char video IDs.
`languages`	array	Preferred language codes in order, e.g. `["en","es"]`. Falls back to English, then first available. Default `["en"]`.
`preferManual`	boolean	Prefer human-uploaded captions over auto-generated. Default `true`.
`includeSegments`	boolean	Include timestamped `{start, duration, text}` segments. Default `true`.
`includePlainText`	boolean	Include the whole transcript as one clean string. Default `true`.
`includeSrt`	boolean	Include a SubRip `.srt` string. Default `false`.
`includeVtt`	boolean	Include a WebVTT `.vtt` string. Default `false`.
`ragChunking`	boolean	Emit overlapping chunks with timestamps + deep links for embeddings. Default `false`.
`chunkMaxChars` / `chunkOverlapChars`	integer	Chunk sizing for RAG. Defaults 1500 / 200.
`concurrency`	integer	Videos processed in parallel (1–10). Default 5.
`proxyCountryCode`	string	Optional two-letter country code for proxies (affects region-locked captions).
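To pre-validate `videoUrls` before starting a run, a helper along these lines can reduce each accepted form to its 11-character video ID. This is a sketch of one possible normalization, not the actor's own parser; `extract_video_id` and the host checks are assumptions for illustration.

```python
import re
from urllib.parse import urlparse, parse_qs

# YouTube video IDs are 11 characters drawn from letters, digits, "_" and "-".
_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")
# Path-style URL forms listed in the input table: /shorts/<id>, /embed/<id>, /live/<id>.
_PATH_PREFIXES = ("shorts", "embed", "live")


def extract_video_id(value):
    """Return the 11-character video ID, or None if the input is not recognized."""
    value = value.strip()
    if _ID_RE.fullmatch(value):
        return value  # already a raw ID
    parsed = urlparse(value if "://" in value else "https://" + value)
    host = (parsed.hostname or "").lower()
    parts = [p for p in parsed.path.split("/") if p]
    candidate = None
    if host == "youtu.be":
        candidate = parts[0] if parts else None
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        if parts and parts[0] == "watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif len(parts) >= 2 and parts[0] in _PATH_PREFIXES:
            candidate = parts[1]
    if candidate and _ID_RE.fullmatch(candidate):
        return candidate
    return None
```

Inputs that return `None` here are the kind of entries that would likely end up as "bad URL" skips, so filtering them client-side keeps the run input clean.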

Example input

{
  "videoUrls": [
    "https://www.youtube.com/watch?v=jNQXAC9IVRw",
    "https://youtu.be/8S0FDjFBj8o"
  ],
  "languages": ["en"],
  "includeSrt": true,
  "ragChunking": true
}
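The `languages` and `preferManual` options describe a selection order: preferred codes first, then English, then whatever is available. The sketch below shows one way that rule could be expressed over entries shaped like `availableLanguages` in the output. It is an assumption about the behavior, not the actor's code: in particular it lets language order win over the manual preference, and it treats any `kind` other than `"manual"` as auto-generated.

```python
def pick_track(available, languages=("en",), prefer_manual=True):
    """Pick one caption track: preferred languages, then English, then first available."""
    if not available:
        return None

    def best_for(code):
        matches = [t for t in available if t["languageCode"] == code]
        if not matches:
            return None
        if prefer_manual:
            # Stable sort: manual tracks move to the front, order otherwise unchanged.
            matches.sort(key=lambda t: t["kind"] != "manual")
        return matches[0]

    # Try each preferred code in order, then fall back to English.
    for code in list(languages) + ["en"]:
        track = best_for(code)
        if track is not None:
            return track
    # Last resort: the first track the video offers.
    return available[0]
```

With `languages=["fr", "en"]` and a video that has only English tracks, this returns the manual English track when one exists and the auto-generated one otherwise.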

Output

One dataset item per video:

{
  "ok": true,
  "videoId": "jNQXAC9IVRw",
  "url": "https://www.youtube.com/watch?v=jNQXAC9IVRw",
  "title": "Me at the zoo",
  "channelName": "jawed",
  "durationSeconds": 19,
  "viewCount": 358000000,
  "language": "en",
  "captionKind": "manual",
  "availableLanguages": [{ "languageCode": "en", "kind": "manual", "name": "English" }],
  "segmentCount": 6,
  "text": "All right, so here we are, in front of the elephants...",
  "segments": [{ "start": 1.2, "duration": 2.16, "text": "All right, so here we are, in front of the elephants" }],
  "srt": "1\n00:00:01,200 --> 00:00:03,360\nAll right, so here we are...",
  "chunks": [{ "index": 0, "text": "...", "startSeconds": 1.2, "endSeconds": 18.5, "deepLink": "https://www.youtube.com/watch?v=jNQXAC9IVRw&t=1s" }]
}
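The `srt` string relates to `segments` in a simple way: each cue runs from `start` to `start + duration`, formatted as `HH:MM:SS,mmm`. If you only requested segments, a minimal conversion might look like the sketch below. The function names are made up for this example, and the actor's own SRT output may differ in details such as line wrapping.

```python
def format_srt_time(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Convert {start, duration, text} segments into a SubRip string."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = seg["start"]
        end = start + seg["duration"]
        blocks.append(
            f"{index}\n{format_srt_time(start)} --> {format_srt_time(end)}\n{seg['text']}"
        )
    # Cues are separated by a blank line; the file ends with a newline.
    return "\n\n".join(blocks) + "\n"
```

For the sample segment above (`start` 1.2, `duration` 2.16), the cue timing comes out as `00:00:01,200 --> 00:00:03,360`, matching the `srt` field in the example item.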

The dataset ships with three views: Transcripts (one row per video), RAG chunks (one row per chunk, with timestamps and deep links), and Subtitles (SRT/VTT).
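To make the RAG chunk shape concrete, here is a simplified sketch of overlapping chunking over timestamped segments. It is not the actor's algorithm: it cuts only at segment boundaries, so `max_chars` and `overlap_chars` (stand-ins for `chunkMaxChars` and `chunkOverlapChars`) are approximate limits, and the deep link is assumed to use the chunk's start time rounded down to whole seconds.

```python
def chunk_segments(video_id, segments, max_chars=1500, overlap_chars=200):
    """Group segments into overlapping chunks with timestamps and deep links."""
    chunks = []
    current = []  # segments in the chunk being built

    def flush():
        start = current[0]["start"]
        end = current[-1]["start"] + current[-1]["duration"]
        chunks.append({
            "index": len(chunks),
            "text": " ".join(s["text"] for s in current),
            "startSeconds": start,
            "endSeconds": end,
            "deepLink": f"https://www.youtube.com/watch?v={video_id}&t={int(start)}s",
        })

    for seg in segments:
        # Length of the chunk text if this segment were appended.
        length = sum(len(s["text"]) + 1 for s in current) + len(seg["text"])
        if current and length > max_chars:
            flush()
            # Carry trailing segments (up to overlap_chars) into the next chunk.
            carried, size = [], 0
            for prev in reversed(current):
                if size + len(prev["text"]) > overlap_chars:
                    break
                carried.insert(0, prev)
                size += len(prev["text"]) + 1
            current = carried
        current.append(seg)
    if current:
        flush()
    return chunks
```

Because each chunk keeps its own `startSeconds` and `deepLink`, a retrieval hit can be cited back to the exact moment in the video.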

Successful transcripts are written to the default dataset, and they are the only items you're billed for. Videos that couldn't be delivered (no captions, private/unavailable, bad URL) are listed with the reason in the run's key-value store under SKIPPED and are never charged. A per-run SUMMARY record counts successes, no-caption videos, and failures.

What you'll pay

$0.0075 per delivered transcript. No subscription, no minimums — skipped videos are never billed.

Transcripts delivered	You pay
1	1 × $0.0075 = $0.0075
100	100 × $0.0075 = $0.75
1,000	1,000 × $0.0075 = $7.50

The Pricing tab always shows the current rate.
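For budgeting, the arithmetic in the table can be wrapped in a small helper. This sketch hard-codes the $0.0075 rate quoted above, which may change; the function names are illustrative only.

```python
from decimal import Decimal

# USD per delivered transcript, as quoted above. Check the Pricing tab for the current rate.
PRICE_PER_TRANSCRIPT = Decimal("0.0075")


def estimate_cost(delivered):
    """Cost in USD for a number of delivered transcripts (skipped videos cost nothing)."""
    if delivered < 0:
        raise ValueError("delivered must be non-negative")
    return PRICE_PER_TRANSCRIPT * delivered


def max_transcripts(budget_usd):
    """Largest number of delivered transcripts that fits within a USD budget."""
    return int(Decimal(str(budget_usd)) // PRICE_PER_TRANSCRIPT)
```

`Decimal` is used instead of floats so that the totals match the table exactly rather than picking up binary rounding error.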

Use as an MCP tool / with AI agents

This actor is exposed as an MCP tool via Apify's MCP server (streamable HTTP):

https://mcp.apify.com/?tools=jamhimself/youtube-transcript-extractor

Example MCP client configuration:

{
  "mcpServers": {
    "youtube-transcripts": {
      "url": "https://mcp.apify.com/?tools=jamhimself/youtube-transcript-extractor"
    }
  }
}

Any MCP client — Claude, Cursor, or agent frameworks — can call this actor directly with the same input and output described above (authenticate with your Apify API token).

Why scrapers return empty transcripts (and why this one doesn't)

YouTube's caption (timedtext) endpoint now requires a proof-of-origin token (PoToken) minted by its BotGuard bot-check and bound to the specific video — without it, the endpoint returns an empty 200 response. That, plus aggressive datacenter-IP blocking, is why many free libraries and stale scrapers suddenly return no transcript at all.

This actor:

solves the BotGuard challenge and mints a fresh PoToken bound to each video ID,
tries a cheap datacenter IP first, then automatically escalates to rotating residential proxies,
is actively maintained with a daily canary watchdog — 100% success across our monitored canary runs since launch (2026-06-12).

If a video still returns no transcript, it's almost always genuine: captions disabled, private/members-only, region-locked, or a live stream without a caption track. Those are skipped and never billed.
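The "cheap tier first, then escalate" idea can be sketched generically as an ordered list of fetchers tried in turn. Everything here is hypothetical: `EmptyTranscript`, `fetch_with_escalation`, and the fetcher callables are placeholders, and the sketch does not show PoToken minting or any real proxy configuration.

```python
class EmptyTranscript(Exception):
    """Raised by a fetcher when the response carries no caption data."""


def fetch_with_escalation(video_id, tiers):
    """Try each (name, fetch) tier in order and return (tier_name, transcript).

    `tiers` is an ordered list such as
    [("datacenter", fetch_dc), ("residential", fetch_res)], where each
    fetch callable is a hypothetical function taking a video ID.
    """
    last_error = None
    for name, fetch in tiers:
        try:
            return name, fetch(video_id)
        except EmptyTranscript as exc:
            last_error = exc  # remember why this tier failed, then escalate
    raise EmptyTranscript(f"all tiers failed for {video_id}") from last_error
```

Ordering the tiers from cheapest to most expensive means the costlier route is only used for videos that actually need it.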

FAQ

Does it work for auto-generated captions? Yes — manual and ASR (auto) captions, in 100+ languages.

Playlists and channels? Pass individual video URLs for now. Playlist/channel expansion is coming.

Is this legal? It accesses publicly available caption data. You are responsible for complying with YouTube's Terms of Service and applicable copyright law in your use of the output.

Why do some videos return no transcript? In almost all cases the video has captions disabled, is private, members-only, age-restricted, or region-locked, or is a live stream without a caption track. These videos are skipped and never billed.

If this actor saved you time, an honest review helps others find it.

Questions or a format you need? Open an issue on the actor — it's actively maintained.

YouTube is a trademark of Google LLC. This actor is not affiliated with or endorsed by Google. Built and maintained by Jamhimself LLC.
