YouTube Transcript Scraper
Pricing
from $5.00 / 1,000 transcripts extracted
Extract YouTube video transcripts with timestamps, metadata, and LLM-ready text from individual URLs, bulk lists, channels, or search queries. Returns transcript segments, plain text, SRT, token count, and LLM context. Supports 50+ languages. 100% success rate on videos with English captions.
Developer
Khadin Akbar
What does YouTube Transcript Scraper do?
YouTube Transcript Scraper is the most versatile YouTube transcript tool on Apify — it handles individual videos, bulk URL lists, full channel archives, AND YouTube search queries all in a single actor. Built for AI pipelines, content researchers, SEO professionals, and developers who need structured transcript data at scale.
100% success rate on videos with English captions, measured on a test set of 20 diverse videos that included music videos, TED talks, coding tutorials, and full-length courses.
No coding required — just enter your video URLs, channel handle, or search query and get results in minutes. Returns 27 data fields including timestamped transcript segments, plain text, SRT subtitles, LLM-ready context strings, and token count estimates.
Compatible with: Apify MCP Server (Claude, ChatGPT, Cursor), LangChain, Make.com, Zapier, n8n, and direct API access.
How to extract YouTube video transcripts
- Click "Try for free" to open the Actor in Apify Console
- Choose your input mode:
  - Video URLs: paste one or hundreds of YouTube video links
  - Channel URL: enter a channel URL or @handle to get transcripts from all their videos
  - Search Query: enter a keyword and get transcripts from top search results
- Set the maximum number of videos to process
- Optionally select a transcript language (50+ languages supported with smart fallback)
- Click "Start" and wait for results
- Download your data as JSON, CSV, or Excel
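The same choices can be expressed as a JSON input object when the Actor is started programmatically. The sketch below builds such an object in Python. The field names (videoUrls, channelUrl, searchQuery, maxResults, language) are taken from the API examples later on this page; the `build_input` helper itself and the rule that exactly one mode is set per run are assumptions made for illustration, not documented behavior.

```python
from typing import Any, Dict, List, Optional


def build_input(
    *,
    video_urls: Optional[List[str]] = None,
    channel_url: Optional[str] = None,
    search_query: Optional[str] = None,
    max_results: int = 10,
    language: str = "en",
) -> Dict[str, Any]:
    """Build an Actor input dict for one input mode (hypothetical helper)."""
    modes = {
        "videoUrls": video_urls,
        "channelUrl": channel_url,
        "searchQuery": search_query,
    }
    # Keep only the mode that was actually provided.
    chosen = {key: value for key, value in modes.items() if value}
    # Assumption: a run uses exactly one input mode.
    if len(chosen) != 1:
        raise ValueError(
            "Provide exactly one of video_urls, channel_url, search_query"
        )
    if max_results < 1:
        raise ValueError("max_results must be at least 1")
    return {**chosen, "maxResults": max_results, "language": language}
```

For example, `build_input(channel_url="https://www.youtube.com/@mkbhd", max_results=20)` yields an object that could be passed as the run input in the Python client example further down.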
Using with AI agents (Claude, ChatGPT, Cursor)
Connect via the Apify MCP Server and ask naturally:
"Get me the transcript of this YouTube video: https://youtube.com/watch?v=..."
"Extract transcripts from the last 20 videos on the @mkbhd channel"
"Find transcripts of videos about GPT-5 reviews"
The AI agent will automatically select and run this actor, returning structured results with the llmContext field ready for direct prompt injection.
What data can you extract from YouTube transcripts?
| Field | Type | Description |
|---|---|---|
| videoId | string | Unique 11-character YouTube video ID |
| videoUrl | string | Full YouTube watch URL |
| title | string | Video title |
| channelName | string | Channel name |
| channelId | string | Channel ID (UC...) |
| publishedAt | string | Upload date (ISO 8601) |
| durationSeconds | number | Video length in seconds |
| viewCount | number | Total views |
| likeCount | number | Total likes |
| commentCount | number | Total comments |
| thumbnail | string | Highest-res thumbnail URL |
| tags | string[] | Video SEO tags |
| transcriptSegments | array | Timestamped segments [{text, start, duration}] |
| transcriptText | string | Full concatenated plain text |
| transcriptSrt | string | SRT subtitle format |
| llmContext | string | Pre-formatted for AI/LLM prompt injection |
| tokenEstimate | number | Approximate GPT-4 token count |
| wordCount | number | Total words in transcript |
| languageUsed | string | ISO 639-1 code of transcript returned |
| isAutoGenerated | boolean | Auto-generated vs manual captions |
| availableLanguages | string[] | All caption languages available |
| inputMode | string | How this video was found (url/channel/search) |
| status | string | success / no_transcript / private_video / error |
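To show how the transcriptSegments shape relates to the SRT output, here is a minimal sketch that converts a list of {text, start, duration} segments into SRT text. It assumes start and duration are expressed in seconds, which this page does not state explicitly, and it is not the Actor's own implementation; the transcriptSrt field already provides this output ready-made.

```python
from typing import Dict, List, Union

Segment = Dict[str, Union[str, float]]


def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: List[Segment]) -> str:
    """Convert [{text, start, duration}] segments to SRT text.

    Assumption: start and duration are in seconds.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = float(seg["start"])
        end = start + float(seg["duration"])
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{seg['text']}"
        )
    # SRT cues are separated by a blank line.
    return "\n\n".join(blocks) + "\n" if blocks else ""
```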
How much does YouTube Transcript Scraper cost?
$0.004 per transcript — that's 250 transcripts for just $1.00.
| Volume | Cost | Price per transcript |
|---|---|---|
| 100 transcripts | $0.40 | $0.004 |
| 1,000 transcripts | $4.00 | $0.004 |
| 10,000 transcripts | $40.00 | $0.004 |
No startup fees. No hidden charges. You only pay for successfully extracted transcripts. Failed or unavailable videos are not charged. Residential proxy data costs are billed separately by Apify.
Cheapest on Apify Store: competitors charge $0.005–$0.010 per transcript, so at $0.004 we're 20–60% cheaper with the same or better success rate.
Free Apify trial includes $5 in platform credits — enough for ~1,250 transcripts to test with.
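The arithmetic behind these figures can be checked with a short sketch. It uses the $0.004 per-transcript rate quoted in this section and counts only successfully extracted transcripts; residential proxy costs, which are billed separately, are not included. The function names are illustrative.

```python
from decimal import Decimal

# Per-transcript rate quoted in this section (USD).
PRICE_PER_TRANSCRIPT = Decimal("0.004")


def estimate_cost(successful_transcripts: int) -> Decimal:
    """Cost in USD for a number of successfully extracted transcripts."""
    return PRICE_PER_TRANSCRIPT * successful_transcripts


def transcripts_for_budget(budget_usd: str) -> int:
    """How many transcripts a budget covers, excluding proxy costs."""
    return int(Decimal(budget_usd) // PRICE_PER_TRANSCRIPT)
```

With these definitions, `estimate_cost(10_000)` equals 40 dollars and `transcripts_for_budget("5")` gives 1250, matching the table and the trial-credit estimate above.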
YouTube transcript extraction use cases
- AI Pipelines & RAG Systems: The llmContext field is pre-formatted for direct injection into LLM prompts. The tokenEstimate field helps manage context window limits. Feed transcripts into ChatGPT, Claude, or any RAG pipeline via MCP.
- Content Repurposing: Turn video content into blog posts, social media threads, newsletters, or ebooks. Extract from an entire channel and repurpose at scale.
- SEO & Content Research: Analyze competitor YouTube channels, extract keyword-rich transcripts for SEO content, and discover trending topics via search mode.
- Academic Research: Access and analyze spoken content from lectures, interviews, and presentations. Batch-process entire channels or search by topic.
- Subtitle Generation: The transcriptSrt field returns a ready-to-use SRT subtitle file. Import directly into video editors, YouTube Studio, or caption services.
- Market Research & Brand Monitoring: Search for transcripts mentioning your brand, competitors, or industry keywords. Analyze sentiment and messaging at scale.
- Training Data: Build datasets for AI/ML model training from YouTube's massive spoken content library. Filter by language, topic, and date range.
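For the AI pipeline use case, one way to use tokenEstimate is to pack as many transcripts as fit into a fixed token budget before building a prompt. The sketch below is a simple greedy selection over dataset items. The helper names and the budget value are hypothetical, and because tokenEstimate is described as approximate, it is probably wise to leave some headroom below the model's real context limit.

```python
from typing import Any, Dict, List, Tuple

Item = Dict[str, Any]


def select_within_budget(items: List[Item], max_tokens: int) -> Tuple[List[Item], int]:
    """Greedily pick successful items whose tokenEstimate fits the budget."""
    selected: List[Item] = []
    used = 0
    for item in items:
        # Skip videos that produced no transcript.
        if item.get("status") != "success":
            continue
        cost = int(item.get("tokenEstimate", 0))
        # Skip an item that would overflow; smaller later items may still fit.
        if used + cost > max_tokens:
            continue
        selected.append(item)
        used += cost
    return selected, used


def build_prompt(question: str, items: List[Item]) -> str:
    """Join the llmContext strings and append the user's question."""
    context = "\n\n---\n\n".join(item["llmContext"] for item in items)
    return f"{context}\n\nQuestion: {question}"
```

A typical call would be `selected, used = select_within_budget(items, 8000)` followed by `build_prompt("Summarize the main points.", selected)`.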
YouTube Transcript Scraper API & Integration
REST API
    curl -X POST "https://api.apify.com/v2/acts/khadinakbar~youtube-transcript-extractor/runs" \
      -H "Authorization: Bearer YOUR_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"videoUrls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"], "maxResults": 5}'
JavaScript / TypeScript
    import { ApifyClient } from 'apify-client';

    const client = new ApifyClient({ token: 'YOUR_TOKEN' });

    const run = await client.actor('khadinakbar/youtube-transcript-extractor').call({
        searchQuery: 'how to use ChatGPT for business',
        maxResults: 10,
        language: 'en',
        outputFormat: 'all',
    });

    const { items } = await client.dataset(run.defaultDatasetId).listItems();

    // Each item has llmContext ready for AI pipelines
    for (const item of items) {
        console.log(item.title, '|', item.tokenEstimate, 'tokens');
        console.log(item.llmContext); // Inject directly into your LLM prompt
    }
Python
    from apify_client import ApifyClient

    client = ApifyClient('YOUR_TOKEN')

    # Run the Actor in channel mode and wait for the run to finish
    run = client.actor('khadinakbar/youtube-transcript-extractor').call(run_input={
        'channelUrl': 'https://www.youtube.com/@mkbhd',
        'maxResults': 20,
        'language': 'en',
    })

    # Read every item from the run's default dataset
    items = list(client.dataset(run['defaultDatasetId']).iterate_items())
    for item in items:
        print(f"{item['title']} - {item['wordCount']} words, {item['tokenEstimate']} tokens")
Integrations: Apify MCP Server, LangChain, Make.com, Zapier, n8n, Google Sheets, Slack, Webhooks
FAQ — YouTube Transcript Scraping
Q: What YouTube URL formats are supported?
A: All standard formats: youtube.com/watch?v=, youtu.be/, youtube.com/shorts/, youtube.com/embed/, and youtube.com/v/. For channels: youtube.com/@handle, youtube.com/channel/UCxxxx, and youtube.com/c/name.
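If you want to normalize video links on your side before submitting them, the formats listed above can be reduced to a video ID with the standard library. This is a rough sketch and not the Actor's own parser. The 11-character length comes from the videoId field description, while the allowed character set (letters, digits, hyphen, underscore) is an assumption.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: IDs are 11 characters drawn from letters, digits, "-" and "_".
VIDEO_ID_RE = re.compile(r"^[A-Za-z0-9_-]{11}$")


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a supported YouTube video URL, else None."""
    if "://" not in url:
        url = "https://" + url  # allow scheme-less input such as youtu.be/ID
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]

    candidate = ""
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com":
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        else:
            parts = parsed.path.strip("/").split("/")
            if len(parts) >= 2 and parts[0] in {"shorts", "embed", "v"}:
                candidate = parts[1]
    return candidate if VIDEO_ID_RE.match(candidate) else None
```

Channel URLs such as youtube.com/@handle return None here, since they do not identify a single video.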
Q: What if a video has no transcript?
A: The actor returns a structured response with status: "no_transcript" and lists availableLanguages (which will be empty). Your pipeline can handle this gracefully without crashing.
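In a downstream pipeline, this usually means branching on the status field before touching the transcript fields. A minimal sketch, assuming items are plain dicts as returned by the dataset and treating a missing status as an error:

```python
from collections import defaultdict
from typing import Any, Dict, List

Item = Dict[str, Any]


def group_by_status(items: List[Item]) -> Dict[str, List[Item]]:
    """Group dataset items by their status value."""
    groups: Dict[str, List[Item]] = defaultdict(list)
    for item in items:
        # Assumption: an item without a status is treated as an error.
        groups[item.get("status", "error")].append(item)
    return dict(groups)
```

Items under the "success" key can then be processed normally, while the "no_transcript" and "private_video" groups can be logged or retried later.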
Q: How does language fallback work?
A: The actor tries your requested language first, then falls back to English, then to any manually uploaded caption, and finally to auto-generated captions. The languageUsed field tells you exactly which language was returned.
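The fallback order described above can be sketched as a small selection function. The track structure used here (dicts with language and isAutoGenerated keys) is hypothetical and only serves to illustrate the order; the Actor's internal logic may differ, for example in how it chooses between manual and auto-generated captions within the same language.

```python
from typing import Any, Callable, Dict, List, Optional

Track = Dict[str, Any]


def pick_track(requested: str, tracks: List[Track]) -> Optional[Track]:
    """Pick a caption track following the documented fallback order."""

    def first(predicate: Callable[[Track], bool]) -> Optional[Track]:
        return next((t for t in tracks if predicate(t)), None)

    return (
        first(lambda t: t["language"] == requested)   # 1. requested language
        or first(lambda t: t["language"] == "en")     # 2. English
        or first(lambda t: not t["isAutoGenerated"])  # 3. any manual caption
        or first(lambda t: t["isAutoGenerated"])      # 4. auto-generated
    )
```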
Q: Can I use this with Claude or ChatGPT?
A: Yes! Connect via the Apify MCP Server and ask for transcripts in natural language. The AI agent will automatically run this actor and receive the llmContext field formatted for direct use.
Q: How fast is it?
A: Approximately 5-10 transcripts per minute for individual URLs, depending on video length and network conditions. Channel and search modes include a discovery step before transcript extraction.
Q: Is this legal?
A: This actor only extracts publicly available transcript/caption data that anyone can access through a web browser. See Apify's guide on web scraping legality.
Q: Can I schedule recurring runs?
A: Yes. Use Apify's built-in scheduler to run daily, weekly, or at any custom interval. Results can be sent to webhooks, email, or integrated apps.
Q: What's the difference between transcriptText, transcriptSrt, and llmContext?
A: transcriptText is pure plain text (no timestamps). transcriptSrt is formatted as an SRT subtitle file. llmContext combines the video title, channel, date, views, and full transcript into a single string optimized for LLM prompt injection.
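To make the difference concrete, the sketch below assembles an llmContext-style string from the other fields of an item. The exact layout the Actor produces is not documented on this page, so the header labels and ordering here are purely an assumption; in practice you would read the llmContext field directly.

```python
from typing import Any, Dict


def build_llm_context(item: Dict[str, Any]) -> str:
    """Combine metadata and transcriptText into one prompt-ready string.

    Hypothetical layout: the Actor's real llmContext format may differ.
    """
    header = [
        f"Title: {item['title']}",
        f"Channel: {item['channelName']}",
        f"Published: {item['publishedAt']}",
        f"Views: {item['viewCount']}",
    ]
    return "\n".join(header) + "\n\nTranscript:\n" + item["transcriptText"]
```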