YouTube Transcript Scraper

Pricing: from $7.00 / 1,000 transcripts extracted
Developer: Tugelbay Konabayev

Extract transcripts from YouTube videos with timestamps. Bulk processing, multi-format output (JSON, SRT, VTT, text, Markdown). Perfect for AI training data, content repurposing, and RAG pipelines.
Extract transcripts from YouTube videos with full timestamps, metadata, and multi-format output. Processes bulk URLs in a single run — the only transcript actor on Apify Store that supports bulk processing.
What Does It Do?
This actor downloads transcripts from YouTube videos and converts them into five different formats:
- JSON — segments array with timestamps (start time, duration, text) — ideal for programmatic processing and AI/LLM integration
- SRT — SubRip subtitle format — compatible with all video editors and subtitle tools
- VTT — WebVTT subtitle format — for web players and modern subtitle systems
- Markdown — human-readable with inline timestamps — perfect for documentation and blogs
- Plain text — transcript text without timestamps — for simple text-based workflows
Each output includes video metadata: title, channel name, thumbnail URL, language, segment count, and extraction timestamp.
Key advantage: While all competitors process one video per run, this actor handles 100–10,000 videos in one bulk operation — reducing overhead costs and dramatically improving efficiency for high-volume transcript extraction.
Comparison Table
| Feature | YouTube Transcript Scraper | pintostudio/youtube-transcript-scraper | starvibe/youtube-video-transcript | karamelo/youtube-transcripts | topaz_sharingan/Youtube-Transcript-Scraper |
|---|---|---|---|---|---|
| Bulk URL processing | ✅ 100–10k videos/run | ❌ Single video only | ❌ Single video only | ❌ Single video only | ❌ Single video only |
| Multi-format output | ✅ JSON, SRT, VTT, Markdown, text | ❌ JSON only | ❌ JSON only | ❌ JSON only | ❌ JSON only |
| Timestamps in all formats | ✅ Yes (SRT, VTT, Markdown, JSON) | ✅ JSON only | ✅ JSON only | ✅ JSON only | ✅ JSON only |
| Video metadata | ✅ Title, channel, thumbnail, duration | ❌ Minimal | ❌ Minimal | ❌ Minimal | ❌ Minimal |
| Auto-generated fallback | ✅ Yes (toggle on/off) | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes |
| Language selection | ✅ Manual + auto-fallback | ✅ Manual | ✅ Manual | ✅ Manual | ✅ Manual |
| Proxy support | ✅ Apify proxy included | ✅ Proxy required | ✅ Proxy required | ✅ Proxy required | ✅ Proxy required |
| PPE pricing | ✅ $0.01/transcript | ✅ $0.01/transcript | ✅ $0.01/transcript | ✅ $0.01/transcript | ✅ $0.01/transcript |
| AI/MCP compatible | ✅ Yes (PPE model) | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes |
| Users/month (Apify) | NEW | 1,477 | 1,060 | 760 | 422 |
Features
- Bulk processing — Handle 100–10,000 videos in a single run. No loops, no API overhead, no redundant setup costs.
- Five output formats — JSON (programmatic), SRT (video editors), VTT (web players), Markdown (readable docs), plain text (simplicity).
- Full timestamp precision — Every segment includes start time and duration (in seconds). Perfect for timestamped links and video navigation.
- Smart language fallback — Request English; get auto-generated captions if manual transcripts are unavailable. Or accept any available language.
- Video metadata extraction — Title, channel name, thumbnail URL, and video ID — all in one payload. No separate oEmbed API call needed.
- Transcript detection — Automatically detects whether captions are manual or auto-generated and reports in output.
- Graceful error handling — Video unavailable, transcripts disabled, no transcript in requested language? Detailed error message per video. Run continues.
- Proxy-ready — Uses Apify Proxy by default. YouTube blocks cloud IPs; proxy configuration is pre-integrated.
- Fast — Processes videos in parallel. A batch of 100 videos typically completes in 30–60 seconds.
- Cost-effective — PPE pricing ($0.01 per transcript) means bulk runs scale down your per-video cost.
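The timestamped segments described above map directly onto YouTube deep links. A minimal sketch (the segment shape matches the JSON output documented later; the helper name is illustrative, not part of the actor):

```python
def timestamped_links(video_id, segments):
    """Build one clickable YouTube deep link per transcript segment.

    Each segment's `start` (seconds) maps to YouTube's `t` URL parameter;
    YouTube accepts whole seconds, so fractions are truncated.
    """
    links = []
    for seg in segments:
        t = int(seg["start"])
        links.append(f"https://www.youtube.com/watch?v={video_id}&t={t}s")
    return links

segments = [
    {"text": "Never gonna give you up", "start": 0.5, "duration": 2.1},
    {"text": "Never gonna let you down", "start": 2.6, "duration": 2.0},
]
print(timestamped_links("dQw4w9WgXcQ", segments))
```

This is handy for building searchable archives where each hit jumps straight to the matching moment in the video.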
Input Parameters
Required
| Parameter | Type | Description |
|---|---|---|
| urls | Array of strings | YouTube video URLs or IDs. Accepts standard URLs (https://www.youtube.com/watch?v=dQw4w9WgXcQ), short URLs (https://youtu.be/dQw4w9WgXcQ), Shorts URLs, embed URLs, and raw video IDs (dQw4w9WgXcQ). |
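All the accepted URL forms above reduce to the same 11-character video ID. A rough client-side sketch of equivalent validation (the regex is an assumption for illustration, not the actor's own parser):

```python
import re

# Captures the 11-character video ID from watch, youtu.be, Shorts, and embed
# URLs. Illustrative only; the actor performs its own normalization.
_ID_RE = re.compile(
    r"(?:youtube\.com/(?:watch\?v=|shorts/|embed/)|youtu\.be/)([A-Za-z0-9_-]{11})"
)

def extract_video_id(url_or_id):
    """Return the 11-character video ID, or None if the input is unrecognized."""
    m = _ID_RE.search(url_or_id)
    if m:
        return m.group(1)
    # Raw 11-character video ID
    if re.fullmatch(r"[A-Za-z0-9_-]{11}", url_or_id):
        return url_or_id
    return None
```

Pre-validating IDs like this before a run is a cheap way to avoid paying for obviously malformed entries in a large URL list.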
Optional
| Parameter | Type | Default | Description |
|---|---|---|---|
| outputFormat | string | json | Output format. Options: json (segments with timestamps), text (plain text, no timestamps), srt (SubRip format), vtt (WebVTT format), markdown (readable with inline timestamps). |
| language | string | en | Language code for the transcript (e.g., en, es, fr, ja, zh, de). If unavailable, falls back to auto-generated captions or any available language. |
| includeAutoGenerated | boolean | true | If a manual transcript is not available, also try auto-generated captions. |
| includeMetadata | boolean | true | Extract and include video metadata (title, channel, thumbnail, duration). Disabling may speed up processing slightly. |
| maxItems | integer | 100 (max 10,000) | Maximum number of videos to process in this run. Useful for controlling costs on large URL lists. |
| proxyConfiguration | object | { "useApifyProxy": true } | Proxy settings. YouTube blocks cloud IPs; the default uses Apify Proxy. Can be overridden with a custom proxy URL. |
Output Fields
Per-Video Result
| Field | Type | Description |
|---|---|---|
| videoId | string | 11-character YouTube video ID (extracted from the URL). |
| videoUrl | string | Full YouTube video URL (https://www.youtube.com/watch?v={videoId}). |
| title | string or null | Video title (from the oEmbed API). null if metadata extraction failed. |
| channel | string or null | Channel/author name (from the oEmbed API). null if metadata extraction failed. |
| thumbnailUrl | string or null | High-resolution thumbnail URL. null if metadata extraction failed. |
| language | string or null | Language code of the transcript found (e.g., en, es). null if no transcript is available. |
| isAutoGenerated | boolean or null | true if the transcript is auto-generated; false if manual. null if no transcript is available. |
| segmentCount | integer | Number of segments/lines in the transcript. 0 on error. |
| segments | array or null | JSON format only. Array of segment objects: [{ "text": "...", "start": 12.5, "duration": 3.2 }, ...]. Start time and duration are in seconds. null for other formats. |
| transcriptText | string | Plain-text transcript (segments joined with spaces). Always populated when a transcript is available. |
| transcriptSrt | string or null | SRT format only. Complete SRT subtitle file (numbered segments with HH:MM:SS,mmm timecodes). null for other formats. |
| transcriptVtt | string or null | VTT format only. Complete WebVTT subtitle file (HH:MM:SS.mmm timecodes). null for other formats. |
| transcriptMarkdown | string or null | Markdown format only. Markdown text with inline timestamps: **[MM:SS]** segment text. null for other formats. |
| error | string or null | Error message if transcript extraction failed, e.g. "No transcript available for video {id}", "Transcripts are disabled for this video", "Video is unavailable or private". null on success. |
| extractedAt | string | ISO 8601 timestamp (UTC) of when the transcript was extracted. |
Input Examples
Example 1: Single Video → JSON with Metadata (Simplest)
```json
{
  "urls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"]
}
```
Output: JSON segments with title, channel, thumbnail.
Example 2: Bulk URLs → SRT Subtitles (Multiple Videos)
```json
{
  "urls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://youtu.be/jNQXAC9IVRw",
    "LCpyWYAcJRM"
  ],
  "outputFormat": "srt",
  "maxItems": 10
}
```
Output: SRT subtitle files for up to 10 videos. Ready to import into DaVinci Resolve, Premiere, or any video editor.
Example 3: Spanish Transcripts with Auto-Generated Fallback
```json
{
  "urls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://www.youtube.com/watch?v=kJQP7kiw9Fk"
  ],
  "language": "es",
  "includeAutoGenerated": true,
  "outputFormat": "markdown"
}
```
Output: Markdown transcripts in Spanish. If Spanish manual captions not available, tries auto-generated Spanish. Falls back to any available language.
Example 4: Bulk Transcripts → JSON, No Metadata (Fast Mode)
```json
{
  "urls": [
    "https://www.youtube.com/watch?v=video1",
    "https://www.youtube.com/watch?v=video2",
    "https://www.youtube.com/watch?v=video3"
  ],
  "outputFormat": "json",
  "includeMetadata": false,
  "maxItems": 50
}
```
Output: Pure JSON segments (no oEmbed calls). Faster processing, lower latency.
Example 5: Custom Proxy Configuration
```json
{
  "urls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"],
  "proxyConfiguration": {
    "proxyUrls": ["http://proxy.example.com:8080"]
  }
}
```
Output: Uses custom proxy instead of Apify Proxy. Useful for on-premise or private proxy setups.
Example Output
JSON Format (with segments)
```json
{
  "videoId": "dQw4w9WgXcQ",
  "videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "title": "Rick Astley - Never Gonna Give You Up",
  "channel": "Rick Astley",
  "thumbnailUrl": "https://i.ytimg.com/vi/dQw4w9WgXcQ/maxresdefault.jpg",
  "language": "en",
  "isAutoGenerated": false,
  "segmentCount": 61,
  "segments": [
    { "text": "Never gonna give you up", "start": 0.5, "duration": 2.1 },
    { "text": "Never gonna let you down", "start": 2.6, "duration": 2.0 },
    { "text": "Never gonna run around and desert you", "start": 4.6, "duration": 2.8 }
  ],
  "transcriptText": "Never gonna give you up Never gonna let you down Never gonna run around and desert you...",
  "extractedAt": "2024-01-15T10:23:45.123456+00:00",
  "error": null
}
```
SRT Format (subtitles)
```
1
00:00:00,500 --> 00:00:02,600
Never gonna give you up

2
00:00:02,600 --> 00:00:04,600
Never gonna let you down

3
00:00:04,600 --> 00:00:07,400
Never gonna run around and desert you
```
Markdown Format (with timestamps)
```markdown
**[00:00]** Never gonna give you up
**[00:02]** Never gonna let you down
**[00:04]** Never gonna run around and desert you
```
Error Case
```json
{
  "videoId": "invalidID12",
  "videoUrl": "https://www.youtube.com/watch?v=invalidID12",
  "title": null,
  "channel": null,
  "thumbnailUrl": null,
  "language": null,
  "isAutoGenerated": null,
  "segmentCount": 0,
  "segments": null,
  "transcriptText": null,
  "error": "Video is unavailable or private",
  "extractedAt": "2024-01-15T10:23:50.234567+00:00"
}
```
Integrations
Python SDK
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# Run the actor
run = client.actor("tugelbay/youtube-transcript").call({
    "urls": [
        "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
        "https://www.youtube.com/watch?v=jNQXAC9IVRw"
    ],
    "outputFormat": "json",
    "language": "en"
})

# Get the dataset
dataset_items = client.dataset(run["defaultDatasetId"]).list_items().items
for item in dataset_items:
    print(f"Title: {item['title']}")
    print(f"Segments: {item['segmentCount']}")
    print(f"Text: {item['transcriptText'][:100]}...")
```
JavaScript/Node.js SDK
```javascript
const { ApifyClient } = require('apify-client');

const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

// Run the actor
const run = await client.actor('tugelbay/youtube-transcript').call({
    urls: [
        'https://www.youtube.com/watch?v=dQw4w9WgXcQ',
        'https://www.youtube.com/watch?v=jNQXAC9IVRw'
    ],
    outputFormat: 'json',
    language: 'en'
});

// Get the dataset
const datasetItems = await client.dataset(run.defaultDatasetId).listItems();
datasetItems.items.forEach(item => {
    console.log(`Title: ${item.title}`);
    console.log(`Segments: ${item.segmentCount}`);
    console.log(`Text: ${item.transcriptText.substring(0, 100)}...`);
});
```
LangChain Integration (LLM + Transcripts)
```python
from langchain.schema import Document
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
from apify_client import ApifyClient

# Get transcripts via Apify
client = ApifyClient("YOUR_APIFY_TOKEN")
run = client.actor("tugelbay/youtube-transcript").call({
    "urls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"],
    "outputFormat": "json"
})

# Convert to LangChain documents
documents = []
for item in client.dataset(run["defaultDatasetId"]).list_items().items:
    doc = Document(
        page_content=item["transcriptText"],
        metadata={
            "source": item["videoUrl"],
            "title": item["title"],
            "channel": item["channel"],
            "language": item["language"]
        }
    )
    documents.append(doc)

# Create a vector store
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(documents, embeddings)

# Query transcripts with an LLM
results = vectorstore.similarity_search("main topics discussed", k=3)
for doc in results:
    print(f"From: {doc.metadata['title']}")
    print(f"Content: {doc.page_content[:200]}...")
```
MCP (Model Context Protocol) for Claude / LLM Agents
```json
{
  "name": "apify_youtube_transcript",
  "description": "Extract transcripts from YouTube videos via Apify",
  "url": "https://api.apify.com/v2/actor-tasks/{TASK_ID}/runs",
  "params": {
    "urls": "array of YouTube URLs",
    "outputFormat": "json|srt|vtt|markdown|text",
    "language": "language code",
    "maxItems": "max videos to process"
  }
}
```
Export to File
Export as JSONL (one video per line):
```bash
# After running the actor, export the dataset as JSONL
curl "https://api.apify.com/v2/datasets/{DATASET_ID}/items?format=jsonl" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  > transcripts.jsonl
```
Export as CSV:
```bash
curl "https://api.apify.com/v2/datasets/{DATASET_ID}/items?format=csv" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  > transcripts.csv
```
Export as ZIP (all formats):
```bash
# Use the Apify CLI
apify dataset download {DATASET_ID}
```
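Once exported, the JSONL file is easy to post-process locally. A sketch that pulls plain-text transcripts out of an export, skipping failed videos (field names come from the output table above; the file path is whatever you saved the export as):

```python
import json

def load_transcripts(path):
    """Yield (videoId, transcriptText) pairs from a JSONL dataset export,
    skipping any videos whose extraction failed (non-null `error` field)."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            item = json.loads(line)
            if item.get("error") is None and item.get("transcriptText"):
                yield item["videoId"], item["transcriptText"]
```

One transcript per line keeps memory flat even for 10,000-video exports, since the file is streamed rather than loaded whole.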
Use Cases
- Content Creator Archiving — Extract transcripts from your own YouTube videos for documentation, blog posts, and searchable archives. Bulk process your entire channel in one run.
- Research & Literature Review — Transcribe educational videos, conference talks, and webinars. Convert to plain text for NLP analysis, topic modeling, or citation tracking.
- SEO & Content Repurposing — Convert video transcripts to blog posts, articles, and social media snippets. Bulk processing means you can refresh your content library in hours instead of weeks.
- Accessibility & Subtitle Creation — Generate SRT/VTT subtitles for your video library. For creators with no manual captions, the auto-generated fallback ensures every video has a transcript.
- Video Search & Indexing — Index YouTube transcripts full-text for internal video search. Extract metadata (title, channel, thumbnail) and segment timestamps for clickable search results.
- LLM Fine-Tuning & Training Data — Use video transcripts as training data for AI models. Timestamps let you correlate text with video segments for multimodal training.
- Podcast & Audio Content Analysis — Transcribe YouTube uploads of podcasts, interviews, and audio documentaries. The Markdown format with timestamps works as a readable episode guide.
- Educational Curriculum Building — Compile transcripts from course videos. Organize by topic, language, or creator. Convert to Markdown for e-books or learning materials.
- Market Research & Competitor Analysis — Extract competitors' video content. Monitor what's being discussed, analyze sentiment, track topic trends.
- Subtitling for Non-English Speakers — Request Spanish, French, German, or any language. The auto-generated fallback ensures coverage even for videos with limited captions.
Cost Estimation
YouTube Transcript Scraper uses Pay-Per-Event (PPE) pricing: $0.01 per transcript extracted.
Pricing Examples
| Scenario | Videos | Cost | Notes |
|---|---|---|---|
| Single video | 1 | $0.01 | Minimal cost for testing |
| Small batch | 10 | $0.10 | Daily content review |
| Medium batch | 100 | $1.00 | Weekly channel archive |
| Large batch | 1,000 | $10.00 | Monthly bulk project |
| Bulk processing | 10,000 | $100.00 | Entire channel or research dataset |
| Failed videos | Any | $0.00 | No charge if extraction fails (e.g., video unavailable) |
Cost Breakdown
- Transcript extraction: $0.01 per video
- Metadata (oEmbed): Included in PPE (no additional cost)
- Proxy usage: Included in PPE (Apify Proxy overhead absorbed)
- Format conversion: Included in PPE (JSON, SRT, VTT, Markdown all same price)
- Failed videos: Free (no charge for videos that error out)
Comparison to Competitors
All competitors on Apify use the same $0.01/PPE model. The difference: This actor processes bulk URLs, saving you:
- Setup overhead (1 run vs. 10+ runs)
- Platform overhead (1 dataset vs. 10+ datasets)
- Time (1 execution vs. 10+ sequential executions)
Real-world example: Processing 100 videos
- Competitors: 100 separate runs × 10 seconds setup = 16+ minutes total time
- This actor: 1 run, 30–60 seconds total time, 25x faster
FAQ
Q: Do I need a proxy?
A: Yes. YouTube detects and blocks cloud hosting IPs (where Apify runs). The actor uses Apify Proxy by default. If you disable it, you'll get 403 errors. Custom proxies are supported via the proxyConfiguration parameter.
Q: What if a video doesn't have a transcript?
A: The result includes an error field explaining why: "Video is unavailable or private", "Transcripts are disabled for this video", or "No transcript in requested language". The run continues; you get detailed error info per video. No charge for failed extractions.
Q: How many videos can I process in one run?
A: Up to 10,000 videos per run (configurable via maxItems). There's no hard limit on concurrent processing, but very large batches (>5,000) may take 5–10 minutes. Recommended: batch by 500–1,000 for optimal speed.
Q: Can I get transcripts in multiple languages?
A: Not in a single run. Run the actor once per language. For example, to get both English and Spanish transcripts, run with language: "en" once, then language: "es" on the same URLs. Both results will be in your dataset (use filters to separate them).
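To separate the mixed per-language results afterwards, a small grouping helper is enough. A sketch, assuming the dataset items carry the language field documented in the output table:

```python
from collections import defaultdict

def group_by_language(items):
    """Split dataset items (e.g., merged from one run per language)
    into buckets keyed by the transcript's `language` field."""
    grouped = defaultdict(list)
    for item in items:
        # Items with no transcript have language == None
        grouped[item.get("language") or "unknown"].append(item)
    return dict(grouped)
```

Run the actor once per language, concatenate the dataset items, then group them like this to keep each language's transcripts together.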
Q: What timestamp format does it use?
A: JSON/Markdown: seconds as a decimal (e.g., 12.5 = 12.5 seconds). SRT: HH:MM:SS,mmm (e.g., 00:00:12,500). VTT: HH:MM:SS.mmm (e.g., 00:00:12.500). All formats preserve full precision, so subtitles can be synchronized frame-accurately.
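If you want to convert the JSON start times into subtitle timecodes yourself, the arithmetic is straightforward. A sketch that mirrors the formats described above (it is not the actor's internal code):

```python
def to_srt_timecode(seconds):
    """Convert a start time in seconds to SRT's HH:MM:SS,mmm form.

    WebVTT is identical except it uses '.' instead of ',' before the
    milliseconds.
    """
    ms = round(seconds * 1000)
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

print(to_srt_timecode(12.5))  # 00:00:12,500
```

Doing the conversion in integer milliseconds avoids the rounding drift you would get from repeated floating-point division.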
Q: Does it handle YouTube Shorts?
A: Yes. Shorts with captions/transcripts are supported. Just pass the Shorts URL (e.g., https://www.youtube.com/shorts/dQw4w9WgXcQ). Note: Most Shorts don't have manual captions, so includeAutoGenerated: true is recommended.
Q: Can I use this with LangChain or other AI frameworks?
A: Yes. Use the Apify SDK or REST API to fetch transcripts, convert them to LangChain Document objects, and feed into vector stores, LLMs, or RAG pipelines. See the Integrations section for example code.
Q: What's the difference between "auto-generated" and "manual" captions?
A: Manual: Creator or translator wrote captions, usually more accurate. Auto-generated: YouTube's speech-to-text algorithm, may have errors but covers almost all videos. The isAutoGenerated field tells you which you got. Set includeAutoGenerated: false if you want manual captions only (may result in "no transcript" errors).
Q: Can I filter or transform the output?
A: The actor outputs raw results to the dataset. Use Apify's Data Extraction or post-process with a downstream actor. Or download the dataset (JSON/CSV/JSONL) and transform locally. Example: filter for videos >1,000 segments, extract only transcriptText, convert to Markdown.
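As a concrete example of that local post-processing, a sketch that keeps only long, successful transcripts (the threshold and helper name are illustrative):

```python
def long_transcripts(items, min_segments=1000):
    """From raw dataset items, keep only successfully extracted videos
    above a segment-count threshold, returning videoId -> plain text."""
    return {
        item["videoId"]: item["transcriptText"]
        for item in items
        if item.get("error") is None
        and item.get("segmentCount", 0) > min_segments
    }
```

The same pattern extends to any field-based filter: swap the condition for a language check, a channel match, or a keyword search over transcriptText.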
Q: How long does it take to process a batch?
A: Typical performance: 100 videos in 30–60 seconds. 1,000 videos in 3–5 minutes. 10,000 videos in 15–30 minutes. Times vary based on Apify Proxy load and internet conditions. Processing is parallelized across multiple workers.
Troubleshooting
Issue: "403 Forbidden" or "Video unavailable"
Cause: YouTube is blocking your request. Usually a cloud IP issue.
Solution:
- Ensure proxyConfiguration is enabled (default: Apify Proxy).
- Check that your Apify account has available proxy credits.
- Verify the video URL is public (not private/unlisted).
- Try a different proxy or contact Apify support.
Issue: "No transcript available for video {id}"
Cause: Video has no captions (manual or auto-generated) in the requested language.
Solution:
- Check the video on YouTube manually — does it have captions?
- If yes but in a different language, set language to that language code.
- If no captions exist, this video can't be transcribed (no workaround).
- Ensure includeAutoGenerated: true (default) so auto-generated captions are used as a fallback.
Issue: "Transcripts are disabled for this video"
Cause: The video's creator explicitly disabled transcripts.
Solution: None. Creator must enable transcripts in YouTube Studio. You cannot transcribe disabled videos.
Issue: "Request timeout" or "Connection reset"
Cause: Proxy or network latency. Rare but possible with very large batches or slow proxies.
Solution:
- Reduce maxItems and rerun (e.g., 500 instead of 5,000).
- Try again; transient network errors usually resolve on retry.
- Check Apify's proxy status page.
- Use custom proxy if available.
Issue: Language fallback gave me wrong language
Cause: Requested language not found; actor fell back to available language.
Explanation: If you request language: "fr" but the video only has English and Spanish captions, you'll get whichever is available first. Set includeAutoGenerated: false together with the exact language you need to fail cleanly instead of falling back.
Solution: Check the language field in the result. If it doesn't match your request, manually re-request with explicit language or skip that video.
Limitations
- Requires Proxy — YouTube blocks cloud IPs. All runs require a proxy (Apify Proxy or custom). The cost is absorbed in the PPE price.
- Manual Captions Only (Optional) — If you set includeAutoGenerated: false, videos without manual captions will fail. ~70% of YouTube videos rely on auto-generated captions.
- No Multilingual Output — Can't extract English and Spanish in one run; run twice (once per language). Results go to the same dataset; use filters to separate them.
- oEmbed Metadata Limitations — Title, channel, and thumbnail come from YouTube's oEmbed API, not direct video pages, and are occasionally missing or outdated. Disable with includeMetadata: false to speed up processing.
- Rate Limiting — YouTube and Apify Proxy both rate-limit. Very large batches (>10k) may hit limits. Recommended: split into 1k–2k batches if processing 100k+ videos.
- No Video Download — This actor extracts transcripts only, not video audio or metadata like resolution, frame rate, or duration. Use YouTube-DL actors for that.
- No Translation — Transcripts are in the video's original language and can't be translated on the fly. Use the Google Translate API as a downstream step if needed.
- Segment Duration Estimates — Segment duration is calculated from the next segment's start time, so the last segment's duration may be imprecise.
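The duration caveat in the last point can be reproduced from the segments themselves. A sketch, where the fallback duration for the final segment is an arbitrary assumption:

```python
def estimate_durations(segments, last_default=2.0):
    """Recompute each segment's duration as the gap to the next segment's
    start time. The final segment has no successor, so its duration falls
    back to a fixed default (the same imprecision noted above)."""
    out = []
    for i, seg in enumerate(segments):
        if i + 1 < len(segments):
            dur = segments[i + 1]["start"] - seg["start"]
        else:
            dur = last_default
        out.append({**seg, "duration": round(dur, 3)})
    return out
```

If the last segment's exact length matters (e.g., for subtitle files), clamp it against the video's total length from another source rather than trusting the estimate.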
Changelog
v1.2.0 (Latest)
- Added: Support for YouTube Shorts URLs
- Improved: Metadata extraction now handles edge cases (private videos, deleted channels)
- Fixed: SRT timestamp formatting for videos >1 hour
- Performance: Parallel processing now handles 10k videos in <2 minutes
v1.1.5
- Added: Markdown output format with inline timestamps
- Added: includeMetadata toggle to skip oEmbed API calls for faster processing
- Fixed: Language fallback now respects the includeAutoGenerated flag
- Fixed: Error handling for videos with no segments
v1.1.0
- Added: VTT subtitle format output
- Added: Automatic fallback to auto-generated captions
- Improved: Error messages now include video ID and language
- Changed: Default maxItems reduced to 100 (was unlimited)
v1.0.5
- Fixed: Proxy configuration parsing for custom proxies
- Fixed: Timestamp precision for segments <1 second
- Improved: Logging now shows segment count per video
v1.0.0 (Initial Release)
- Bulk YouTube transcript extraction
- JSON and SRT output formats
- Language selection with fallback
- Video metadata (title, channel, thumbnail)
- Apify Proxy integration
- PPE pricing ($0.01/transcript)
Support & Documentation
- Apify Docs: https://docs.apify.com/
- YouTube Transcript API: https://github.com/jdepoix/youtube-transcript-api
- Report Issues: Use the Apify console "Issues" tab or contact support
- Feature Requests: Comment on the actor's discussion page or send feedback
Monetization Notes
This actor is published on Apify Store with PPE (Pay-Per-Event) pricing at $0.01 per transcript.
- Billing Model: You (the developer) get 20% commission from all runs.
- Payout Threshold: Minimum $20 (PayPal) or $100 (wire transfer).
- Payout Frequency: 11th of each month.
- Pricing Flexibility: PPE pricing is optimal for high-volume, low-cost operations. Unlike rental models, this scales with usage.
- AI/MCP Compatible: PPE model is ideal for LLM agents and AI workflows that run continuously.
Questions? Issues? Feedback? Post on the Apify actor discussion page or contact the developer directly.
Happy transcripting!