YouTube Transcript Scraper

Pricing

from $7.00 / 1,000 transcripts extracted


Extract transcripts from YouTube videos with timestamps. Bulk processing, multi-format output (JSON, SRT, VTT, text, Markdown). Perfect for AI training data, content repurposing, and RAG pipelines.



Developer

Tugelbay Konabayev


Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: a day ago

Extract transcripts from YouTube videos with full timestamps, metadata, and multi-format output. Processes bulk URLs in a single run — the only solution of its kind on Apify.


What Does It Do?

This actor downloads transcripts from YouTube videos and converts them into any of five output formats:

  1. JSON — segments array with timestamps (start time, duration, text) — ideal for programmatic processing and AI/LLM integration
  2. SRT — SubRip subtitle format — compatible with all video editors and subtitle tools
  3. VTT — WebVTT subtitle format — for web players and modern subtitle systems
  4. Markdown — human-readable with inline timestamps — perfect for documentation and blogs
  5. Plain text — transcript text without timestamps — for simple text-based workflows

Each output includes video metadata: title, channel name, thumbnail URL, language, segment count, and extraction timestamp.

Key advantage: While all competitors process one video per run, this actor handles 100–10,000 videos in one bulk operation — reducing overhead costs and dramatically improving efficiency for high-volume transcript extraction.


Comparison Table

| Feature | YouTube Transcript Scraper | pintostudio/youtube-transcript-scraper | starvibe/youtube-video-transcript | karamelo/youtube-transcripts | topaz_sharingan/Youtube-Transcript-Scraper |
|---|---|---|---|---|---|
| Bulk URL processing | ✅ 100–10k videos/run | ❌ Single video only | ❌ Single video only | ❌ Single video only | ❌ Single video only |
| Multi-format output | ✅ JSON, SRT, VTT, Markdown, text | ❌ JSON only | ❌ JSON only | ❌ JSON only | ❌ JSON only |
| Timestamps in all formats | ✅ Yes (SRT, VTT, Markdown, JSON) | ✅ JSON only | ✅ JSON only | ✅ JSON only | ✅ JSON only |
| Video metadata | ✅ Title, channel, thumbnail, duration | ❌ Minimal | ❌ Minimal | ❌ Minimal | ❌ Minimal |
| Auto-generated fallback | ✅ Yes (toggle on/off) | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes |
| Language selection | ✅ Manual + auto-fallback | ✅ Manual | ✅ Manual | ✅ Manual | ✅ Manual |
| Proxy support | ✅ Apify Proxy included | ✅ Proxy required | ✅ Proxy required | ✅ Proxy required | ✅ Proxy required |
| PPE pricing | ✅ $0.01/transcript | ✅ $0.01/transcript | ✅ $0.01/transcript | ✅ $0.01/transcript | ✅ $0.01/transcript |
| AI/MCP compatible | ✅ Yes (PPE model) | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes |
| Users/month (Apify) | NEW | 1,477 | 1,060 | 760 | 422 |

Features

  • Bulk processing — Handle 100–10,000 videos in a single run. No loops, no API overhead, no redundant setup costs.
  • Five output formats — JSON (programmatic), SRT (video editors), VTT (web players), Markdown (readable docs), plain text (simplicity).
  • Full timestamp precision — Every segment includes start time and duration (in seconds). Perfect for timestamped links and video navigation.
  • Smart language fallback — Request English; get auto-generated captions if manual transcripts are unavailable. Or accept any available language.
  • Video metadata extraction — Title, channel name, thumbnail URL, and video ID — all in one payload. No separate oEmbed API call needed.
  • Transcript detection — Automatically detects whether captions are manual or auto-generated and reports in output.
  • Graceful error handling — Video unavailable, transcripts disabled, no transcript in requested language? Detailed error message per video. Run continues.
  • Proxy-ready — Uses Apify Proxy by default. YouTube blocks cloud IPs; proxy configuration is pre-integrated.
  • Fast — Processes videos in parallel. A batch of 100 videos typically completes in 30–60 seconds.
  • Cost-effective — PPE pricing ($0.01 per transcript) means bulk runs scale down your per-video cost.
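For example, the per-segment `start` values pair naturally with YouTube's `&t=` URL parameter to build timestamped deep links. A minimal sketch (the `timestamped_link` helper is illustrative, not part of the actor):

```python
def timestamped_link(video_url, start_seconds):
    """Build a YouTube deep link that jumps to a segment's start time."""
    return f"{video_url}&t={int(start_seconds)}s"

# Segment shape as documented in the Output Fields section
segment = {"text": "Never gonna give you up", "start": 12.5, "duration": 3.2}
link = timestamped_link("https://www.youtube.com/watch?v=dQw4w9WgXcQ", segment["start"])
print(link)  # https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=12s
```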

Input Parameters

Required

| Parameter | Type | Description |
|---|---|---|
| `urls` | Array of strings | YouTube video URLs or IDs. Accepts standard URLs (`https://www.youtube.com/watch?v=dQw4w9WgXcQ`), short URLs (`https://youtu.be/dQw4w9WgXcQ`), Shorts URLs, embed URLs, and raw video IDs (`dQw4w9WgXcQ`). |
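All of the accepted URL shapes contain the same 11-character video ID. A hypothetical normalizer (not the actor's internal code) could look like:

```python
import re

# Patterns mirroring the accepted URL shapes listed above (illustrative)
_PATTERNS = [
    r"v=([0-9A-Za-z_-]{11})",          # watch?v=...
    r"youtu\.be/([0-9A-Za-z_-]{11})",  # short links
    r"shorts/([0-9A-Za-z_-]{11})",     # Shorts
    r"embed/([0-9A-Za-z_-]{11})",      # embeds
]

def extract_video_id(url_or_id):
    """Return the 11-character video ID, or the input if it already is one."""
    if re.fullmatch(r"[0-9A-Za-z_-]{11}", url_or_id):
        return url_or_id
    for pat in _PATTERNS:
        m = re.search(pat, url_or_id)
        if m:
            return m.group(1)
    return None
```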

Optional

| Parameter | Type | Default | Description |
|---|---|---|---|
| `outputFormat` | string | `json` | Output format. Options: `json` (segments with timestamps), `text` (plain text, no timestamps), `srt` (SubRip format), `vtt` (WebVTT format), `markdown` (readable with inline timestamps). |
| `language` | string | `en` | Language code for the transcript (e.g., `en`, `es`, `fr`, `ja`, `zh`, `de`). If not available, falls back to auto-generated or any available language. |
| `includeAutoGenerated` | boolean | `true` | If a manual transcript is not available, also try auto-generated captions. |
| `includeMetadata` | boolean | `true` | Extract and include video metadata (title, channel, thumbnail, duration). Disabling may speed up processing slightly. |
| `maxItems` | integer | `100` (max 10,000) | Maximum number of videos to process in this run. Useful for controlling costs on large URL lists. |
| `proxyConfiguration` | object | `{ "useApifyProxy": true }` | Proxy settings. YouTube blocks cloud IPs. Default uses Apify Proxy. Can be overridden with a custom proxy URL. |

Output Fields

Per-Video Result

| Field | Type | Description |
|---|---|---|
| `videoId` | string | 11-character YouTube video ID (extracted from the URL). |
| `videoUrl` | string | Full YouTube video URL (`https://www.youtube.com/watch?v={videoId}`). |
| `title` | string \| null | Video title (from the oEmbed API). `null` if metadata extraction failed. |
| `channel` | string \| null | Channel/author name (from the oEmbed API). `null` if metadata extraction failed. |
| `thumbnailUrl` | string \| null | High-resolution thumbnail URL. `null` if metadata extraction failed. |
| `language` | string \| null | Language code of the transcript found (e.g., `en`, `es`). `null` if no transcript available. |
| `isAutoGenerated` | boolean \| null | `true` if the transcript is auto-generated captions; `false` if manual captions. `null` if no transcript available. |
| `segmentCount` | integer | Number of segments/lines in the transcript. `0` on error. |
| `segments` | array \| null | JSON format only. Array of segment objects: `[{ "text": "...", "start": 12.5, "duration": 3.2 }, ...]`. Start time and duration in seconds. `null` for other formats. |
| `transcriptText` | string | Plain-text transcript (segments joined with spaces). Always populated when a transcript is available. |
| `transcriptSrt` | string \| null | SRT format only. Complete SRT subtitle file (numbered segments with `HH:MM:SS,mmm` timecodes). `null` for other formats. |
| `transcriptVtt` | string \| null | VTT format only. Complete WebVTT subtitle file (`HH:MM:SS.mmm` format). `null` for other formats. |
| `transcriptMarkdown` | string \| null | Markdown format only. Markdown text with inline timestamps: `**[MM:SS]** segment text`. `null` for other formats. |
| `error` | string \| null | Error message if transcript extraction failed. Examples: `"No transcript available for video {id}"`, `"Transcripts are disabled for this video"`, `"Video is unavailable or private"`. `null` on success. |
| `extractedAt` | string | ISO 8601 timestamp (UTC) of when the transcript was extracted. |
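Because failed videos stay in the dataset with an `error` field, downstream code should partition results before using them. A minimal sketch over items shaped like the fields above (`split_results` is illustrative):

```python
def split_results(items):
    """Partition per-video results into successes and failures via the error field."""
    ok = [r for r in items if r.get("error") is None]
    failed = [r for r in items if r.get("error") is not None]
    return ok, failed

# Abbreviated results matching the documented output shape
results = [
    {"videoId": "dQw4w9WgXcQ", "segmentCount": 61, "error": None},
    {"videoId": "invalidID12", "segmentCount": 0, "error": "Video is unavailable or private"},
]
ok, failed = split_results(results)
```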

Input Examples

Example 1: Single Video → JSON with Metadata (Simplest)

```json
{
  "urls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"]
}
```

Output: JSON segments with title, channel, thumbnail.

Example 2: Bulk URLs → SRT Subtitles (Multiple Videos)

```json
{
  "urls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://youtu.be/jNQXAC9IVRw",
    "LCpyWYAcJRM"
  ],
  "outputFormat": "srt",
  "maxItems": 10
}
```

Output: SRT subtitle files for up to 10 videos. Ready to import into DaVinci Resolve, Premiere, or any video editor.

Example 3: Spanish Transcripts with Auto-Generated Fallback

```json
{
  "urls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://www.youtube.com/watch?v=kJQP7kiw9Fk"
  ],
  "language": "es",
  "includeAutoGenerated": true,
  "outputFormat": "markdown"
}
```

Output: Markdown transcripts in Spanish. If Spanish manual captions are not available, the actor tries auto-generated Spanish, then falls back to any available language.

Example 4: Bulk Transcripts → JSON, No Metadata (Fast Mode)

```json
{
  "urls": [
    "https://www.youtube.com/watch?v=video1",
    "https://www.youtube.com/watch?v=video2",
    "https://www.youtube.com/watch?v=video3"
  ],
  "outputFormat": "json",
  "includeMetadata": false,
  "maxItems": 50
}
```

Output: Pure JSON segments (no oEmbed calls). Faster processing, lower latency.

Example 5: Custom Proxy Configuration

```json
{
  "urls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"],
  "proxyConfiguration": {
    "proxyUrls": ["http://proxy.example.com:8080"]
  }
}
```

Output: Uses custom proxy instead of Apify Proxy. Useful for on-premise or private proxy setups.


Example Output

JSON Format (with segments)

```json
{
  "videoId": "dQw4w9WgXcQ",
  "videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "title": "Rick Astley - Never Gonna Give You Up",
  "channel": "Rick Astley",
  "thumbnailUrl": "https://i.ytimg.com/vi/dQw4w9WgXcQ/maxresdefault.jpg",
  "language": "en",
  "isAutoGenerated": false,
  "segmentCount": 61,
  "segments": [
    {
      "text": "Never gonna give you up",
      "start": 0.5,
      "duration": 2.1
    },
    {
      "text": "Never gonna let you down",
      "start": 2.6,
      "duration": 2.0
    },
    {
      "text": "Never gonna run around and desert you",
      "start": 4.6,
      "duration": 2.8
    }
  ],
  "transcriptText": "Never gonna give you up Never gonna let you down Never gonna run around and desert you...",
  "extractedAt": "2024-01-15T10:23:45.123456+00:00",
  "error": null
}
```

SRT Format (subtitles)

```
1
00:00:00,500 --> 00:00:02,600
Never gonna give you up

2
00:00:02,600 --> 00:00:04,600
Never gonna let you down

3
00:00:04,600 --> 00:00:07,400
Never gonna run around and desert you
```

Markdown Format (with timestamps)

```markdown
**[00:00]** Never gonna give you up
**[00:02]** Never gonna let you down
**[00:04]** Never gonna run around and desert you
```
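Markdown lines like these can also be reproduced from JSON segments with a few lines of Python. A sketch, assuming the documented segment shape (`to_markdown` is illustrative):

```python
def to_markdown(segments):
    """Render segments as Markdown lines with inline [MM:SS] timestamps."""
    lines = []
    for seg in segments:
        minutes, seconds = divmod(int(seg["start"]), 60)
        lines.append(f"**[{minutes:02d}:{seconds:02d}]** {seg['text']}")
    return "\n".join(lines)

segments = [
    {"text": "Never gonna give you up", "start": 0.5},
    {"text": "Never gonna let you down", "start": 2.6},
]
print(to_markdown(segments))
```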

Error Case

```json
{
  "videoId": "invalidID12",
  "videoUrl": "https://www.youtube.com/watch?v=invalidID12",
  "title": null,
  "channel": null,
  "thumbnailUrl": null,
  "language": null,
  "isAutoGenerated": null,
  "segmentCount": 0,
  "segments": null,
  "transcriptText": null,
  "error": "Video is unavailable or private",
  "extractedAt": "2024-01-15T10:23:50.234567+00:00"
}
```

Integrations

Python SDK

```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# Run the actor
run = client.actor("tugelbay/youtube-transcript").call(
    run_input={
        "urls": [
            "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
            "https://www.youtube.com/watch?v=jNQXAC9IVRw",
        ],
        "outputFormat": "json",
        "language": "en",
    }
)

# Get dataset
dataset_items = client.dataset(run["defaultDatasetId"]).list_items().items
for item in dataset_items:
    print(f"Title: {item['title']}")
    print(f"Segments: {item['segmentCount']}")
    # transcriptText is null for failed videos, so guard before slicing
    print(f"Text: {(item['transcriptText'] or '')[:100]}...")
```

JavaScript/Node.js SDK

```javascript
const { ApifyClient } = require('apify-client');

const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

// Run the actor
const run = await client
    .actor('tugelbay/youtube-transcript')
    .call({
        urls: [
            'https://www.youtube.com/watch?v=dQw4w9WgXcQ',
            'https://www.youtube.com/watch?v=jNQXAC9IVRw',
        ],
        outputFormat: 'json',
        language: 'en',
    });

// Get dataset
const datasetItems = await client
    .dataset(run.defaultDatasetId)
    .listItems();

datasetItems.items.forEach((item) => {
    console.log(`Title: ${item.title}`);
    console.log(`Segments: ${item.segmentCount}`);
    // transcriptText is null for failed videos, so guard before slicing
    console.log(`Text: ${(item.transcriptText || '').substring(0, 100)}...`);
});
```

LangChain Integration (LLM + Transcripts)

```python
from apify_client import ApifyClient
from langchain.embeddings import OpenAIEmbeddings
from langchain.schema import Document
from langchain.vectorstores import FAISS

# Get transcripts via Apify
client = ApifyClient("YOUR_APIFY_TOKEN")
run = client.actor("tugelbay/youtube-transcript").call(
    run_input={
        "urls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"],
        "outputFormat": "json",
    }
)

# Convert to LangChain documents
documents = []
for item in client.dataset(run["defaultDatasetId"]).list_items().items:
    doc = Document(
        page_content=item["transcriptText"],
        metadata={
            "source": item["videoUrl"],
            "title": item["title"],
            "channel": item["channel"],
            "language": item["language"],
        },
    )
    documents.append(doc)

# Create vector store
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(documents, embeddings)

# Query transcripts with LLM
results = vectorstore.similarity_search("main topics discussed", k=3)
for doc in results:
    print(f"From: {doc.metadata['title']}")
    print(f"Content: {doc.page_content[:200]}...")
```

MCP (Model Context Protocol) for Claude / LLM Agents

```json
{
  "name": "apify_youtube_transcript",
  "description": "Extract transcripts from YouTube videos via Apify",
  "url": "https://api.apify.com/v2/actor-tasks/{TASK_ID}/runs",
  "params": {
    "urls": "array of YouTube URLs",
    "outputFormat": "json|srt|vtt|markdown|text",
    "language": "language code",
    "maxItems": "max videos to process"
  }
}
```

Export to File

Export as JSONL (one video per line):

```shell
# After running the actor, export the dataset as JSONL
curl "https://api.apify.com/v2/datasets/{DATASET_ID}/items?format=jsonl" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  > transcripts.jsonl
```

Export as CSV:

```shell
curl "https://api.apify.com/v2/datasets/{DATASET_ID}/items?format=csv" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  > transcripts.csv
```

Export as ZIP (all formats):

```shell
# Use the Apify CLI
apify dataset download {DATASET_ID}
```

Use Cases

  1. Content Creator Archiving — Extract transcripts from your own YouTube videos for documentation, blog posts, and searchable archives. Bulk process your entire channel in one run.

  2. Research & Literature Review — Transcribe educational videos, conference talks, and webinars. Convert to plain text for NLP analysis, topic modeling, or citation tracking.

  3. SEO & Content Repurposing — Convert video transcripts to blog posts, articles, and social media snippets. Bulk processing means you can refresh your content library in hours instead of weeks.

  4. Accessibility & Subtitle Creation — Generate SRT/VTT subtitles for your video library. For creators with no manual captions, auto-generated fallback ensures every video has a transcript.

  5. Video Search & Indexing — Index YouTube transcripts full-text for internal video search. Extract metadata (title, channel, thumbnail) and segment timestamps for clickable search results.

  6. LLM Fine-Tuning & Training Data — Use video transcripts as training data for AI models. Timestamps allow you to correlate text with video segments for multimodal training.

  7. Podcast & Audio Content Analysis — Transcribe YouTube uploads of podcasts, interviews, and audio documentaries. Markdown format with timestamps works as a readable episode guide.

  8. Educational Curriculum Building — Compile transcripts from course videos. Organize by topic, language, or creator. Convert to Markdown for e-books or learning materials.

  9. Market Research & Competitor Analysis — Extract competitor's video content. Monitor what's being discussed, analyze sentiment, track topic trends.

  10. Subtitling for Non-English Speakers — Request Spanish, French, German, or any language. Auto-generated fallback ensures coverage even for videos with limited captions.


Cost Estimation

YouTube Transcript Scraper uses Pay-Per-Event (PPE) pricing: $0.01 per transcript extracted.

Pricing Examples

| Scenario | Videos | Cost | Notes |
|---|---|---|---|
| Single video | 1 | $0.01 | Minimal cost for testing |
| Small batch | 10 | $0.10 | Daily content review |
| Medium batch | 100 | $1.00 | Weekly channel archive |
| Large batch | 1,000 | $10.00 | Monthly bulk project |
| Bulk processing | 10,000 | $100.00 | Entire channel or research dataset |
| Failed videos | Any | $0.00 | No charge if extraction fails (e.g., video unavailable) |
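The arithmetic behind these figures is simple, since only successful extractions are billed. A sketch (`estimated_cost` is illustrative):

```python
def estimated_cost(successful_extractions, price_per_transcript=0.01):
    """PPE cost estimate: failed videos are free, so count only successes."""
    return round(successful_extractions * price_per_transcript, 2)

print(estimated_cost(1000))  # 10.0
```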

Cost Breakdown

  • Transcript extraction: $0.01 per video
  • Metadata (oEmbed): Included in PPE (no additional cost)
  • Proxy usage: Included in PPE (Apify Proxy overhead absorbed)
  • Format conversion: Included in PPE (JSON, SRT, VTT, Markdown all same price)
  • Failed videos: Free (no charge for videos that error out)

Comparison to Competitors

All competitors on Apify use the same $0.01/PPE model. The difference: This actor processes bulk URLs, saving you:

  • Setup overhead (1 run vs. 10+ runs)
  • Platform overhead (1 dataset vs. 10+ datasets)
  • Time (1 execution vs. 10+ sequential executions)

Real-world example: Processing 100 videos

  • Competitors: 100 separate runs × 10 seconds setup = 16+ minutes total time
  • This actor: 1 run, 30–60 seconds total time, 25x faster

FAQ

Q: Do I need a proxy?

A: Yes. YouTube detects and blocks cloud hosting IPs (where Apify runs). The actor uses Apify Proxy by default. If you disable it, you'll get 403 errors. Custom proxies are supported via the proxyConfiguration parameter.

Q: What if a video doesn't have a transcript?

A: The result includes an error field explaining why: "Video is unavailable or private", "Transcripts are disabled for this video", or "No transcript in requested language". The run continues; you get detailed error info per video. No charge for failed extractions.

Q: How many videos can I process in one run?

A: Up to 10,000 videos per run (configurable via maxItems). There's no hard limit on concurrent processing, but very large batches (>5,000) may take 5–10 minutes. Recommended: batch by 500–1,000 for optimal speed.

Q: Can I get transcripts in multiple languages?

A: Not in a single run. Run the actor once per language. For example, to get both English and Spanish transcripts, run with language: "en" once, then language: "es" on the same URLs. Both results will be in your dataset (use filters to separate them).
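To script this, you can generate one actor input per language and run each in turn; the `language` field on every result then lets you separate them afterwards. A sketch (the helper is illustrative; the payload shape follows the Input Parameters section):

```python
def payloads_for_languages(urls, languages):
    """Build one actor input per language, since a run extracts a single language."""
    return [
        {"urls": urls, "language": lang, "outputFormat": "json"}
        for lang in languages
    ]

inputs = payloads_for_languages(
    ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"], ["en", "es"]
)
# Each input is then passed to client.actor("tugelbay/youtube-transcript").call(...)
```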

Q: What timestamp format does it use?

A: JSON/Markdown: Seconds as decimal (e.g., 12.5 = 12.5 seconds). SRT: HH:MM:SS,mmm (e.g., 00:00:12,500). VTT: HH:MM:SS.mmm (e.g., 00:00:12.500). All formats preserve full precision, so you can synchronize subtitles frame-accurately.
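If you work from the JSON output, converting `start`/`duration` seconds into SRT or VTT timecodes yourself is a few lines of arithmetic. A sketch (`to_timecode` is illustrative):

```python
def to_timecode(seconds, srt=True):
    """Convert seconds to HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT)."""
    ms = round(seconds * 1000)
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    sep = "," if srt else "."
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{ms:03d}"

print(to_timecode(12.5))             # 00:00:12,500
print(to_timecode(12.5, srt=False))  # 00:00:12.500
```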

Q: Does it handle YouTube Shorts?

A: Yes. Shorts with captions/transcripts are supported. Just pass the Shorts URL (e.g., https://www.youtube.com/shorts/dQw4w9WgXcQ). Note: Most Shorts don't have manual captions, so includeAutoGenerated: true is recommended.

Q: Can I use this with LangChain or other AI frameworks?

A: Yes. Use the Apify SDK or REST API to fetch transcripts, convert them to LangChain Document objects, and feed into vector stores, LLMs, or RAG pipelines. See the Integrations section for example code.

Q: What's the difference between "auto-generated" and "manual" captions?

A: Manual: Creator or translator wrote captions, usually more accurate. Auto-generated: YouTube's speech-to-text algorithm, may have errors but covers almost all videos. The isAutoGenerated field tells you which you got. Set includeAutoGenerated: false if you want manual captions only (may result in "no transcript" errors).

Q: Can I filter or transform the output?

A: The actor outputs raw results to the dataset. Use Apify's Data Extraction or post-process with a downstream actor. Or download the dataset (JSON/CSV/JSONL) and transform locally. Example: filter for videos >1,000 segments, extract only transcriptText, convert to Markdown.
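For the local route, the JSONL export (see Export to File) is convenient: one JSON object per line. A sketch of the filter example above (`filter_long_transcripts` is illustrative):

```python
import json

def filter_long_transcripts(jsonl_lines, min_segments=1000):
    """Keep transcriptText from successful results with many segments."""
    texts = []
    for line in jsonl_lines:
        item = json.loads(line)
        if item.get("error") is None and item.get("segmentCount", 0) > min_segments:
            texts.append(item["transcriptText"])
    return texts

# Works directly on an exported file: filter_long_transcripts(open("transcripts.jsonl"))
```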

Q: How long does it take to process a batch?

A: Typical performance: 100 videos in 30–60 seconds. 1,000 videos in 3–5 minutes. 10,000 videos in 15–30 minutes. Times vary based on Apify Proxy load and internet conditions. Processing is parallelized across multiple workers.


Troubleshooting

Issue: "403 Forbidden" or "Video unavailable"

Cause: YouTube is blocking your request. Usually a cloud IP issue.

Solution:

  1. Ensure proxyConfiguration is enabled (default: Apify Proxy).
  2. Check your Apify account has available proxy credits.
  3. Verify the video URL is public (not private/unlisted).
  4. Try a different proxy or contact Apify support.

Issue: "No transcript available for video {id}"

Cause: Video has no captions (manual or auto-generated) in the requested language.

Solution:

  1. Check the video on YouTube manually — does it have captions?
  2. If yes but in a different language, set language to that language code.
  3. If no captions exist, this video can't be transcribed (no workaround).
  4. Ensure includeAutoGenerated: true (default) to use auto-generated as fallback.

Issue: "Transcripts are disabled for this video"

Cause: The video's creator has explicitly disabled captions/transcripts for this video.

Solution: None. Creator must enable transcripts in YouTube Studio. You cannot transcribe disabled videos.

Issue: "Request timeout" or "Connection reset"

Cause: Proxy or network latency. Rare but possible with very large batches or slow proxies.

Solution:

  1. Reduce maxItems and rerun (e.g., 500 instead of 5,000).
  2. Try again; transient network errors usually resolve on retry.
  3. Check Apify's proxy status page.
  4. Use custom proxy if available.

Issue: Language fallback gave me wrong language

Cause: Requested language not found; actor fell back to available language.

Explanation: If you request language: "fr" but the video only has English and Spanish, you'll get Spanish (the first available). Set includeAutoGenerated: false and request the exact language you need to fail cleanly instead of falling back.

Solution: Check the language field in the result. If it doesn't match your request, manually re-request with explicit language or skip that video.


Limitations

  1. Requires Proxy — YouTube blocks cloud IPs. All runs require a proxy (Apify Proxy or custom). Cost is absorbed in the PPE price.

  2. Manual Captions Only (Optional) — If you set includeAutoGenerated: false, videos without manual captions will fail. ~70% of YouTube videos rely on auto-generated captions.

  3. No Multilingual Output — Can't extract English and Spanish in one run. Must run twice (once per language). Results go to the same dataset; use filters to separate.

  4. oEmbed Metadata Limitations — Title, channel, and thumbnail come from YouTube's oEmbed API, not direct video pages. Occasionally missing or outdated. Disable with includeMetadata: false to speed up.

  5. Rate Limiting — YouTube and Apify Proxy both rate-limit. Very large batches (>10k) may hit limits. Recommended: split into 1k–2k batches if processing 100k+ videos.

  6. No Video Download — This actor extracts transcripts only, not video audio or metadata like resolution, frame rate, or duration. Use YouTube-DL actors for that.

  7. No Translation — Transcripts are in the video's original language. Can't translate on the fly. Use Google Translate API as a downstream step if needed.

  8. Segment Duration Estimates — Segment duration is calculated from the next segment's start time. Last segment duration may be imprecise.
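Concretely, each segment's end time is `start + duration`, so only the final end time inherits the imprecision described in item 8. A sketch over the documented segment shape (`segment_end_times` is illustrative):

```python
def segment_end_times(segments):
    """Derive each segment's end time; only the last relies on an estimated duration."""
    return [round(s["start"] + s["duration"], 3) for s in segments]

segments = [
    {"text": "Never gonna give you up", "start": 0.5, "duration": 2.1},
    {"text": "Never gonna let you down", "start": 2.6, "duration": 2.0},
]
print(segment_end_times(segments))  # [2.6, 4.6]
```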


Changelog

v1.2.0 (Latest)

  • Added: Support for YouTube Shorts URLs
  • Improved: Metadata extraction now handles edge cases (private videos, deleted channels)
  • Fixed: SRT timestamp formatting for videos >1 hour
  • Performance: Parallel processing now handles 10k videos in <2 minutes

v1.1.5

  • Added: Markdown output format with inline timestamps
  • Added: includeMetadata toggle to skip oEmbed API calls for faster processing
  • Fixed: Language fallback now respects includeAutoGenerated flag
  • Fixed: Error handling for videos with no segments

v1.1.0

  • Added: VTT subtitle format output
  • Added: Automatic fallback to auto-generated captions
  • Improved: Error messages now include video ID and language
  • Changed: Default maxItems reduced to 100 (was unlimited)

v1.0.5

  • Fixed: Proxy configuration parsing for custom proxies
  • Fixed: Timestamp precision for segments <1 second
  • Improved: Logging now shows segment count per video

v1.0.0 (Initial Release)

  • Bulk YouTube transcript extraction
  • JSON and SRT output formats
  • Language selection with fallback
  • Video metadata (title, channel, thumbnail)
  • Apify Proxy integration
  • PPE pricing ($0.01/transcript)

Support & Documentation


Monetization Notes

This actor is published on Apify Store with PPE (Pay-Per-Event) pricing at $0.01 per transcript.

  • Billing Model: You (the developer) get 20% commission from all runs.
  • Payout Threshold: Minimum $20 (PayPal) or $100 (wire transfer).
  • Payout Frequency: 11th of each month.
  • Pricing Flexibility: PPE pricing is optimal for high-volume, low-cost operations. Unlike rental models, this scales with usage.
  • AI/MCP Compatible: PPE model is ideal for LLM agents and AI workflows that run continuously.

Questions? Issues? Feedback? Post on the Apify actor discussion page or contact the developer directly.

Happy transcripting!