YouTube Transcript Extractor with Multilingual Translation

Pricing

from $0.00005 / actor start

YouTube Transcript Extractor & Scraper API — Extract transcripts, captions & subtitles from YouTube videos, Shorts & VODs without an API key. Supports auto-generated and manual captions in 100+ languages with translation, batch extraction & clean JSON for AI agents, RAG, SEO & automation.

Rating

4.8 (8)

Developer

Akash Kumar Naik

Maintained by Community

Actor stats

  • Bookmarked: 24
  • Total users: 815
  • Monthly active users: 194
  • Issues response: 3.7 hours
  • Last modified: a day ago

YouTube Transcript Extractor — Extract Captions & Subtitles from Any YouTube Video

YouTube Transcript Extractor is an Apify Actor that pulls full transcripts, subtitles, and captions from any YouTube video in seconds — no YouTube Data API key required, no OAuth, no quota limits. Just a YouTube URL in, clean structured JSON out.

🚀 Try it now: YouTube Transcript Extractor on Apify Store →

The Actor supports 100+ languages for extraction and 14 languages for translation, and it works with regular videos, Shorts, Premieres, live VODs, and embeds. Both auto-generated and manual captions are supported.


What Is YouTube Transcript Extractor?

YouTube Transcript Extractor is a cloud-based transcript API that converts YouTube video captions into structured plain-text JSON. It accepts every standard YouTube URL format, bypasses the official API entirely, and delivers results in 3–5 seconds per video.

Whether you're building an AI training data pipeline, ingesting transcripts into a RAG system, repurposing content, or conducting YouTube SEO research — this Actor gets you there without touching Google Cloud.

✅ What Can YouTube Transcript Extractor Do?

  • 📄 Extract full transcripts and captions from any YouTube video with subtitles
  • 🌍 Detect and retrieve captions in 100+ languages (BCP-47 language code support)
  • 🌐 Auto-translate transcripts to 14 languages — returns both original and translated text (premium add-on, see pricing below)
  • ⚡ Process videos in 3–5 seconds per transcript
  • 🤖 Access auto-generated captions — something the official YouTube Data API v3 cannot do
  • 📦 Return structured JSON output ready for databases, vector stores, and LLMs
  • 🔁 Handle batch processing of thousands of videos via the Apify API — no daily quota
  • 🔒 Use residential proxy rotation to avoid YouTube IP blocks automatically
  • 🔗 Accept every YouTube URL format: youtube.com/watch, youtu.be, shorts/, live/, embed/, and bare video IDs

Why Use YouTube Transcript Extractor Instead of the YouTube Data API?

| Capability | YouTube Transcript Extractor | YouTube Data API v3 |
| --- | --- | --- |
| API Key Required | ❌ No | ✅ Yes |
| Daily Quota | ♾️ Unlimited | ⚠️ 10,000 units/day |
| Auto-Generated Captions | ✅ Yes | ❌ No |
| Manual Captions | ✅ Yes | ✅ Limited |
| Setup Complexity | URL in → JSON out | OAuth + GCP project |
| Batch Processing | ✅ Unlimited | ⚠️ Quota-limited |
| Cost Model | Pay per transcript | Free quota + paid overflow |

Bottom line: The YouTube Data API v3 doesn't expose auto-generated captions, restricts daily usage to 10,000 quota units, and requires OAuth setup. YouTube Transcript Extractor skips all of that and gives you clean transcript data for any video — at scale.


What Data Does YouTube Transcript Extractor Return?

| Field | Type | Description |
| --- | --- | --- |
| video_id | string | Parsed YouTube video ID |
| video_url | string | Canonical YouTube watch URL |
| video_title | string | Full video title |
| published_at | string | Video publish date (when available) |
| thumbnail_max_hd_url | string | HD thumbnail URL (when available) |
| transcript | string | Complete plain-text transcript in the original video language |
| transcript_translated | string | Translated transcript (only when translate: true and target language differs from original) |
| translated_language | string | BCP-47 language code of the translated transcript |
| language | string | BCP-47 language code of the extracted captions |
| extraction_time | number | Time taken to extract, in seconds |
| proxy_type | string | Type of proxy used: apify, direct, or none |
| proxy_stats | object | Proxy rotation telemetry: attempts, session IDs, proxy URLs, errors |
| timestamp | string | ISO 8601 UTC timestamp of the run |
| error | string | Error message if extraction failed |

How to Extract a YouTube Transcript in 3 Steps

  1. Open the Actor — Go to YouTube Transcript Extractor on Apify Store and click Try for free
  2. Paste your YouTube URL — Enter any YouTube URL (video, Short, Premiere, embed) into the videoUrl field. Optionally set a language BCP-47 code (e.g., en, hi, es). Enable the translate toggle to automatically translate the transcript to your chosen language
  3. Run and download — Click Start and get a clean JSON transcript within seconds. Export to CSV, JSON, or push directly to your pipeline

💡 No coding required. Run directly in the Apify Console UI. Use the API or SDK to automate at scale.


Use Cases for YouTube Transcript Extractor

🤖 AI, ML & NLP Pipelines

  • LLM Training Data — build high-quality text corpora from YouTube content at scale
  • RAG Pipeline Ingestion — chunk and embed transcripts into Pinecone, Weaviate, or Chroma for semantic search
  • Sentiment & Topic Analysis — process large transcript volumes for NLP research
  • AI Content Generation — feed transcripts to LLMs for summarization, Q&A, and repurposing
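
For the RAG ingestion use case, the transcript string usually has to be split into overlapping chunks before embedding. The sketch below is one simple way to do that with word windows. The function name `chunk_transcript` and the default sizes are illustrative assumptions, not part of the Actor.

```python
from typing import List


def chunk_transcript(text: str, max_words: int = 200, overlap: int = 40) -> List[str]:
    """Split a transcript into overlapping word windows (hypothetical helper)."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")

    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the transcript
    return chunks
```

Each chunk can then be embedded and stored with the record's video_id as metadata. Word windows are a rough proxy for token counts, so the sizes may need tuning for a given embedding model.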

✍️ Content Creation & Marketing

  • Content Repurposing — convert YouTube videos into blog posts, newsletters, and social captions
  • Multilingual Content — auto-translate transcripts to reach global audiences (14 languages supported)
  • YouTube SEO Research — extract transcript text to surface keywords competitors rank for in video search
  • Video Summarization — auto-generate descriptions and show notes using LLMs

🏢 Business & Accessibility

  • Accessibility Compliance — produce ADA and WCAG 2.1 compliant transcripts
  • Internal Knowledge Base — convert training videos and webinars into searchable documentation
  • Competitive Intelligence — analyze industry YouTube channels at scale

👨‍💻 Developers & Automation

  • REST API Integration — simple POST request, JSON response, no SDK required
  • Workflow Automation — connect with n8n, Make (Integromat), Zapier, or Activepieces
  • Scheduled Channel Monitoring — use Apify's built-in scheduler to track new uploads automatically

How Much Does It Cost to Extract YouTube Transcripts?

YouTube Transcript Extractor uses pay-per-event pricing — you only pay for successful extractions.

| Event | Price | When Charged |
| --- | --- | --- |
| transcript-extracted | $0.01 | Successfully extracted a YouTube video transcript |
| proxy-usage | $0.0005 | Residential proxy used to bypass YouTube IP blocks |
| translation | $0.002 | Transcript translated to the requested language (only when translate is enabled and translation succeeds) |

Typical cost per video: ~$0.0105 (or ~$0.0125 with translation enabled)

| Volume | Base Cost (no translation) | With Translation |
| --- | --- | --- |
| 10 transcripts | ~$0.11 | ~$0.13 |
| 100 transcripts | ~$1.05 | ~$1.25 |
| 1,000 transcripts | ~$10.50 | ~$12.50 |
| 10,000 transcripts | ~$105.00 | ~$125.00 |
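
The volume figures follow from adding up the per-event prices. The sketch below reproduces that arithmetic. It assumes every video succeeds and triggers exactly one proxy-usage event, and it ignores the per-start fee. The function name `estimate_cost` is illustrative.

```python
from decimal import Decimal

# Per-event prices in USD, copied from the pricing table in this section.
PRICE_TRANSCRIPT = Decimal("0.01")
PRICE_PROXY = Decimal("0.0005")
PRICE_TRANSLATION = Decimal("0.002")


def estimate_cost(videos: int, translate: bool = False, proxy: bool = True) -> Decimal:
    """Estimate the total cost in USD for a batch of videos (hypothetical helper)."""
    per_video = PRICE_TRANSCRIPT
    if proxy:
        per_video += PRICE_PROXY  # assumes one proxy-usage event per video
    if translate:
        per_video += PRICE_TRANSLATION  # charged only when translation succeeds
    return per_video * videos


print(estimate_cost(1000))                  # base cost for 1,000 videos
print(estimate_cost(1000, translate=True))  # with translation enabled
```

Decimal is used instead of float so that small per-event prices add up without rounding drift.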

🎁 New Apify accounts receive free credits — test YouTube Transcript Extractor before you commit.


Input Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| videoUrl | string | ✅ Yes | https://youtu.be/dQw4w9WgXcQ | YouTube URL or bare 11-character video ID |
| language | string | No | en | BCP-47 language code. Auto-detects if omitted |
| translate | boolean | No | false | Enable to translate transcript to the chosen language when it differs from the original |

Supported YouTube URL Formats

https://www.youtube.com/watch?v=VIDEO_ID
https://youtu.be/VIDEO_ID
https://youtube.com/shorts/VIDEO_ID
https://youtube.com/live/VIDEO_ID
https://youtube.com/embed/VIDEO_ID
VIDEO_ID (bare 11-character ID)
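
If you want to validate or deduplicate inputs before submitting them, the formats above can be reduced to a video ID on the client side. The following is a minimal sketch, not the Actor's own parser. The name `parse_video_id` is hypothetical, and the character set for IDs (letters, digits, "-" and "_") is an assumption.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters from letters, digits, "-" and "_".
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")
# Path prefixes that carry the ID as the next path segment.
_PATH_PREFIXES = ("shorts", "live", "embed")


def parse_video_id(value: str) -> Optional[str]:
    """Return the video ID for a supported URL format, or None if unrecognized."""
    value = value.strip()
    if _VIDEO_ID.fullmatch(value):
        return value  # bare ID

    parsed = urlparse(value if "://" in value else "https://" + value)
    host = (parsed.hostname or "").lower()
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]
    segments = [s for s in parsed.path.split("/") if s]

    candidate = None
    if host == "youtu.be" and segments:
        candidate = segments[0]
    elif host == "youtube.com" and segments:
        if segments[0] == "watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif segments[0] in _PATH_PREFIXES and len(segments) > 1:
            candidate = segments[1]

    if candidate and _VIDEO_ID.fullmatch(candidate):
        return candidate
    return None
```

Returning None for unknown inputs lets the caller skip bad rows in a batch instead of paying for a run that is likely to fail.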

Example Input

{
  "videoUrl": "https://youtu.be/dQw4w9WgXcQ",
  "language": "en",
  "translate": false
}

Output Schema

The Actor returns a structured JSON object for each video. The output always includes the original transcript. When translate is enabled and the requested language differs from the video's original language, it also includes the transcript_translated and translated_language fields.

Default Output (Auto-Detected Language)

{
  "success": true,
  "video_id": "dQw4w9WgXcQ",
  "video_url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "video_title": "Example Video Title",
  "published_at": "Jan 15, 2026",
  "thumbnail_max_hd_url": "https://i.ytimg.com/vi/dQw4w9WgXcQ/maxresdefault.jpg",
  "transcript": "Full transcript text extracted from the video...",
  "language": "en",
  "extraction_time": 3.45,
  "proxy_type": "apify",
  "proxy_stats": {
    "attempts": 1,
    "proxy_urls": ["http://proxy.apify.com:8000"],
    "session_ids": ["txdQw4w9_a1b2c3d4_a1"],
    "errors": []
  },
  "timestamp": "2026-04-26T12:00:00.000000+00:00"
}

Output with Translation (e.g., language: "es", translate: true)

When translate is enabled and the requested language differs from the video's original language, the Actor attempts automatic translation and adds the transcript_translated and translated_language fields:

{
  "success": true,
  "video_id": "qMquIcJWZag",
  "video_url": "https://youtu.be/qMquIcJWZag",
  "video_title": "There's an Actor for That │ 30,000+ Tools on Apify Store",
  "published_at": "Jan 08, 2026",
  "thumbnail_max_hd_url": "https://i.ytimg.com/vi/qMquIcJWZag/maxresdefault.jpg",
  "transcript": "What if you never had to build another web scraper again? ...",
  "transcript_translated": "¿Qué pasaría si nunca más tuvieras que crear otro web scraper? ...",
  "translated_language": "es",
  "language": "en",
  "extraction_time": 8.68,
  "proxy_type": "apify",
  "proxy_stats": {
    "attempts": 1,
    "proxy_urls": ["http://proxy.apify.com:8000"],
    "session_ids": ["txqMquIc_a1b2c3d4_a1"],
    "errors": []
  },
  "timestamp": "2026-05-18T02:03:49.662525+00:00"
}

Translation behavior:

  • If the requested language matches the video's original language → only transcript is returned
  • If the requested language differs from the original → transcript_translated and translated_language fields are added
  • Translation uses Mistral AI (mistral-small-latest)
  • Supported target languages: en, es, fr, de, it, pt, ru, ja, ko, zh, ar, hi, nl, pl
  • Translation is a premium add-on charged only when the translate toggle is enabled and translation succeeds
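
Because the translated fields are only present in some records, downstream code should not assume they exist. The sketch below shows one way to pick the right text from a single output record. The helper name `select_transcript` is hypothetical, and the field names are taken from the output schema documented here.

```python
from typing import Any, Dict, Tuple


def select_transcript(record: Dict[str, Any], prefer_translation: bool = True) -> Tuple[str, str]:
    """Return (text, language_code) from one output record.

    Raises ValueError when the record reports an error or has no transcript.
    """
    if record.get("error"):
        raise ValueError(f"extraction failed: {record['error']}")

    translated = record.get("transcript_translated")
    if prefer_translation and translated:
        # translated_language accompanies transcript_translated in the schema.
        return translated, record.get("translated_language", "")

    text = record.get("transcript")
    if not text:
        raise ValueError("record contains no transcript")
    return text, record.get("language", "")
```

Falling back to the original transcript keeps a pipeline running when the requested language already matches the video or when translation did not succeed.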

API Integration

cURL

# In API URLs the Actor ID is written with "~" instead of "/".
curl -X POST "https://api.apify.com/v2/acts/akash9078~youtube-transcript-extractor/runs" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -d '{"videoUrl": "https://youtu.be/dQw4w9WgXcQ", "language": "es", "translate": true}'

Python

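import requests

# In API URLs the Actor ID is written with "~" instead of "/".
response = requests.post(
    'https://api.apify.com/v2/acts/akash9078~youtube-transcript-extractor/runs',
    headers={'Authorization': 'Bearer YOUR_API_TOKEN'},
    json={
        'videoUrl': 'https://youtu.be/dQw4w9WgXcQ',
        'language': 'es',
        'translate': True,
    },
    timeout=30,
)
response.raise_for_status()  # fail loudly on HTTP errors
# This endpoint starts the run and returns run metadata, not the transcript itself.
print(response.json())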

Node.js

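const { ApifyClient } = require('apify-client');
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });
// CommonJS has no top-level await, so run inside an async function.
(async () => {
  // call() starts the Actor and waits for the run to finish.
  const run = await client.actor('akash9078/youtube-transcript-extractor').call({
    videoUrl: 'https://youtu.be/dQw4w9WgXcQ',
    language: 'es',
    translate: true,
  });
  // Results are stored in the run's default dataset.
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  console.log(items);
})();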
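
Synchronous variant (Python)

Starting a run returns run metadata, so reading the transcript takes a second step. As an alternative, the Apify API has a run-sync-get-dataset-items endpoint that waits for the run and returns the dataset items in a single response. The sketch below uses only the standard library. The helper names `build_request` and `fetch_transcripts` are hypothetical, and the choice of this endpoint is a suggestion rather than something this page documents.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional

API_BASE = "https://api.apify.com/v2"
# In API URLs the Actor ID is written with "~" instead of "/".
ACTOR_ID = "akash9078~youtube-transcript-extractor"


def build_request(video_url: str, token: str, language: Optional[str] = None,
                  translate: bool = False) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    payload: Dict[str, Any] = {"videoUrl": video_url, "translate": translate}
    if language:
        payload["language"] = language
    return urllib.request.Request(
        f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def fetch_transcripts(request: urllib.request.Request, timeout: float = 300.0) -> List[Dict[str, Any]]:
    """Send the request and decode the JSON array of output records."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call would look like `fetch_transcripts(build_request("https://youtu.be/dQw4w9WgXcQ", "YOUR_API_TOKEN", language="es", translate=True))`. Keeping request construction separate from sending makes the payload easy to inspect or unit test without network access.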

Integrations

YouTube Transcript Extractor works seamlessly with the full Apify platform ecosystem:

AI / ML Platforms: OpenAI, Anthropic Claude, LangChain, LlamaIndex, Pinecone, Chroma, Weaviate

Automation Tools: n8n, Make (Integromat), Zapier, Activepieces

Data Platforms: Google Sheets, Airtable, Notion, BigQuery

Apify Platform Features:

  • 📅 Scheduling — run extractions on a cron schedule automatically
  • 📊 Monitoring — track run history, success rates, and errors in the Apify Console
  • 🔗 Webhooks — trigger downstream workflows when transcripts are ready
  • 🗂️ Dataset storage — results saved to Apify's cloud storage, downloadable as JSON or CSV
  • 🔗 Actor chaining — combine with YouTube Channel Video Scraper to build full pipeline workflows

Limitations

  • Videos must have captions enabled (manual or auto-generated). Videos with no captions at all cannot be extracted.
  • Live streams currently in progress are not supported — completed VODs work correctly
  • Private and age-restricted videos are not supported
  • Unlisted videos are accessible if you have the direct URL

FAQ

Does YouTube Transcript Extractor support auto-generated captions?

Yes. Both manual (human-written) and YouTube's auto-generated captions are fully supported. Auto-generated captions are not accessible via the official YouTube Data API v3 — this Actor fills that gap.

Which languages are supported for transcript extraction?

All languages present in a video's caption tracks — 100+ languages. Specify any BCP-47 language code (e.g., hi for Hindi, es for Spanish, ja for Japanese, zh-Hans for Simplified Chinese) or omit the language parameter to auto-detect. When translate is enabled and the requested language differs from the video's original, the Actor returns both the original and translated transcripts. Note that translation is limited to 14 target languages: English, Spanish, French, German, Italian, Portuguese, Russian, Japanese, Korean, Chinese, Arabic, Hindi, Dutch, and Polish.

Can I extract YouTube transcripts in bulk?

Yes. Use the Apify API or SDK to submit multiple video URLs in sequence or in parallel. There are no daily quota limits — extract thousands of transcripts per day.
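
As a rough illustration of the parallel option, the sketch below fans a list of URLs out over a thread pool and keeps the results in input order. `extract_many` is a hypothetical helper, and `fetch` stands for any function you supply that takes one URL and returns one output record (for example, a wrapper around an API call). The worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, Iterable, List


def extract_many(urls: Iterable[str],
                 fetch: Callable[[str], Dict[str, Any]],
                 max_workers: int = 4) -> List[Dict[str, Any]]:
    """Run fetch(url) for every URL in parallel and return results in input order."""

    def safe_fetch(url: str) -> Dict[str, Any]:
        try:
            return fetch(url)
        except Exception as exc:  # keep one bad video from aborting the batch
            return {"video_url": url, "error": str(exc)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(safe_fetch, urls))
```

Failures are returned as records with an error field, mirroring the Actor's own output shape, so a single loop can handle both successes and failures.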

How fast is YouTube transcript extraction?

3–5 seconds per video on average, depending on transcript length and server load. Bulk runs benefit from Apify's parallel compute scaling.

Does YouTube Transcript Extractor work with YouTube Shorts?

Yes. All standard Shorts URLs (youtube.com/shorts/VIDEO_ID) are fully supported.

Is this a free YouTube transcript extractor?

The Actor uses pay-per-event pricing at ~$0.0105 per video (~$0.0125 with translation enabled). New Apify accounts get free platform credits, so you can extract transcripts for free to start.

Can I use this as a YouTube Transcript API without an API key?

Yes. YouTube Transcript Extractor requires only an Apify API token — no YouTube Data API key, no Google Cloud project, no OAuth flow.

How does this compare to other YouTube caption extractors?

YouTube Transcript Extractor uses Apify's residential proxy network with automatic session-based IP rotation, which makes it far more reliable than open-source scripts that break when YouTube changes its response format. It also runs in the cloud — no local setup needed.


Support


Built and maintained by akash9078 on the Apify platform.