YouTube Transcript Scraper

Pricing

from $0.00005 / actor start

YouTube Transcript Scraper & Extractor API — Extract transcripts, captions & subtitles from YouTube videos, Shorts & VODs without an API key. Supports auto-generated and manual captions in 100+ languages with translation, batch extraction & clean JSON for AI agents, RAG, SEO & automation.

Rating

4.8

(8)

Developer

Akash Kumar Naik

Maintained by Community

Actor stats

27

Bookmarked

869

Total users

190

Monthly active users

3.7 hours

Issues response

7 hours ago

Last modified

YouTube Transcript Scraper — Extract Captions & Subtitles from Any YouTube Video

YouTube Transcript Scraper is an Apify Actor that extracts transcripts, captions, and subtitles from YouTube videos in seconds — no YouTube Data API key, OAuth setup, or quota management required.

Simply provide a YouTube URL and receive clean, structured JSON output ready for AI applications, RAG systems, analytics, SEO research, content repurposing, and workflow automation.

🚀 Try it now on the Apify Store.

Key Features

  • 📄 Extract full transcripts and captions from YouTube videos with available subtitles
  • 🤖 Supports both auto-generated and manually created captions
  • 🌍 Extract transcripts in 100+ languages
  • 🌐 Translate transcripts into 14 supported languages
  • ⚡ Typical extraction time: 3–5 seconds
  • 📦 Structured JSON output ready for APIs, databases, and vector stores
  • 🔁 Batch processing support through Apify API
  • 🔒 Automatic proxy rotation for reliable extraction
  • 🚫 No YouTube Data API key required
  • 🚫 No OAuth configuration required
  • ♾️ No YouTube Data API quota limits

Supported Content Types

  • Standard YouTube videos
  • YouTube Shorts
  • Premieres
  • Completed live streams (VODs)
  • Embedded videos
  • Public videos
  • Unlisted videos (when URL is available)

Pricing

You only pay for successful extractions.

| Event | Price |
| --- | --- |
| Transcript Extraction | $0.01 |

Estimated Cost

| Volume | Cost |
| --- | --- |
| 10 Videos | ~$0.10 |
| 100 Videos | ~$1.00 |
| 1,000 Videos | ~$10.00 |
| 10,000 Videos | ~$100.00 |
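
The estimates above scale linearly with the number of successful extractions, which can be sketched as follows. The $0.01 rate is taken from the pricing table; `estimate_cost` is a hypothetical helper name, and the result deliberately ignores the per-start fee and any other platform usage charges.

```python
from decimal import Decimal

# Per-extraction price taken from the pricing table above.
PRICE_PER_EXTRACTION = Decimal("0.01")


def estimate_cost(successful_extractions: int) -> Decimal:
    """Return the approximate extraction cost in USD (hypothetical helper)."""
    if successful_extractions < 0:
        raise ValueError("successful_extractions must be non-negative")
    return PRICE_PER_EXTRACTION * successful_extractions
```

`Decimal` is used instead of `float` so that small per-event prices add up without rounding drift.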

What Is YouTube Transcript Scraper?

YouTube Transcript Scraper is a cloud-based transcript extraction API that converts YouTube captions into structured JSON.

The Actor accepts all major YouTube URL formats and retrieves transcript data without using the official YouTube Data API.

Whether you're:

  • Building AI datasets
  • Creating RAG pipelines
  • Running SEO analysis
  • Generating summaries
  • Monitoring competitors
  • Repurposing video content

this Actor provides a simple URL-in, JSON-out workflow.


Why Use This Instead of the YouTube Data API?

| Capability | YouTube Transcript Scraper | YouTube Data API v3 |
| --- | --- | --- |
| API Key Required | ❌ No | ✅ Yes |
| OAuth Required | ❌ No | ✅ Yes |
| Daily Quota Limits | ❌ No | ✅ Yes |
| Auto-Generated Captions | ✅ Yes | ❌ No |
| Manual Captions | ✅ Yes | ⚠️ Limited |
| Batch Processing | ✅ Yes | ⚠️ Quota Limited |
| Setup Complexity | Low | Medium |
| Proxy Handling | ✅ Built-in | ❌ No |

The YouTube Data API does not expose auto-generated captions and requires API key management, OAuth setup, and quota monitoring.

YouTube Transcript Scraper eliminates those limitations and provides a straightforward transcript extraction workflow.


Use Cases

AI & Machine Learning

LLM Training

Create high-quality text datasets from YouTube content.

RAG Systems

Extract transcripts and store them in:

  • Pinecone
  • Weaviate
  • Chroma
  • Qdrant
  • Milvus

Index video content for retrieval and question-answering systems.
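
Before a transcript is embedded and written to a vector store, it is usually split into overlapping chunks. The sketch below shows one simple way to do that by word count; `chunk_transcript` and its default sizes are illustrative assumptions, not part of the Actor, and a real pipeline may prefer token-based or sentence-aware splitting.

```python
def chunk_transcript(text: str, chunk_words: int = 200, overlap: int = 40) -> list[str]:
    """Split a transcript into overlapping word-based chunks (hypothetical helper)."""
    if chunk_words <= 0 or not 0 <= overlap < chunk_words:
        raise ValueError("need chunk_words > 0 and 0 <= overlap < chunk_words")

    words = text.split()
    step = chunk_words - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        # Stop once the current chunk has reached the end of the transcript.
        if start + chunk_words >= len(words):
            break
    return chunks
```

Each chunk can then be embedded and stored together with `video_id` and `video_url` as metadata, so retrieved passages can be traced back to their source video.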

Content Summarization

Generate summaries using:

  • OpenAI
  • Claude
  • Gemini
  • Mistral

Content Marketing

Blog Generation

Convert YouTube videos into blog posts.

Newsletter Creation

Turn videos into newsletter content.

Social Media Repurposing

Extract clips, quotes, and content ideas.

Podcast Show Notes

Generate structured show notes automatically.


SEO & Research

Competitor Analysis

Analyze competitor video content.

Keyword Discovery

Extract keywords from transcript content.
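
As a rough illustration of keyword discovery, the sketch below counts the most frequent words in a transcript. The stopword list is a small English-only assumption and `top_keywords` is a hypothetical helper; serious SEO work would likely use a proper NLP library.

```python
import re
from collections import Counter

# Tiny English stopword list, for illustration only.
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
    "that", "this", "for", "on", "with", "are", "was", "you",
}


def top_keywords(transcript: str, limit: int = 10) -> list[tuple[str, int]]:
    """Return the most frequent non-stopword terms with their counts."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return counts.most_common(limit)
```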

Content Gap Analysis

Identify frequently discussed topics.


Accessibility

Transcript Generation

Provide accessible text versions of video content.

Internal Documentation

Convert webinars and training videos into searchable documentation.


Automation

Integrate with:

  • n8n
  • Make
  • Zapier
  • Activepieces
  • Custom APIs

Supported Languages

The Actor can extract any caption language available on the target video.

Examples include:

  • English (en)
  • Hindi (hi)
  • Spanish (es)
  • French (fr)
  • German (de)
  • Portuguese (pt)
  • Japanese (ja)
  • Korean (ko)
  • Chinese (zh)
  • Arabic (ar)

and many more.

Language Selection

If language is omitted, the Actor automatically returns the video's default caption track.

To request a specific language, provide a BCP-47 language code.

Example:

{
  "videoUrl": "https://youtu.be/dQw4w9WgXcQ",
  "language": "hi"
}

Translation Support

The Actor supports transcript translation into:

  • English
  • Spanish
  • French
  • German
  • Italian
  • Portuguese
  • Russian
  • Japanese
  • Korean
  • Chinese
  • Arabic
  • Hindi
  • Dutch
  • Polish

Translation uses Mistral AI.

When translation is enabled:

  • Original transcript is always returned
  • Translation is attempted only when source and target languages differ
  • transcript_translated is returned upon success

If translation fails, the original transcript is still returned.
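
Because the original transcript is always present and `transcript_translated` only appears on success, downstream code can fall back gracefully. A minimal sketch, assuming result items shaped like the examples in this README (`preferred_text` is a hypothetical helper):

```python
def preferred_text(item: dict) -> tuple[str, str]:
    """Return (text, language), preferring the translation when it exists."""
    translated = item.get("transcript_translated")
    if translated:
        return translated, item.get("translated_language", "")
    # Translation was not requested, not needed, or failed: use the original.
    return item.get("transcript", ""), item.get("language", "")
```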


Input Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| videoUrl | string | Yes | - | YouTube URL or video ID |
| language | string | No | Auto-detect | Caption language |
| translate | boolean | No | false | Enable translation |

Supported URL Formats

https://www.youtube.com/watch?v=VIDEO_ID
https://youtu.be/VIDEO_ID
https://youtube.com/shorts/VIDEO_ID
https://youtube.com/live/VIDEO_ID
https://youtube.com/embed/VIDEO_ID
VIDEO_ID
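
If you want to validate or de-duplicate inputs before starting runs, the formats above can be normalized client-side. This is a sketch of one possible parser, not the Actor's own logic; it assumes video IDs are 11 characters drawn from letters, digits, `_` and `-`, and `extract_video_id` is a hypothetical name.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumption: YouTube video IDs are 11 URL-safe characters.
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(value: str):
    """Return the video ID for the URL formats listed above, or None."""
    value = value.strip()
    if _VIDEO_ID.fullmatch(value):
        return value  # bare video ID

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    parts = [p for p in parsed.path.split("/") if p]

    candidate = None
    if host == "youtu.be" and parts:
        candidate = parts[0]
    elif host == "youtube.com":
        if parts and parts[0] == "watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif len(parts) >= 2 and parts[0] in ("shorts", "live", "embed"):
            candidate = parts[1]

    if candidate and _VIDEO_ID.fullmatch(candidate):
        return candidate
    return None
```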

Example Input

{
  "videoUrl": "https://youtu.be/dQw4w9WgXcQ",
  "language": "en",
  "translate": false
}
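
When inputs are generated programmatically, a small builder keeps them consistent with the parameter table. A minimal sketch: `build_input` is a hypothetical helper, and the only rules it applies (required `videoUrl`, optional `language`, boolean `translate`) come from that table.

```python
def build_input(video_url: str, language=None, translate: bool = False) -> dict:
    """Build an Actor input dict from the documented parameters."""
    if not isinstance(video_url, str) or not video_url.strip():
        raise ValueError("videoUrl is required")

    payload = {"videoUrl": video_url.strip(), "translate": bool(translate)}
    if language:
        # Omit the key entirely so the Actor falls back to its default track.
        payload["language"] = language
    return payload
```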

Output Schema

| Field | Type | Description |
| --- | --- | --- |
| success | boolean | Extraction status |
| video_id | string | YouTube video ID |
| video_url | string | Canonical URL |
| video_title | string | Video title |
| published_at | string | Publish date |
| thumbnail_max_hd_url | string | Thumbnail URL |
| transcript | string | Extracted transcript |
| transcript_translated | string | Translated transcript |
| translated_language | string | Translation language |
| language | string | Caption language |
| extraction_time | number | Processing time (seconds) |
| proxy_type | string | Proxy type used |
| proxy_stats | object | Proxy statistics |
| timestamp | string | Extraction timestamp |
| error | string | Error message |

Example Output

{
  "success": true,
  "video_id": "dQw4w9WgXcQ",
  "video_url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "video_title": "Example Video",
  "published_at": "Jan 15, 2026",
  "thumbnail_max_hd_url": "https://i.ytimg.com/vi/dQw4w9WgXcQ/maxresdefault.jpg",
  "transcript": "Full transcript text...",
  "language": "en",
  "extraction_time": 3.42,
  "proxy_type": "apify",
  "timestamp": "2026-05-18T12:00:00.000Z"
}
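
The two date fields arrive as strings in different formats, so you may want to normalize them before storing results. The sketch below assumes the formats seen in this single example (`"Jan 15, 2026"` and an ISO 8601 UTC timestamp with milliseconds); real output may vary, so treat both parsers as hypothetical and add error handling as needed.

```python
from datetime import date, datetime, timezone


def parse_published_at(value: str) -> date:
    """Parse a date like 'Jan 15, 2026' (format assumed from the example)."""
    return datetime.strptime(value, "%b %d, %Y").date()


def parse_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp like '2026-05-18T12:00:00.000Z'."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)
```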

Example Output With Translation

{
  "success": true,
  "video_id": "qMquIcJWZag",
  "video_title": "Example Video",
  "language": "en",
  "translated_language": "es",
  "transcript": "Original transcript...",
  "transcript_translated": "Transcripción traducida..."
}

Error Responses

No Captions Available

{
  "success": false,
  "video_id": "xxxxxxxxxxx",
  "error": "No captions available for this video"
}

Private Video

{
  "success": false,
  "video_id": "xxxxxxxxxxx",
  "error": "Video is private or unavailable"
}

Extraction Failure

{
  "success": false,
  "video_id": "xxxxxxxxxxx",
  "error": "Unable to extract transcript"
}
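
Since failures are reported as regular items with `"success": false`, callers should check that flag rather than rely on exceptions. One way to separate the two cases, assuming items shaped like the examples above (`split_results` is a hypothetical helper):

```python
def split_results(items: list) -> tuple:
    """Separate successful items from (video_id, error) pairs."""
    succeeded, failed = [], []
    for item in items:
        if item.get("success") and item.get("transcript"):
            succeeded.append(item)
        else:
            failed.append((item.get("video_id", ""), item.get("error", "unknown error")))
    return succeeded, failed
```

The failed list can then be logged or retried, for example with a different `language` when no captions were found.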

API Usage

cURL

curl -X POST \
  "https://api.apify.com/v2/acts/akash9078~youtube-transcript-scraper/runs" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "videoUrl": "https://youtu.be/dQw4w9WgXcQ",
    "language": "es",
    "translate": true
  }'

Python

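import requests

# Starts a run and returns immediately. The response describes the run;
# the transcript itself is written to the run's default dataset.
response = requests.post(
    # The REST API path uses "username~actor-name" (tilde, not slash).
    "https://api.apify.com/v2/acts/akash9078~youtube-transcript-scraper/runs",
    headers={"Authorization": "Bearer YOUR_API_TOKEN"},
    json={
        "videoUrl": "https://youtu.be/dQw4w9WgXcQ",
        "language": "es",
        "translate": True,
    },
    timeout=30,
)
response.raise_for_status()
print(response.json())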
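
For short jobs it can be simpler to start the run and receive the dataset items in a single request. The sketch below uses the Apify API's `run-sync-get-dataset-items` endpoint with only the standard library; `sync_run_url` and `fetch_transcript` are hypothetical helper names, and the 300-second timeout is an assumption chosen to give the run time to finish.

```python
import json
import urllib.request
from urllib.parse import quote

API_BASE = "https://api.apify.com/v2"


def sync_run_url(actor: str) -> str:
    """Build the synchronous-run URL; the API path uses 'username~actor-name'."""
    actor_id = quote(actor.replace("/", "~"), safe="~")
    return f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"


def fetch_transcript(token: str, actor_input: dict,
                     actor: str = "akash9078/youtube-transcript-scraper",
                     timeout: float = 300) -> list:
    """Run the Actor and return its dataset items (performs a network call)."""
    request = urllib.request.Request(
        sync_run_url(actor),
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_transcript(token, {"videoUrl": "https://youtu.be/dQw4w9WgXcQ"})` should return a list of result items. For longer or larger jobs, starting an asynchronous run and reading the dataset afterwards is usually the safer choice.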

Node.js

// ES module (for example a .mjs file), so top-level await is available.
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({
  token: process.env.APIFY_TOKEN,
});

// call() starts the Actor and waits for the run to finish.
const run = await client
  .actor('akash9078/youtube-transcript-scraper')
  .call({
    videoUrl: 'https://youtu.be/dQw4w9WgXcQ',
    language: 'es',
    translate: true,
  });

console.log(run.defaultDatasetId);

Retrieving Results

Each run stores output in an Apify Dataset.

Get Dataset Items

GET https://api.apify.com/v2/datasets/{DATASET_ID}/items

Node.js Example

// Continues from the Node.js example above (reuses `client` and `run`).
const { items } = await client
  .dataset(run.defaultDatasetId)
  .listItems();
console.log(items);
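
The same dataset request can be made from Python without extra dependencies. This is a minimal sketch: `dataset_items_url` and `get_dataset_items` are hypothetical helpers, and the `format` and `clean` query parameters are optional settings of the dataset items endpoint that you can drop if the defaults suit you.

```python
import json
import urllib.request
from urllib.parse import urlencode


def dataset_items_url(dataset_id: str, fmt: str = "json", clean: bool = True) -> str:
    """Build the dataset items URL for a given dataset ID."""
    query = urlencode({"format": fmt, "clean": "true" if clean else "false"})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


def get_dataset_items(token: str, dataset_id: str) -> list:
    """Download all items from a dataset (performs a network call)."""
    request = urllib.request.Request(
        dataset_items_url(dataset_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```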

Integrations

AI Platforms

  • OpenAI
  • Anthropic Claude
  • Gemini
  • LangChain
  • LlamaIndex

Vector Databases

  • Pinecone
  • Chroma
  • Weaviate
  • Qdrant

Automation Tools

  • n8n
  • Make
  • Zapier
  • Activepieces

Data Platforms

  • Google Sheets
  • Airtable
  • Notion
  • BigQuery

Alternative Use Cases

YouTube Transcript Scraper can be used as:

  • YouTube Transcript API
  • YouTube Caption API
  • YouTube Subtitle API
  • YouTube Transcript Downloader
  • YouTube Subtitle Extractor
  • YouTube Caption Scraper
  • YouTube Subtitle Scraper
  • YouTube Transcript Scraper

Limitations

  • Videos must have available captions
  • Live streams currently in progress are not supported
  • Private videos are not supported
  • Age-restricted videos are not supported
  • Transcript extraction depends on caption availability

FAQ

Does the Actor support auto-generated captions?

Yes. Both manual and auto-generated captions are supported.


Can I extract transcripts in bulk?

Yes. You can process thousands of videos using the Apify API.
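
The input schema documents a single `videoUrl` per run, so one plausible way to batch is to start one run per video with bounded concurrency. In the sketch below, `fetch_one` is a hypothetical callable you supply that runs the Actor for one URL and returns its result item; `extract_many` and the default of five workers are also assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_many(video_urls: list, fetch_one, max_workers: int = 5) -> list:
    """Run fetch_one for each URL concurrently, preserving input order."""

    def safe_fetch(url: str) -> dict:
        try:
            return fetch_one(url)
        except Exception as exc:
            # Mirror the Actor's error shape so callers handle one format.
            return {"success": False, "video_url": url, "error": str(exc)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe_fetch, video_urls))
```

Keeping `max_workers` modest avoids exceeding your Apify account's concurrent-run limits.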


Does it work with YouTube Shorts?

Yes. Shorts are fully supported.


Is a YouTube API key required?

No. The Actor does not require a YouTube Data API key.


Does the Actor return timestamped captions?

No. The current output contains complete transcript text only.


Can I translate transcripts?

Yes. Translation into 14 supported languages is available.


Is there a daily limit?

No. There are no YouTube API quota restrictions.


How fast is transcript extraction?

Most videos are processed within 3–5 seconds.


Support

  • Apify Store Actor Page
  • Apify Discord Community
  • GitHub Issues (if applicable)

Built and maintained by akash9078 on the Apify platform.