YouTube Transcript Extractor

Extract transcripts from YouTube videos in bulk. Supports channels, playlists, multiple languages. AI/RAG optimized.

Developer: Fulcria Labs. Maintained by Community.

Extracts transcripts and captions from YouTube videos in bulk. Outputs clean, structured text optimized for AI/RAG pipelines, content analysis, translation, and research.

What it does

Takes YouTube video URLs (or channel/playlist URLs), extracts the available transcripts and captions, and outputs structured text with timestamps. Supports multiple languages and both manually created and auto-generated captions.

Key Features

  • Bulk extraction - Process hundreds of videos in one run
  • Multi-language support - Extract transcripts in any available language
  • Channel/Playlist support - Automatically discover videos from channels and playlists
  • AI-ready output - Clean text format optimized for RAG, embeddings, and LLMs
  • Timestamp preservation - Keep or remove timestamps based on your needs
  • Chunking options - Split transcripts into configurable chunks for embedding pipelines
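The chunking option can be pictured with a short sketch. This is a minimal illustration of character-based chunking with overlap, not the actor's actual implementation: the function name `chunk_text` is hypothetical, and the exact boundary handling (for example, whether chunks respect word or segment boundaries) is an assumption.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int = 200) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Each chunk repeats the last chunk_overlap characters of the previous one.
    A chunk_size of 0 (or less) means "no chunking": the text is returned whole.
    """
    if chunk_size <= 0:
        return [text] if text else []
    if not 0 <= chunk_overlap < chunk_size:
        # Assumption: an overlap as large as the chunk itself would never advance.
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current chunk already reaches the end of the text
    return chunks
```

With a chunk size of 1000 and an overlap of 200, each chunk starts 800 characters after the previous one, so neighbouring chunks share context at their edges.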

Input

Field                 Type      Default       Description
urls                  string[]  required      YouTube video, channel, or playlist URLs
languages             string[]  ["en"]        Preferred transcript languages (ISO 639-1 codes)
includeAutoGenerated  boolean   true          Include auto-generated captions
includeTimestamps     boolean   true          Include start/duration timestamps
chunkSize             integer   0             Split transcript into chunks of N characters (0 = no chunking)
chunkOverlap          integer   200           Overlap between chunks in characters
outputFormat          string    "structured"  Output format: "structured", "plain_text", "srt", "vtt"
maxVideosPerChannel   integer   50            Max videos to process per channel/playlist
proxyConfiguration    object    {}            Apify proxy settings
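When building the input in code, it can help to apply the defaults and catch obvious mistakes before starting a run. The sketch below mirrors the fields and defaults from the table. The helper name `build_input` is hypothetical, and the checks are client-side assumptions (in particular, rejecting an overlap that is not smaller than the chunk size). The actor may validate its input differently.

```python
import copy
from typing import Any

ALLOWED_FORMATS = {"structured", "plain_text", "srt", "vtt"}

# Defaults copied from the input table above.
DEFAULTS: dict[str, Any] = {
    "languages": ["en"],
    "includeAutoGenerated": True,
    "includeTimestamps": True,
    "chunkSize": 0,
    "chunkOverlap": 200,
    "outputFormat": "structured",
    "maxVideosPerChannel": 50,
    "proxyConfiguration": {},
}


def build_input(urls: list[str], **overrides: Any) -> dict[str, Any]:
    """Return a complete input object, or raise ValueError on an obvious mistake."""
    if not urls:
        raise ValueError("urls is required and must not be empty")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")

    # deepcopy so callers never share the mutable default list/dict
    run_input = {"urls": list(urls), **copy.deepcopy(DEFAULTS), **overrides}

    if run_input["outputFormat"] not in ALLOWED_FORMATS:
        raise ValueError(f"outputFormat must be one of {sorted(ALLOWED_FORMATS)}")
    if run_input["chunkSize"] < 0:
        raise ValueError("chunkSize must be 0 (no chunking) or a positive integer")
    # Assumption: the overlap only matters when chunking is enabled.
    if run_input["chunkSize"] > 0 and run_input["chunkOverlap"] >= run_input["chunkSize"]:
        raise ValueError("chunkOverlap must be smaller than chunkSize")
    return run_input
```

For example, `build_input(["https://www.youtube.com/@ChannelName"], chunkSize=1000)` yields an input with chunking enabled and every other field at its default.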

Output

Each video produces a dataset item:

{
  "videoId": "dQw4w9WgXcQ",
  "videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "title": "Video Title",
  "channelName": "Channel Name",
  "language": "en",
  "isAutoGenerated": false,
  "availableLanguages": ["en", "es", "fr"],
  "transcript": [
    {
      "text": "Hello and welcome",
      "start": 0.0,
      "duration": 2.5
    }
  ],
  "fullText": "Hello and welcome to this video...",
  "wordCount": 1523,
  "duration": 612.5,
  "chunks": [
    {
      "index": 0,
      "text": "Hello and welcome to this video...",
      "startTime": 0.0,
      "endTime": 45.2
    }
  ],
  "extractedAt": "2026-02-23T07:00:00Z"
}
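For an embedding or RAG pipeline, each chunk of a dataset item can be flattened into its own record. The sketch below assumes the item shape shown above. The function name `chunk_records`, the record layout, and the `videoId:index` ID scheme are illustrative choices, not part of the actor. The `source` link appends `&t=<seconds>s`, which assumes `videoUrl` is a `watch?v=` URL as in the sample.

```python
from typing import Any


def chunk_records(item: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn one dataset item into one record per chunk, ready for embedding."""
    records = []
    # "chunks" may be missing or empty, e.g. when chunking is disabled (assumption).
    for chunk in item.get("chunks") or []:
        start_seconds = int(chunk["startTime"])
        records.append({
            "id": f'{item["videoId"]}:{chunk["index"]}',
            "text": chunk["text"],
            "metadata": {
                "title": item.get("title"),
                "language": item.get("language"),
                "startTime": chunk["startTime"],
                "endTime": chunk["endTime"],
                # Deep link that opens the video at the start of the chunk.
                "source": f'{item["videoUrl"]}&t={start_seconds}s',
            },
        })
    return records
```

Keeping the start time and a deep link in the metadata lets a retrieval result point back to the exact moment in the video.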

Use Cases

  • RAG Knowledge Bases - Build searchable knowledge bases from educational YouTube channels
  • Content Research - Analyze competitor content, extract key topics
  • Translation - Extract source text for translation workflows
  • SEO Analysis - Analyze video content for keyword research
  • Podcast Transcription - Get transcripts of podcasts that are uploaded to YouTube with captions
  • Training Data - Collect text data for fine-tuning language models
  • Accessibility - Generate text versions of video content

Pricing

Pay per result:

  • $0.15 per 1,000 videos processed
  • Free tier: 100 videos/month

Example Usage

Extract transcripts from specific videos

{
  "urls": [
    "https://www.youtube.com/watch?v=VIDEO_ID_1",
    "https://www.youtube.com/watch?v=VIDEO_ID_2"
  ],
  "languages": ["en"],
  "outputFormat": "structured"
}

Extract from a channel with chunking for RAG

{
  "urls": ["https://www.youtube.com/@ChannelName"],
  "languages": ["en"],
  "chunkSize": 1000,
  "chunkOverlap": 200,
  "maxVideosPerChannel": 100,
  "outputFormat": "structured"
}

Get plain text transcripts in multiple languages

{
  "urls": ["https://www.youtube.com/playlist?list=PLAYLIST_ID"],
  "languages": ["en", "es", "fr"],
  "includeTimestamps": false,
  "outputFormat": "plain_text"
}
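Run from Python

Any of the inputs above can also be submitted from code. The following is a rough sketch using the official `apify-client` package (`pip install apify-client`), assuming the 1.x synchronous client, where `call()` returns a run dictionary. `ACTOR_ID` is a placeholder, since this page does not state the actor's ID, so copy the real one from the actor's store page. The helper names `fetch_transcripts` and `total_words` are hypothetical.

```python
from typing import Any, Iterable, Iterator

# Placeholder: replace with the actor ID shown on the actor's store page.
ACTOR_ID = "<username>/<actor-name>"


def fetch_transcripts(
    token: str, run_input: dict[str, Any], actor_id: str = ACTOR_ID
) -> Iterator[dict[str, Any]]:
    """Start a run, wait for it to finish, and yield its dataset items."""
    # Imported lazily so the rest of this module works without the package.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the actor run did not return a result")
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()


def total_words(items: Iterable[dict[str, Any]]) -> int:
    """Sum the wordCount field over dataset items, treating missing values as 0."""
    return sum(item.get("wordCount", 0) for item in items)
```

A typical call would pass an API token (for example, read from an environment variable) together with one of the JSON inputs above as a Python dict, then iterate over the yielded items or hand them to `total_words`.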