YouTube Transcript API | Video to Text Scraper for AI

Pricing

from $1.00 / 1,000 transcripts

Extract full transcripts and time-coded captions from any YouTube video. Build custom AI datasets, train LLMs, or repurpose video content.

Developer

Andok

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 3
Monthly active users: 2
Last modified: a day ago

YouTube Transcript Scraper for AI & RAG

Turn any YouTube video into clean, structured text ready for LLM context windows, RAG pipelines, or full-text search. Extracts captions directly from YouTube's internal player data — no browser, no API key, no rate-limit headaches. Process hundreds of videos in parallel with configurable concurrency.

Features

  • Full transcript extraction — pulls the complete caption track from any video with available subtitles
  • Timed segments — returns individual segments with offset and duration for precise referencing
  • Bulk processing — extract transcripts from hundreds of videos in a single run
  • Parallel execution — configurable concurrency to balance speed and reliability
  • AI-ready output — concatenated plain text field perfect for LLM ingestion
  • No API key required — works directly with YouTube's public caption data

Input

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| urls | array | Yes | | YouTube video URLs to extract transcripts from |
| concurrency | integer | No | 10 | Number of videos to process in parallel. Increase for large batches, decrease if you see errors. |

Input Example

{
  "urls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://www.youtube.com/watch?v=jNQXAC9IVRw"
  ],
  "concurrency": 10
}
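
The input above can also be assembled programmatically. The following is a minimal Python sketch, not part of the Actor itself: the helper names `extract_video_id` and `build_input` are hypothetical, and it assumes only `watch?v=` style URLs (the form shown in the example) should be passed through. Which other URL forms the Actor accepts is not stated here, so anything else is filtered out.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hosts assumed to serve standard watch URLs (an assumption, not from the Actor docs).
YOUTUBE_HOSTS = {"www.youtube.com", "youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the `v` parameter of a YouTube watch URL, or None if it is not one."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host not in YOUTUBE_HOSTS or parsed.path != "/watch":
        return None
    ids = parse_qs(parsed.query).get("v")
    return ids[0] if ids else None


def build_input(urls, concurrency: int = 10) -> dict:
    """Build the Actor input object, keeping only URLs that look like watch URLs."""
    if not isinstance(concurrency, int) or concurrency < 1:
        raise ValueError("concurrency must be a positive integer")
    valid = [u for u in urls if extract_video_id(u) is not None]
    if not valid:
        raise ValueError("no valid YouTube watch URLs provided")
    return {"urls": valid, "concurrency": concurrency}


payload = build_input(
    [
        "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
        "https://www.youtube.com/watch?v=jNQXAC9IVRw",
        "https://example.com/not-a-video",
    ],
    concurrency=5,
)
```

Here `payload` holds the two watch URLs and `"concurrency": 5`; the non-YouTube URL is dropped before the run is started.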

Output

Each video produces one dataset row containing the full transcript as a single text block, plus an array of timed segments for granular access.

Key output fields:

| Field | Type | Description |
|---|---|---|
| inputUrl | string | The video URL you provided |
| videoId | string | YouTube video ID |
| fullText | string | Complete transcript as a single text block |
| segments | array | Individual caption segments with timing data |
| segments[].text | string | Text content of the segment |
| segments[].offset | number | Start time in milliseconds |
| segments[].duration | number | Segment duration in milliseconds |

Output Example

{
  "inputUrl": "https://www.youtube.com/watch?v=jNQXAC9IVRw",
  "videoId": "jNQXAC9IVRw",
  "fullText": "Welcome to the presentation. Today we will cover three key topics in machine learning. First, let us look at supervised learning and how it differs from unsupervised approaches...",
  "segments": [
    { "text": "Welcome to the presentation.", "offset": 0, "duration": 2400 },
    { "text": "Today we will cover three key topics", "offset": 2400, "duration": 3100 },
    { "text": "in machine learning.", "offset": 5500, "duration": 1800 }
  ]
}
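
Because `offset` and `duration` are in milliseconds, a small conversion step is usually needed before quoting or linking to a moment in the video. The sketch below is one way to do it in Python; `format_offset` and `timestamped_lines` are hypothetical helper names, and the row is a trimmed copy of the output example.

```python
def format_offset(ms: int) -> str:
    """Convert a millisecond offset to MM:SS, or H:MM:SS past one hour."""
    total_seconds = int(ms) // 1000
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes:02d}:{seconds:02d}"


def timestamped_lines(row: dict) -> list:
    """Render each caption segment of a dataset row as '[timestamp] text'."""
    return [f"[{format_offset(seg['offset'])}] {seg['text']}" for seg in row["segments"]]


row = {
    "videoId": "jNQXAC9IVRw",
    "segments": [
        {"text": "Welcome to the presentation.", "offset": 0, "duration": 2400},
        {"text": "Today we will cover three key topics", "offset": 2400, "duration": 3100},
        {"text": "in machine learning.", "offset": 5500, "duration": 1800},
    ],
}

for line in timestamped_lines(row):
    print(line)
# [00:00] Welcome to the presentation.
# [00:02] Today we will cover three key topics
# [00:05] in machine learning.
```

A segment's end time is simply `offset + duration`, so the last segment above ends at 7300 ms.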

Pricing

Pay per event on the Apify platform.

| Event | Description |
|---|---|
| Transcript | One video transcript extracted |

Use Cases

  • RAG pipelines — feed video transcripts into vector databases for retrieval-augmented generation
  • Content repurposing — convert video content into blog posts, summaries, or social media threads
  • Research analysis — make lecture and conference talk content searchable and quotable
  • SEO audits — analyze what competitors say in their videos without watching them
  • Training data — build domain-specific datasets from educational YouTube content
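
For the RAG use case, transcripts are typically split into chunks before embedding. The following Python sketch groups consecutive segments into chunks under a character budget while keeping the start and end time of each chunk. The function name `chunk_segments`, the `max_chars` parameter, and the output keys `start_ms` and `end_ms` are illustrative assumptions, not part of the Actor's output.

```python
def chunk_segments(segments, max_chars: int = 500) -> list:
    """Group consecutive caption segments into chunks of at most max_chars characters.

    A single segment longer than max_chars becomes its own (oversized) chunk.
    """
    chunks = []
    texts, length, start, end = [], 0, None, None

    for seg in segments:
        text = seg["text"].strip()
        if not text:
            continue
        # Account for the joining space when the chunk already has text.
        added = len(text) + (1 if texts else 0)
        if texts and length + added > max_chars:
            chunks.append({"text": " ".join(texts), "start_ms": start, "end_ms": end})
            texts, length, start = [], 0, None
            added = len(text)
        if start is None:
            start = seg["offset"]
        texts.append(text)
        length += added
        end = seg["offset"] + seg["duration"]

    if texts:
        chunks.append({"text": " ".join(texts), "start_ms": start, "end_ms": end})
    return chunks


segments = [
    {"text": "Welcome to the presentation.", "offset": 0, "duration": 2400},
    {"text": "Today we will cover three key topics", "offset": 2400, "duration": 3100},
    {"text": "in machine learning.", "offset": 5500, "duration": 1800},
]
chunks = chunk_segments(segments, max_chars=60)
```

Each resulting chunk can be embedded and stored together with `videoId`, `start_ms`, and `end_ms`, so a retrieved passage can be traced back to its position in the video. The budget of 60 characters is only for demonstration; a real pipeline would size chunks to its embedding model.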

| Actor | What it adds |
|---|---|
| YouTube Transcript MCP Server | AI-powered transcription via MCP; works even on videos without captions |
| YouTube Video Metadata Extractor | Get views, likes, tags, and publish dates alongside your transcripts |
| Markdown Extractor | Convert web pages to clean markdown to complement video transcripts with written sources |