YouTube Transcript Extractor — AI-Ready Subtitles

Under maintenance

Pricing

from $8.00 / 1,000 results

Extracts subtitles and transcripts from YouTube videos. Given a video URL or ID, it returns clean text output with metadata. Intended for AI/LLM training data collection and content analysis.

Rating: 0.0 (0)

Developer: 陈俊杰 (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: a month ago

Extract clean subtitle/transcript text from any YouTube video with subtitles. Designed for AI training data pipelines, content analysis, and LLM training.

Features

  • 🎯 Input a YouTube URL or bare video ID
  • 🌐 Supports manual and auto-generated captions
  • 🌍 Multi-language — specify any ISO 639-1 language code (default: en)
  • ⏱ Optional [MM:SS] timestamps in output
  • 🧹 Clean, joined transcript format
  • 📊 Rich metadata: video_id, duration, word count, language
  • 🛡️ Robust error handling with descriptive error messages

Input

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| video_url | string | | | YouTube URL (any format) or bare video ID |
| language | string | | en | ISO 639-1 language code |
| include_timestamps | bool | | false | Add [MM:SS] before each subtitle line |
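
To illustrate the input fields and the effect of include_timestamps, here is a minimal Python sketch. The helper names (format_timestamp, render_transcript) and the segment shape (dicts with "start" in seconds and "text") are assumptions made for illustration, not the Actor's actual code. How the Actor renders videos longer than an hour is not documented, so this sketch simply lets the minutes field exceed 59.

```python
# Example input using the three documented fields.
example_input = {
    "video_url": "https://youtu.be/VIDEO_ID",  # placeholder, as in the docs
    "language": "en",
    "include_timestamps": True,
}


def format_timestamp(seconds: float) -> str:
    """Render a start time in seconds as [MM:SS] (assumed rounding: floor)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"[{minutes:02d}:{secs:02d}]"


def render_transcript(segments, include_timestamps=False) -> str:
    """Join subtitle segments into clean text.

    Hypothetical segment shape: {"start": float, "text": str}.
    """
    lines = []
    for seg in segments:
        # Collapse internal newlines and repeated whitespace.
        text = " ".join(seg["text"].split())
        if not text:
            continue
        if include_timestamps:
            lines.append(f"{format_timestamp(seg['start'])} {text}")
        else:
            lines.append(text)
    # Assumption: one line per segment with timestamps, one paragraph without.
    separator = "\n" if include_timestamps else " "
    return separator.join(lines)
```

With include_timestamps set to false, the segments collapse into one continuous paragraph; with it set to true, each subtitle line keeps its own [MM:SS] prefix.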

Output (one item per run)

| Field | Type | Description |
|---|---|---|
| video_id | string | 11-char YouTube video ID |
| title | string | Video title (if retrievable) |
| duration | int | Approximate duration in seconds |
| language | string | Language code of the transcript |
| transcript_type | string | "manual" or "auto-generated" |
| transcript | string | Full clean text of the subtitles |
| word_count | int | Word count of the transcript |
| url | string | Full YouTube URL |
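
The output fields above could be assembled roughly as follows. This is a sketch under stated assumptions: build_output_item is a hypothetical helper, word_count is assumed to be a simple whitespace split (which would undercount languages written without spaces), and url is assumed to use the standard watch?v= form.

```python
def build_output_item(video_id, transcript, language="en",
                      transcript_type="auto-generated",
                      title=None, duration=None):
    """Assemble one output item with the documented field names.

    Hypothetical helper; the Actor's real implementation may differ.
    """
    return {
        "video_id": video_id,
        "title": title,            # may be None if not retrievable
        "duration": duration,      # approximate, in seconds
        "language": language,
        "transcript_type": transcript_type,  # "manual" or "auto-generated"
        "transcript": transcript,
        # Assumption: words are whitespace-separated tokens.
        "word_count": len(transcript.split()),
        "url": f"https://www.youtube.com/watch?v={video_id}",
    }
```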

Supported URL formats

  • https://www.youtube.com/watch?v=VIDEO_ID
  • https://youtu.be/VIDEO_ID
  • https://www.youtube.com/embed/VIDEO_ID
  • https://www.youtube.com/shorts/VIDEO_ID
  • Bare VIDEO_ID (11 characters)
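
One way to normalize the formats listed above to a bare video ID is sketched below in Python. The function name extract_video_id and the ID character set (letters, digits, underscore, hyphen) are assumptions for illustration; the Actor's own parsing may accept more variants.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: IDs are exactly 11 characters from this alphabet.
_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(value: str) -> Optional[str]:
    """Return the 11-character video ID, or None if the input is not recognized."""
    value = value.strip()
    if _ID_RE.fullmatch(value):
        return value  # already a bare ID

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    candidate = None
    if host == "youtu.be":
        # https://youtu.be/VIDEO_ID
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com":
        if parsed.path == "/watch":
            # https://www.youtube.com/watch?v=VIDEO_ID
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            # https://www.youtube.com/embed/VIDEO_ID or /shorts/VIDEO_ID
            parts = parsed.path.strip("/").split("/")
            if len(parts) >= 2 and parts[0] in ("embed", "shorts"):
                candidate = parts[1]

    if candidate and _ID_RE.fullmatch(candidate):
        return candidate
    return None
```

Returning None for unrecognized input, rather than raising, leaves it to the caller to produce a descriptive error message.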

Use Cases

  • AI/LLM Training Data — collect natural language text from millions of YouTube videos
  • Content Analysis — analyze video content at scale for SEO, research, or moderation
  • Accessibility — extract captions for further processing or translation
  • Dataset Building — build large text corpora from video subtitles

Built with youtube_transcript_api ❤️