🧾 YouTube Extractor (Transcripts + Metadata)

🎥 Extract complete transcripts with precise timestamps ⏱️ and comprehensive video metadata from any YouTube video. ⚡ Fast, reliable, and ready to use.

Pricing: Pay per event
Rating: 4.9 (16)
Developer: dz_omar
Maintained by Community

Actor stats

  • Bookmarked: 55
  • Total users: 479
  • Monthly active users: 67
  • Issues response: 10 hours
  • Last modified: 10 days ago

🎬 YouTube Transcript & Metadata Extractor

Extract complete transcripts with precise timestamps and comprehensive video metadata from any YouTube video - fast, reliable, and built for scale.

This powerful Apify actor provides professional-grade extraction of YouTube transcripts and video information. Perfect for developers, researchers, content creators, and businesses who need accurate, structured video data for analysis, automation, or content repurposing.


✨ Key Features

🎯 Complete Data Extraction

  • Full Transcripts with second-by-second precision timestamps
  • Rich Video Metadata including title, views, likes, and publication date
  • Detailed Channel Information with name, ID, subscribers, and verification status
  • Video Analytics including word count and estimated duration
  • Thumbnail URLs for visual reference

⚡ Performance & Reliability

  • Lightning Fast - Optimized crawler for maximum speed
  • Smart Caching - Avoid re-processing with built-in cache system
  • Dual-Mode Processing - Free mode for reliability, paid mode for dedicated performance
  • Batch Processing - Handle multiple videos in a single run
  • Automatic Fallback - Seamless switching between processing modes

🛠️ Developer-Friendly

  • Clean JSON Output - Ready for any data pipeline or API integration
  • Flexible Configuration - Control timestamps, cleaning level, and caching
  • Multiple Views - Pre-configured data views for different use cases
  • Well-Documented - Clear examples and comprehensive documentation

🔄 How It Works

  1. Input YouTube URLs - Provide one or more video links
  2. Configure Options - Choose transcript cleaning level, timestamp inclusion, and caching preferences
  3. Run the Actor - Fast, automated extraction begins
  4. Get Structured Results - Download complete data in JSON, CSV, or Excel format

Comment Extraction (Optional)

NEW: This actor can now extract YouTube comments alongside transcripts and metadata using the YouTube Comments Scraper in Standby Mode.

How It Works

When you enable comment extraction, this actor connects to a separate YouTube Comments Scraper instance running in Standby Mode. This architecture provides several advantages:

Benefits of Standby Mode Integration:

  • Faster Processing - No actor startup delays for comment extraction
  • Real-time Streaming - Comments are extracted and pushed immediately as they become available
  • Automatic Resumption - If interrupted, the actor automatically resumes from where it left off
  • Efficient Resource Usage - Dedicated comment scraping infrastructure optimized for performance
  • Parallel Processing - Metadata and comments are extracted concurrently

Configuration

To enable comment extraction, set extractcomments: true in your input:

{
  "youtubeUrl": [
    { "url": "https://www.youtube.com/watch?v=kOO31qFmi9A" }
  ],
  "cleaningLevel": "mild",
  "includeTimestamps": true,
  "extractcomments": true,
  "sortBy": "top",
  "maxComments": 50,
  "maxRepliesPerComment": 5
}

Comment Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| extractcomments | boolean | false | Enable comment extraction |
| sortBy | string | "top" | Sort method: "top" (most relevant) or "newest" (most recent) |
| maxComments | integer | 10 | Maximum number of top-level comments per video (10-100,000) |
| maxRepliesPerComment | integer | 0 | Maximum replies per comment. Set to 0 to disable replies (faster) |
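The parameter table above can be checked client-side before a run is started. The following is a minimal sketch, assuming the defaults and ranges listed in the table; the function name `validate_comment_options` is hypothetical and not part of the actor.

```python
ALLOWED_SORT = {"top", "newest"}


def validate_comment_options(options: dict) -> dict:
    """Apply the documented defaults, then check the documented ranges.

    Hypothetical helper: the actor may validate its input differently.
    """
    merged = {
        "extractcomments": False,
        "sortBy": "top",
        "maxComments": 10,
        "maxRepliesPerComment": 0,
        **options,  # caller-supplied values override the defaults
    }
    if merged["sortBy"] not in ALLOWED_SORT:
        raise ValueError(f"sortBy must be one of {sorted(ALLOWED_SORT)}")
    if not 10 <= merged["maxComments"] <= 100_000:
        raise ValueError("maxComments must be between 10 and 100,000")
    if merged["maxRepliesPerComment"] < 0:
        raise ValueError("maxRepliesPerComment must be 0 or greater")
    return merged
```

Calling `validate_comment_options({"extractcomments": True, "maxComments": 50})` returns the merged options with `sortBy` set to `"top"` and `maxRepliesPerComment` set to `0`.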

Output with Comments

When comments are enabled, your output includes:

{
  "videoId": "kOO31qFmi9A",
  "VideoURL": "https://youtu.be/kOO31qFmi9A",
  "Video_title": "Excel in Motion | The Beauty of Data Automation",
  "hasTranscript": true,
  "transcriptText": "Full transcript...",
  "timestamps": [...],
  "commentsExtracted": true,
  "commentCount": 50,
  "comments": [
    {
      "commentId": "UgxQe-6VK3h-LZaul6x4AaABAg",
      "authorName": "@username",
      "text": "Great video!",
      "likeCount": "150",
      "replyCount": 5,
      "publishedTime": "2 days ago",
      "replies": [...]
    }
  ]
}

Pricing

Comment extraction uses the YouTube Comments Scraper actor, which has a pay-per-event pricing model:

Comment Extraction Costs:

  • Actor Start: $0.01 (one-time per run)
  • Parent Comments: $0.003 per comment (no discount tier)
  • Reply Comments: $0.0015 per reply (no discount tier)

Pricing Tiers (with Apify subscription):

  • Bronze: 50% discount on parent comments, 47% discount on replies
  • Silver: 67% discount on parent comments, 60% discount on replies
  • Gold: 73% discount on parent comments, 73% discount on replies

Example Cost Calculation:

  • 50 parent comments + 100 replies (no discount):
    • Start: $0.01
    • Comments: 50 × $0.003 = $0.15
    • Replies: 100 × $0.0015 = $0.15
    • Total: $0.31
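The arithmetic above can be reproduced with a small estimator. This sketch uses the prices and tier discounts listed in this section; it assumes the start fee is not discounted, which the page does not state, and the function name is hypothetical.

```python
from decimal import Decimal

START_FEE = Decimal("0.01")      # charged once per run
PARENT_PRICE = Decimal("0.003")  # per top-level comment, no discount
REPLY_PRICE = Decimal("0.0015")  # per reply, no discount

# (parent discount, reply discount) per subscription tier, as listed above.
DISCOUNTS = {
    "none": (Decimal("0"), Decimal("0")),
    "bronze": (Decimal("0.50"), Decimal("0.47")),
    "silver": (Decimal("0.67"), Decimal("0.60")),
    "gold": (Decimal("0.73"), Decimal("0.73")),
}


def estimate_comment_cost(parents: int, replies: int, tier: str = "none") -> Decimal:
    """Estimate the comment extraction cost in USD for one run.

    Assumption: the start fee is the same on every tier.
    """
    parent_discount, reply_discount = DISCOUNTS[tier]
    parent_cost = parents * PARENT_PRICE * (1 - parent_discount)
    reply_cost = replies * REPLY_PRICE * (1 - reply_discount)
    return START_FEE + parent_cost + reply_cost
```

`estimate_comment_cost(50, 100)` equals `Decimal("0.31")`, matching the example. `Decimal` is used instead of `float` so that small per-event prices add up exactly.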

For detailed pricing information, visit: YouTube Comments Scraper Pricing

Performance Considerations

Enabling comments will increase runtime:

  • Without replies: Adds ~5-10 seconds per video
  • With replies (10 per comment): Adds ~20-40 seconds per video depending on reply count

Best Practices:

  1. Set maxRepliesPerComment: 0 if you don't need replies (10x faster)
  2. Start with maxComments: 50 to test, then scale up
  3. Use sortBy: "top" for most relevant comments
  4. Use sortBy: "newest" for real-time monitoring

Technical Details

This actor uses the YouTube Comments Scraper's Standby Mode API endpoint:

https://dz-omar--youtube-comments-scraper.apify.actor/

The integration:

  1. Automatically starts the comment scraper in Standby Mode (if needed)
  2. Streams comments in real-time using NDJSON format
  3. Handles interruptions and automatically resumes
  4. Merges comment data with video metadata seamlessly
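NDJSON is one JSON object per line, which is what makes streaming and resumption straightforward. The sketch below shows how a consumer might read such a stream and skip comments it has already stored after an interruption. It is an illustration of the format, not the actor's internal code; the helper names are hypothetical, and only the `commentId` field is taken from the sample output.

```python
import json
from typing import Iterable, Iterator, List, Set


def iter_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per non-empty line of an NDJSON stream."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


def collect_new_comments(lines: Iterable[str], seen_ids: Set[str]) -> List[dict]:
    """Collect comments whose commentId is not in seen_ids, updating seen_ids."""
    fresh = []
    for comment in iter_ndjson(lines):
        comment_id = comment.get("commentId")
        if comment_id in seen_ids:
            continue  # already stored before the interruption
        seen_ids.add(comment_id)
        fresh.append(comment)
    return fresh
```

Because each line is a complete record, a partially received stream still yields every comment that arrived before the interruption.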

For more information about the comment extraction system, see the YouTube Comments Scraper actor page.

📥 Input Configuration

Simple Input Example

{
  "youtubeUrl": [
    { "url": "https://www.youtube.com/watch?v=nC8ilIMH8sk" },
    { "url": "https://youtu.be/tpHZYImuhZg" }
  ],
  "cleaningLevel": "mild",
  "includeTimestamps": true
}
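An input like the one above can also be submitted programmatically. The sketch below builds, but does not send, a request for the Apify API's synchronous run endpoint (`run-sync-get-dataset-items`) using only the standard library. The actor ID and token are placeholders you must supply: this page does not state the actor's ID, so none is assumed here.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request that runs an actor and returns its dataset items.

    actor_id and token are placeholders supplied by the caller.
    """
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


actor_input = {
    "youtubeUrl": [
        {"url": "https://www.youtube.com/watch?v=nC8ilIMH8sk"},
        {"url": "https://youtu.be/tpHZYImuhZg"},
    ],
    "cleaningLevel": "mild",
    "includeTimestamps": True,
}
request = build_run_request("<ACTOR_ID>", "<APIFY_TOKEN>", actor_input)
```

Passing `request` to `urllib.request.urlopen` would start the run and return the items as JSON once real credentials are filled in.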

With Comment Extraction

{
  "youtubeUrl": [
    { "url": "https://www.youtube.com/watch?v=kOO31qFmi9A" }
  ],
  "cleaningLevel": "mild",
  "includeTimestamps": true,
  "extractcomments": true,
  "sortBy": "top",
  "maxComments": 50,
  "maxRepliesPerComment": 5
}

Input Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| youtubeUrl | array | Required | List of YouTube video URLs (supports all formats: youtube.com, youtu.be, shorts, live) |
| cleaningLevel | string | mild | Control transcript cleaning: none (raw), mild (remove filler words like "uh", "um"), or aggressive (remove conversational fluff) |
| includeTimestamps | boolean | true | Include detailed timestamp data for each transcript segment |
| extractcomments | boolean | false | Enable comment extraction using YouTube Comments Scraper |
| sortBy | string | top | Comment sort method: top (most relevant) or newest (most recent) |
| maxComments | integer | 10 | Maximum number of top-level comments to extract per video (10-100,000) |
| maxRepliesPerComment | integer | 0 | Max replies per comment. Set to 0 to disable replies (faster extraction) |
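To make the `mild` cleaning level concrete, here is an illustrative sketch of filler-word removal. It is not the actor's implementation: the filler list contains only the two examples named in the table ("uh", "um"), and the function name `clean_mild` is hypothetical.

```python
import re

# Matches "uh" or "um" as whole words, plus an optional trailing comma or
# period and any following whitespace. Only the two fillers named in the
# documentation are covered; the real list may be longer.
_FILLER_RE = re.compile(r"\b(?:uh|um)\b[,.]?\s*", re.IGNORECASE)


def clean_mild(text: str) -> str:
    """Remove standalone filler words and collapse leftover whitespace."""
    without_fillers = _FILLER_RE.sub("", text)
    return re.sub(r"\s+", " ", without_fillers).strip()
```

The word boundaries keep words that merely contain a filler, such as "umbrella", untouched.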

Supported URL Formats

  • https://www.youtube.com/watch?v=VIDEO_ID
  • https://youtu.be/VIDEO_ID
  • https://www.youtube.com/shorts/VIDEO_ID
  • https://www.youtube.com/live/VIDEO_ID
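All four formats above carry the same video ID, so inputs can be normalized and de-duplicated before a run. The following is a minimal sketch covering only the formats listed here; both function names are hypothetical, and the actor's own parsing may accept more variants.

```python
from typing import List, Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID for the URL formats listed above, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    parts = [p for p in parsed.path.split("/") if p]
    if host == "youtu.be":
        return parts[0] if parts else None
    if host == "youtube.com":
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        if len(parts) >= 2 and parts[0] in ("shorts", "live"):
            return parts[1]
    return None


def dedupe_by_video_id(urls: List[str]) -> List[str]:
    """Keep the first URL seen for each video ID; drop unrecognized URLs."""
    seen = set()
    unique = []
    for url in urls:
        video_id = extract_video_id(url)
        if video_id is None or video_id in seen:
            continue
        seen.add(video_id)
        unique.append(url)
    return unique
```

For example, `https://www.youtube.com/watch?v=abc` and `https://youtu.be/abc` resolve to the same ID, so only the first is kept.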

📤 Complete Output Structure

Each video produces a comprehensive JSON object with all available data:

{
  "videoId": "1TThGG6guf0",
  "VideoURL": "https://youtu.be/1TThGG6guf0",
  "embedUrl": "https://www.youtube.com/embed/1TThGG6guf0",
  "Video_title": "#51 WordPress Custom widget development",
  "published_Date": "Aug 12, 2020",
  "Views": "5,067 views",
  "likes": "122",
  "channel": {
    "name": "Imran Sayed - Codeytek Academy",
    "id": "UC0SDxbLAqoKLACyEPz2wXAg",
    "url": "https://www.youtube.com/channel/UC0SDxbLAqoKLACyEPz2wXAg",
    "subscribers": "33.1K subscribers",
    "verified": false
  },
  "thumbnail": "https://i.ytimg.com/vi/1TThGG6guf0/default.jpg",
  "Description": "create custom widget wordpress without plugin...",
  "hasTranscript": true,
  "transcriptText": "hello and welcome everyone to another episode...",
  "timestamps": [
    {
      "time": "0:08",
      "text": "hello and welcome everyone to another"
    },
    {
      "time": "0:10",
      "text": "episode of advanced wordpress theme"
    }
  ]
}

Output Fields Explained

Video Information

  • videoId - Unique YouTube video identifier
  • VideoURL - Direct link to the video
  • embedUrl - Embeddable video URL
  • Video_title - Full video title
  • published_Date - Publication or stream date
  • Views - View count as displayed on YouTube
  • likes - Like count (when available)
  • thumbnail - Video thumbnail image URL
  • Description - Full video description

Channel Information

  • channel.name - Channel name
  • channel.id - Unique channel identifier
  • channel.url - Direct channel link
  • channel.subscribers - Subscriber count
  • channel.verified - Verification status badge
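Count fields such as `Views`, `likes`, and `channel.subscribers` are returned as display strings ("5,067 views", "33.1K subscribers"), so numeric analysis needs a conversion step. This sketch handles the forms shown in the sample output; support for the M and B suffixes is an assumption based on how YouTube usually abbreviates large counts, and `parse_count` is a hypothetical helper.

```python
import re
from decimal import Decimal
from typing import Optional

# A number with optional thousands separators and decimals, optionally
# followed directly by a K/M/B suffix (suffixes other than K are assumed).
_COUNT_RE = re.compile(r"(\d[\d,]*(?:\.\d+)?)([KMB])?\b")
_MULTIPLIERS = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_count(text: str) -> Optional[int]:
    """Convert a display count like "33.1K subscribers" to an integer."""
    match = _COUNT_RE.search(text)
    if match is None:
        return None
    number, suffix = match.groups()
    value = Decimal(number.replace(",", ""))
    if suffix:
        value *= _MULTIPLIERS[suffix]
    return int(value)
```

Abbreviated values are approximate by nature: "33.1K subscribers" converts to 33100, not the exact subscriber count.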

Transcript Data

  • hasTranscript - Boolean indicating transcript availability
  • transcriptText - Complete cleaned transcript as continuous text
  • timestamps - Array of time-stamped transcript segments (when included)
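The `time` value in each segment is a display string such as "0:08". For seeking or alignment it is often easier to work in seconds. A small sketch, assuming the strings follow the `m:ss` form shown in the sample or the longer `h:mm:ss` form; the helper names are hypothetical.

```python
from typing import List


def timestamp_to_seconds(value: str) -> int:
    """Convert "m:ss" or "h:mm:ss" to a number of seconds."""
    seconds = 0
    for part in value.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def add_seconds(segments: List[dict]) -> List[dict]:
    """Return copies of the segments with an extra numeric "seconds" key."""
    return [
        {**segment, "seconds": timestamp_to_seconds(segment["time"])}
        for segment in segments
    ]
```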

📊 Pre-Configured Data Views

1. Full Video Metadata & Transcripts

Complete dataset with all fields - perfect for comprehensive analysis.

Fields: All video metadata, channel info, transcript, timestamps, analytics

2. Pure Transcript Results

Focused view for transcript-only needs.

Fields: Thumbnail, URL, title, transcript text, timestamps, word count, duration

3. Channels Overview

Channel-focused view for creator analysis.

Fields: Video info, channel name, ID, URL, subscribers, verification status
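A view is essentially a projection of the full output item onto a subset of fields. If you work with the raw JSON, a comparable projection can be done locally. The sketch below approximates the Channels Overview view using field names from the output sample; the exact columns of the built-in view may differ, and `channels_overview` is a hypothetical helper.

```python
def channels_overview(item: dict) -> dict:
    """Project one output item onto channel-focused fields."""
    channel = item.get("channel") or {}
    return {
        "videoId": item.get("videoId"),
        "Video_title": item.get("Video_title"),
        "VideoURL": item.get("VideoURL"),
        "channelName": channel.get("name"),
        "channelId": channel.get("id"),
        "channelUrl": channel.get("url"),
        "subscribers": channel.get("subscribers"),
        "verified": channel.get("verified"),
    }
```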


⚙️ Processing Modes

Free Mode

  • Standard processing speed
  • Suitable for small to medium batches
  • No additional costs
  • Reliable and stable performance

Paid Mode

  • Dedicated infrastructure for accelerated processing
  • Optimized for high-volume batch operations
  • Faster extraction of large video collections
  • Automatic fallback to free mode if needed

Both modes extract identical, high-quality data. Paid mode is ideal when you need faster throughput for large batches.


💡 Professional Use Cases

Content Creation & Marketing

  • Content Repurposing - Transform videos into blogs, social media posts, or newsletters
  • SEO Optimization - Extract text content for better search engine indexing
  • Subtitle Generation - Create or verify subtitle files for accessibility
  • Quote Extraction - Find and highlight key moments for marketing materials

Research & Education

  • Academic Research - Analyze lectures, interviews, and educational content at scale
  • Sentiment Analysis - Process video content for NLP and sentiment research
  • Language Learning - Create study materials from foreign language videos
  • Content Analysis - Study speaking patterns, topics, and engagement metrics

Automation & Integration

  • AI/ML Training Data - Feed structured transcripts into machine learning models
  • Chatbot Training - Use video content to train conversational AI
  • API Integration - Seamlessly integrate with your existing data pipelines
  • Automated Workflows - Trigger actions based on new video content

Business Intelligence

  • Competitor Analysis - Monitor and analyze competitor video content
  • Brand Monitoring - Track mentions and sentiment in video content
  • Market Research - Extract insights from industry thought leaders
  • Customer Feedback - Analyze product reviews and testimonials from videos

⚙️ Technical Advantages

Built for Performance

  • Optimized Crawler - Uses CheerioCrawler for maximum speed and efficiency
  • Parallel Processing - Handle multiple videos simultaneously
  • Smart Deduplication - Automatically skip duplicate URLs
  • Efficient Memory Usage - Configurable memory limits (128MB - 512MB)

Reliable & Robust

  • Error Handling - Graceful handling of missing transcripts or blocked videos
  • Retry Logic - Automatic retry on temporary failures
  • Cache System - Persistent storage of processed videos
  • Intelligent Fallback - Seamless mode switching when needed
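Callers that wrap the actor in their own pipeline may want similar retry behavior on their side. The following is a generic sketch of retrying with exponential backoff; it is not the actor's internal retry logic, and the attempt count and delays are arbitrary example values.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds or attempts are exhausted.

    The wait doubles after each failure: base_delay, 2 * base_delay, ...
    The last failure is re-raised to the caller.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the function can be exercised in tests without real delays.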

Universal Compatibility

  • Multi-language Support - Works with all languages supported by YouTube
  • All Video Types - Standard videos, shorts, live streams, premieres
  • Export Formats - JSON, CSV, Excel, or direct API access
  • Apify Ecosystem - Seamlessly integrates with other Apify actors

Performance Metrics

  • Speed: 5-10 seconds per typical video in free mode; faster with dedicated infrastructure
  • Accuracy: Extracts official YouTube transcripts with 100% fidelity
  • Reliability: Built-in retry logic, error handling, and automatic fallback
  • Scalability: Handle from single videos to large batch operations (100+)

This actor extracts publicly available transcripts and metadata from YouTube. All extracted data is information that YouTube makes accessible through normal viewing.

Important: Please ensure your use complies with:

  • YouTube's Terms of Service
  • Applicable copyright laws
  • Data protection regulations (GDPR, etc.)
  • Your specific jurisdiction's laws

Why Choose This Extractor?

✅ Comprehensive Data - Get both transcripts AND complete metadata in one run
✅ Production-Ready - Battle-tested reliability for professional applications
✅ Developer-Friendly - Clean JSON output, multiple export formats, easy integration
✅ Cost-Effective - Efficient processing minimizes compute costs
✅ Flexible Processing - Choose free or faster dedicated mode based on your needs
✅ Well-Maintained - Regular updates and responsive support
✅ Apify Platform - Leverage the full power of Apify's infrastructure


Getting Started

  1. Sign up for a free Apify account
  2. Configure input with your YouTube URLs
  3. Select options (transcript cleaning level, timestamps, etc.)
  4. Run the actor - It's that simple!
  5. Download results in your preferred format

No complex setup, no hidden fees, no surprises. Just reliable data extraction.


💬 Support & Contact

Need help or have questions? We're here for you:



🎬 Video & Media Tools

🎬 Youtube Playlist Extractor - Extract complete video transcripts with timestamps and comprehensive metadata. Perfect for content analysis, SEO, and subtitle generation.

YouTube Full Channel, Playlists, Shorts, Live - Extract complete 🎬 playlist information with all video details from any YouTube playlist. ⚡ Fast, reliable, and built for scale. Get video lists, durations, thumbnails, and channel info.

Zoom Scraper | 🎥 Downloader & 📄 Transcript - Extract Zoom meeting recordings, transcripts, and metadata. Ideal for meeting analysis and documentation.

Loom Scraper | 🎥 Downloader & 📄 Transcript - Download Loom videos and extract transcripts. Perfect for training content and video documentation.

🏠 Real Estate Data

Idealista Scraper API - Advanced Idealista property data extraction with API access. Get listings, prices, and detailed property information.

Idealista Scraper - Extract Spanish real estate listings from Idealista. Perfect for market analysis and property research.

🛠️ Developer & Security Tools

Screenshot - Fast, reliable webpage screenshots with customizable options. Essential for monitoring and documentation.

Ultimate Screenshot - Advanced screenshot tool with full-page capture, custom viewports, and quality controls.

Network Security Scanner - Scan websites for security vulnerabilities and get comprehensive security reports.

📱 Social Media Tools

Facebook Ads Scraper Pro - Extract Facebook ads data for competitor analysis and market research. Track ad campaigns and strategies.