Youtube Video Transcript Super Scraper

Pricing

Pay per event

Unlock the power of in-depth YouTube video analytics with our Video Transcript Super Scraper. Extract comprehensive metadata, engagement metrics, and full video transcripts in any language you want.

Developer

Muhammad Noman Riaz

Maintained by Community

Overview

This actor allows you to scrape detailed information about YouTube videos, including title, description, view count, keywords, thumbnail URLs, and more. You can also download the full transcript of the video in any available language (original or auto-translated). It provides valuable insights for content creators, marketers, and researchers looking to analyze video performance, metadata, and spoken content.

Features

  • Detailed Video Information: Extract comprehensive metadata about each video.
  • Thumbnail URLs: Get links to various sizes of video thumbnails.
  • Channel Details: Retrieve information about the video's channel.
  • Engagement Metrics: Capture view counts, like counts, and other relevant statistics.
  • Keywords and Categories: Extract video tags and categories for better understanding of content.
  • Multilingual Transcripts: Download the full transcript in the original or any supported/translated language.
  • Customizable: Flexible configuration options to suit various scraping needs.
  • Proxy Support: Built-in proxy configuration to enhance scraping reliability and avoid blocks.

Scrape 35+ critical data points from any YouTube video, including:

🎯 Core Metrics

  • View counts (live/archived)
  • Engagement: Likes/Dislikes ratio, comment density
  • Monetization: Ad placements, sponsor segments
  • Performance: CTR %, retention graphs (when available)

📊 Engagement Analytics

  • Real-time subscriber deltas
  • Share tracking (platform-specific breakdown)
  • Heatmap-style interaction patterns
  • Audience demographics (age/gender distributions)

📜 Content Details

  • Multilingual Transcripts
    • Raw timestamped data
    • Clean text-only versions
    • 100+ language support (auto-translate enabled)
  • Enhanced metadata:
    • Chapters/markers analysis
    • Hashtag performance metrics
    • Cards/end screens inventory

🛠 Technical Features

  • Automatic resolution of:
    • All YouTube URL formats (including embeds/shorts)
    • Video ID extraction from complex parameters
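As an illustration of what resolving these URL formats involves, here is a minimal Python sketch. It is not the actor's own implementation: the function name `extract_video_id`, the list of accepted hosts, and the 11-character ID pattern are assumptions made for this example.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters drawn from this alphabet.
_ID_RE = re.compile(r"^[A-Za-z0-9_-]{11}$")

# Assumption: hosts treated as YouTube for this sketch (after dropping "www.").
_YOUTUBE_HOSTS = {
    "youtube.com",
    "m.youtube.com",
    "music.youtube.com",
    "youtube-nocookie.com",
}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from common YouTube URL shapes, or None."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    segments = [s for s in parsed.path.split("/") if s]

    candidate = None
    if host == "youtu.be":
        # Short links: https://youtu.be/<id>
        candidate = segments[0] if segments else None
    elif host in _YOUTUBE_HOSTS:
        if segments and segments[0] == "watch":
            # Watch pages: the ID is the "v" query parameter; extra
            # parameters such as "t" or "ab_channel" are ignored.
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif len(segments) >= 2 and segments[0] in {"embed", "shorts", "live", "v"}:
            # Path-style links: /embed/<id>, /shorts/<id>, /live/<id>, /v/<id>
            candidate = segments[1]

    return candidate if candidate and _ID_RE.match(candidate) else None
```

Calling `extract_video_id` on a watch URL with extra tracking parameters returns only the ID, and an unrecognized host yields `None` rather than a guess.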

How to Use

  1. Set Up: Ensure you have an Apify account and access to the Apify platform.
  2. Configure Input: Set the YouTube video URL(s) you want to scrape details for (see Input Configuration section).
  3. Adjust Options (optional): Tune additional parameters such as concurrency, proxy settings, and transcript language.
  4. Run the Scraper: Execute the scraper on the Apify platform.
  5. Data Collection: The scraper will output detailed data about the specified YouTube video(s), including transcripts.
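The steps above can also be driven from a script instead of the Apify web interface. The sketch below assumes the official `apify-client` Python package; the helper name `run_and_collect` is hypothetical, and the API token and actor ID are placeholders to replace with your own values, since this page does not state the actor's ID.

```python
from typing import Any, Dict, List


def run_and_collect(client: Any, actor_id: str, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Start the actor, wait for the run to finish, and return its dataset items.

    `client` is expected to behave like `apify_client.ApifyClient`.
    """
    run = client.actor(actor_id).call(run_input=run_input)
    if not run:
        raise RuntimeError("The actor run did not return any run details")
    # Each run writes its results to a default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (needs `pip install apify-client`, a real token, and the real actor ID):
#
#   from apify_client import ApifyClient
#
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   items = run_and_collect(
#       client,
#       "<username>/<actor-name>",  # placeholder, copy the ID from the actor's page
#       {"startUrls": [{"url": "https://www.youtube.com/watch?v=qp0HIF3SfI4"}]},
#   )
```

Passing the client in as an argument keeps the helper easy to exercise with a stub object before spending platform credits on a real run.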

Input Configuration

Here's an example of how to set up the input for the YouTube Video Transcript Super Scraper:

{
  "startUrls": [
    {
      "url": "https://www.youtube.com/watch?v=K07bw2bKI8U&t=985s&ab_channel=CohhCarnage"
    }
  ],
  "maxConcurrency": 10,
  "minConcurrency": 1,
  "maxRequestRetries": 10,
  "proxy": {
    "useApifyProxy": true
  },
  "includeTranscript": true,
  "language": "en"
}

Input Fields Explanation

  • startUrls: Array of YouTube video URLs to scrape details from.
  • maxConcurrency: Maximum number of pages processed simultaneously (default: 10).
  • minConcurrency: Minimum number of pages processed simultaneously (default: 1).
  • maxRequestRetries: Number of retries for failed requests (default: 10).
  • proxy: Proxy configuration for enhanced scraping reliability.
  • includeTranscript: Whether to include the transcript of the video in the output.
  • language: Language code for the transcript (for example, "en"). If the requested language is not available, the scraper falls back to the default language.
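When the input is generated programmatically, a small builder keeps the field names and defaults in one place. The following is a sketch only: `build_input` is a hypothetical helper, the numeric defaults follow the list above, the remaining defaults simply mirror the example input, and the validation rules are assumptions rather than checks the actor is documented to perform.

```python
from typing import Any, Dict, Iterable


def build_input(
    urls: Iterable[str],
    language: str = "en",
    include_transcript: bool = True,
    max_concurrency: int = 10,
    min_concurrency: int = 1,
    max_request_retries: int = 10,
    use_apify_proxy: bool = True,
) -> Dict[str, Any]:
    """Build an input object shaped like the example input."""
    url_list = [u.strip() for u in urls if u and u.strip()]
    # Assumed sanity checks, not documented behaviour of the actor.
    if not url_list:
        raise ValueError("at least one video URL is required")
    if min_concurrency < 1 or min_concurrency > max_concurrency:
        raise ValueError("expected 1 <= minConcurrency <= maxConcurrency")
    if max_request_retries < 0:
        raise ValueError("maxRequestRetries must not be negative")

    return {
        "startUrls": [{"url": u} for u in url_list],
        "maxConcurrency": max_concurrency,
        "minConcurrency": min_concurrency,
        "maxRequestRetries": max_request_retries,
        "proxy": {"useApifyProxy": use_apify_proxy},
        "includeTranscript": include_transcript,
        "language": language,
    }
```

The returned dictionary can be serialized with `json.dumps` and pasted into the actor's JSON input editor, or passed directly as the run input when calling the actor through an API client.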

Output Structure

The output data includes detailed information about each video, with transcripts appearing first. Here's a sample of the structure:

[
  {
    "transcript": [
      {
        "text": "How do you explain\nwhen things don't go as we assume?",
        "startMs": 7490,
        "endMs": 10690,
        "startTimeText": "0:07"
      },
      {
        "text": "Or better, how do you explain when others are able to achieve things\nthat seem to defy all of the assumptions?",
        "startMs": 10723,
        "endMs": 16290,
        "startTimeText": "0:10"
      }
    ],
    "transcript_only_text": "How do you explain when things don't go as we assume? Or better, how do you explain when others are able to achieve things that seem to defy all of the assumptions? For example: Why is Apple so innovative?...",
    "videoId": "qp0HIF3SfI4",
    "title": "How Great Leaders Inspire Action | Simon Sinek | TED",
    "lengthSeconds": "1115",
    "channelId": "UCAuUUnT6oDeKwE6v1NGQxug",
    "isOwnerViewing": false,
    "shortDescription": "Visit http://TED.com to get our entire library of TED Talks...",
    "isCrawlable": true,
    "thumbnail": {
      "thumbnails": [
        {
          "url": "https://i.ytimg.com/vi/qp0HIF3SfI4/hqdefault.jpg",
          "width": 168,
          "height": 94
        }
      ]
    },
    "allowRatings": true,
    "viewCount": "21503849",
    "author": "TED",
    "isPrivate": false,
    "isUnpluggedCorpus": false,
    "isLiveContent": false,
    "microformat": { },
    "fullPlayerResponse": { }
  }
]
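Because each transcript segment carries `startMs` and `endMs`, the array maps naturally onto subtitle formats. The sketch below converts segments shaped like the sample into SubRip (SRT) text; the function names `ms_to_srt` and `transcript_to_srt` are made up for this example, and the conversion is not a feature of the actor itself.

```python
from typing import Any, Dict, List


def ms_to_srt(ms: int) -> str:
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(int(ms), 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def transcript_to_srt(segments: List[Dict[str, Any]]) -> str:
    """Turn transcript segments into numbered SRT cues."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{ms_to_srt(seg['startMs'])} --> {ms_to_srt(seg['endMs'])}"
        # Line breaks inside a segment's text are kept as subtitle line breaks.
        blocks.append(f"{index}\n{timing}\n{seg['text'].strip()}")
    return "\n\n".join(blocks) + "\n" if blocks else ""


sample = [
    {"text": "How do you explain\nwhen things don't go as we assume?",
     "startMs": 7490, "endMs": 10690, "startTimeText": "0:07"},
]
print(transcript_to_srt(sample))
```

For the first sample segment, the timing line reads `00:00:07,490 --> 00:00:10,690`.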

Output Fields

  • transcript (array): Complete timestamped transcript with text, start/end times in milliseconds, and formatted time strings
  • transcript_only_text (string): Full transcript as plain text without timestamps
  • videoId: Unique YouTube video identifier
  • title: Video title
  • lengthSeconds: Video duration in seconds, returned as a string (for example, "1115")
  • channelId: YouTube channel ID
  • isOwnerViewing: Whether the owner is viewing
  • shortDescription: Video description text
  • isCrawlable: Whether the video is crawlable
  • thumbnail: Object containing thumbnail URLs in various sizes
  • allowRatings: Whether ratings are enabled
  • viewCount: Total view count, returned as a string (for example, "21503849")
  • author: Channel name
  • isPrivate: Whether the video is private
  • isUnpluggedCorpus: Unplugged corpus status
  • isLiveContent: Whether it's live content
  • microformat: Complete microformat metadata
  • fullPlayerResponse: Complete YouTube player response data
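In the sample output, `lengthSeconds` and `viewCount` arrive as strings, so downstream analysis usually needs a normalization step. Here is a minimal sketch of one; the helper names `_to_int` and `summarize_item` and the choice of summary fields are assumptions for illustration, not part of the actor.

```python
from typing import Any, Dict, Optional


def _to_int(value: Any) -> Optional[int]:
    """Convert a numeric string such as "1115" to int; return None if not possible."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def summarize_item(item: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce one output item to a flat record with numeric fields as integers."""
    thumbs = (item.get("thumbnail") or {}).get("thumbnails") or []
    # Pick the thumbnail with the largest pixel area, if any are present.
    largest = max(
        thumbs,
        key=lambda t: t.get("width", 0) * t.get("height", 0),
        default=None,
    )
    return {
        "videoId": item.get("videoId"),
        "title": item.get("title"),
        "author": item.get("author"),
        "lengthSeconds": _to_int(item.get("lengthSeconds")),
        "viewCount": _to_int(item.get("viewCount")),
        "thumbnailUrl": largest.get("url") if largest else None,
        "hasTranscript": bool(item.get("transcript")),
    }
```

Missing or malformed values become `None` instead of raising, which keeps a batch from failing on a single incomplete item.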

Our Scraper Collection

Explore our comprehensive suite of social media scrapers designed for data extraction and analysis.

🔹 Instagram Scrapers

🔹 TikTok Scraper

🔹 YouTube Scrapers

About Muhammad Noman Riaz

Developed by Muhammad Noman Riaz, a dedicated developer passionate about creating powerful web scraping solutions. My user base has been growing steadily, and I'm committed to releasing more robust scrapers every single week.

Explore my full collection of scrapers and stay tuned for new releases!

License

This project is licensed under the Apache-2.0 License.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

For support, please open an issue on GitHub or contact the maintainer.