YouTube Video Transcript

Pricing

$5.00 / 1,000 results

Developed by

starvibe

Maintained by Community

This Apify Actor extracts full transcripts (with timestamps) and metadata from YouTube videos, including title, description, upload date, views, likes, channel info, and duration.

5.0 (9)

Last modified

a day ago

YouTube Video Transcript and Metadata Extractor

Overview

This Apify Actor fetches the transcript (subtitles) and rich metadata (title, channel, view count, and more) for a YouTube video. It is aimed at researchers, content creators, and analysts who need structured video data without writing their own scraper. Results are saved as structured JSON straight to your Apify dataset.

Input

Provide a YouTube video URL and, optionally, a transcript language. The Actor accepts the following fields, defined in input_schema.json:

| Field | Type | Description | Required |
| --- | --- | --- | --- |
| youtube_url | String | The URL of the YouTube video (e.g., https://www.youtube.com/watch?v=dQw4w9WgXcQ). | Yes |
| language | String | The transcript language (e.g., en). | No |

Example Input:

{
  "youtube_url": "https://www.youtube.com/watch?v=6EeDKyS7pV8"
}
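
One way to submit this input from a script is the Apify API's synchronous run endpoint. The sketch below only builds the HTTP request and does not send it. The Actor ID `starvibe~youtube-video-transcript` is an assumption (confirm the real one on the Actor's API tab), and `build_run_request` is a hypothetical helper name, not part of the Actor.

```python
import json
import urllib.request

# Assumed Actor ID in "username~actor-name" form; confirm it on the Actor's API tab.
ACTOR_ID = "starvibe~youtube-video-transcript"


def build_run_request(token, youtube_url, language=None, actor_id=ACTOR_ID):
    """Build (but do not send) a POST request that runs the Actor
    synchronously and returns its dataset items."""
    if not youtube_url:
        raise ValueError("youtube_url is required")

    run_input = {"youtube_url": youtube_url}
    if language:  # optional field, e.g. "en"
        run_input["language"] = language

    url = f"https://api.apify.com/v2/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_run_request(
        token="YOUR_APIFY_TOKEN",
        youtube_url="https://www.youtube.com/watch?v=6EeDKyS7pV8",
    )
    print(request.full_url)
    print(request.data.decode("utf-8"))
    # To actually run the Actor (needs network access and is billed per result):
    # with urllib.request.urlopen(request) as response:
    #     items = json.load(response)
```

Keeping request construction separate from sending makes it easy to inspect the payload before spending any credits.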

Output

The Actor saves data to the Apify dataset in a structured JSON format, as defined in dataset_schema.json. The output includes:

| Field | Type | Description |
| --- | --- | --- |
| video_id | String | The unique identifier of the YouTube video. |
| title | String | The title of the YouTube video. |
| channel_name | String | The name of the YouTube channel. |
| channel_id | String | The unique identifier of the YouTube channel. |
| timestamp | Integer | The Unix timestamp of when the video was published. |
| published_at | String | The UTC date and time of video upload (format: YYYY-MM-DDTHH:MM:SSZ). |
| view_count | Integer | The number of views the video has received. |
| transcript | Array | An array of transcript segments, each with text, start, duration, language, and language_code. |
| url | String | The URL of the YouTube video. |
| language | String | The primary language of the transcript (e.g., en for English). |
| duration_seconds | Integer | The total duration of the video in seconds. |
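
Since `timestamp` and `published_at` describe the same publication moment, a consumer can cross-check them. The following is a minimal sketch, assuming `published_at` is always UTC with a trailing `Z`; the helper names are illustrative and not part of the Actor.

```python
from datetime import datetime, timezone


def parse_published_at(value):
    """Parse published_at into an aware UTC datetime.

    Accepts either a space or a "T" between the date and the time,
    to be lenient about the exact separator.
    """
    normalized = value.strip().replace(" ", "T")
    parsed = datetime.strptime(normalized, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def timestamps_agree(item):
    """Return True when timestamp and published_at name the same instant."""
    from_unix = datetime.fromtimestamp(item["timestamp"], tz=timezone.utc)
    return from_unix == parse_published_at(item["published_at"])


if __name__ == "__main__":
    item = {"timestamp": 1559575676, "published_at": "2019-06-03T15:27:56Z"}
    print(timestamps_agree(item))
```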

Sample Output (TED Talk Video in English):

[
  {
    "channel_id": "UCAuUUnT6oDeKwE6v1NGQxug",
    "channel_name": "TED",
    "comment_count": 10953,
    "duration_seconds": 1159,
    "language": "en",
    "like_count": 388218,
    "timestamp": 1559575676,
    "title": "Sleep Is Your Superpower | Matt Walker | TED",
    "transcript": [
      {
        "duration": 1.5,
        "language": "English (auto-generated)",
        "language_code": "en",
        "start": 0.83,
        "text": "Thank you very much."
      },
      {
        "duration": 4,
        "language": "English (auto-generated)",
        "language_code": "en",
        "start": 2.37,
        "text": "Well, I would like\nto start with testicles."
      },
      ...
      {
        "duration": 1.17,
        "language": "English (auto-generated)",
        "language_code": "en",
        "start": 1141.58,
        "text": "Great job, Matt."
      },
      {
        "duration": 2.21,
        "language": "English (auto-generated)",
        "language_code": "en",
        "start": 1142.75,
        "text": "MW: You're very welcome.\nThank you very much."
      }
    ],
    "published_at": "2019-06-03T15:27:56Z",
    "url": "https://www.youtube.com/watch?v=5MuIMqhT8DM",
    "video_id": "5MuIMqhT8DM",
    "view_count": 15062630,
    "geo_restrict": null,
    "status": "success",
    "message": "Successfully fetched the transcript for the video with ID '5MuIMqhT8DM'"
  }
]
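
A dataset item shaped like the sample above can be turned into readable, timestamped lines. This is a minimal sketch with hypothetical helper names; it assumes that any `status` other than "success" means no transcript is available, which the page does not state explicitly.

```python
def format_timestamp(seconds):
    """Format a start offset in seconds as HH:MM:SS (fractions truncated)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def transcript_lines(item):
    """Return one '[HH:MM:SS] text' line per transcript segment."""
    # Assumption: a status other than "success" means the fetch failed.
    if item.get("status") != "success":
        raise ValueError(item.get("message", "transcript not available"))
    lines = []
    for segment in item["transcript"]:
        # Segment text may contain line breaks; collapse them to single spaces.
        text = " ".join(segment["text"].split())
        lines.append(f"[{format_timestamp(segment['start'])}] {text}")
    return lines


if __name__ == "__main__":
    item = {
        "status": "success",
        "transcript": [
            {"start": 0.83, "duration": 1.5, "text": "Thank you very much."},
            {"start": 1142.75, "duration": 2.21,
             "text": "MW: You're very welcome.\nThank you very much."},
        ],
    }
    print("\n".join(transcript_lines(item)))
```

Joining the returned lines with newlines gives a plain-text transcript; dropping the timestamp prefix gives text suitable for search or summarization.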

Why Use This Actor?

  • Efficiency: Get a video's transcript and metadata in seconds from a single URL.
  • Flexibility: Returns structured JSON that any downstream tool or API can consume.
  • Reliability: Built-in validation and logging ensure robust performance.
  • Scalability: Easily integrate with Apify's ecosystem for data processing and export.