Youtube Metadata Scraper (Transcripts Included 😋)

Pricing: $5.00/month + usage

Developed by tolu.

Maintained by Community

Introducing the most comprehensive and robust YouTube Metadata Scraper on Apify, built for videos and shorts. Get detailed metadata, including title, description, video length, tags, like count, view count, comment count, and even full transcripts.

Rating: 5.0 (1)
Total users: 25
Monthly users: 12
Runs succeeded: >99%
Last modified: 5 days ago

Introducing the most comprehensive and robust YouTube metadata scraper on Apify. Get video details, channel details, engagement statistics, transcripts, and more from YouTube videos and Shorts through a single interface.

😋 Features 😋

  • Extract complete video metadata including video title, description, thumbnail, channel details, engagement stats and full transcripts.
  • Handle both video and Shorts URLs through a single interface.
  • Customize transcript options in the input settings, including transcript format and languages.
  • Access detailed timestamps, including start times and time ranges for each transcript segment.

👩‍🍳 Input Parameters 👩‍🍳

| Parameter | Type | Description | Default Value |
| --- | --- | --- | --- |
| startURLs | array | At least one YouTube URL (video and Shorts URLs are supported). The maximum number of URLs is 1000. | - |
| extractTranscript | boolean | If selected, includes the transcript in the result. | true |
| transcriptFormat | raw or timestamp | If raw, returns the transcript as a string without timestamp information. If timestamp, includes timestamp information for each transcript segment. | timestamp |
| includeEnglishAG | boolean | If selected, includes the English (auto-generated) option in the transcripts. | false |
| includeNonEnglish | boolean | If selected, includes non-English languages in the transcripts. | false |
| proxy | object | Apify's proxy configuration. Choose RESIDENTIAL proxies for reliable runs. | RESIDENTIAL |
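
As an illustration of how these parameters fit together, here is a minimal Python sketch that builds an input dictionary with the names and defaults from the table and checks the two stated constraints (1 to 1000 URLs, and a transcript format of raw or timestamp). The helper name `build_input` is hypothetical. The sketch assumes that `startURLs` takes plain URL strings and that `proxy` follows the usual Apify proxy shape (`useApifyProxy`, `apifyProxyGroups`); neither is confirmed by this page, so check the Actor's input schema before relying on it.

```python
from typing import Any

MAX_URLS = 1000  # limit stated in the parameter table
TRANSCRIPT_FORMATS = {"raw", "timestamp"}


def build_input(
    start_urls: list[str],
    extract_transcript: bool = True,
    transcript_format: str = "timestamp",
    include_english_ag: bool = False,
    include_non_english: bool = False,
) -> dict[str, Any]:
    """Build an input dictionary (hypothetical helper, not part of the Actor)."""
    if not 1 <= len(start_urls) <= MAX_URLS:
        raise ValueError(f"startURLs needs between 1 and {MAX_URLS} URLs")
    if transcript_format not in TRANSCRIPT_FORMATS:
        raise ValueError("transcriptFormat must be 'raw' or 'timestamp'")
    return {
        # Assumption: URLs are passed as plain strings.
        "startURLs": list(start_urls),
        "extractTranscript": extract_transcript,
        "transcriptFormat": transcript_format,
        "includeEnglishAG": include_english_ag,
        "includeNonEnglish": include_non_english,
        # Assumption: the usual Apify proxy input shape.
        "proxy": {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]},
    }


run_input = build_input(
    ["https://www.youtube.com/watch?v=4KbrxIpQgkM"],
    transcript_format="raw",
)
```

The resulting dictionary could then be supplied as the run input, for example through the Apify Console's JSON editor or an Apify API client.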

🍖 Output Example 🍖

In this example, includeEnglishAG and includeNonEnglish were set to false and transcriptFormat was set to raw:

{
  "id": "4KbrxIpQgkM",
  "url": "https://www.youtube.com/watch?v=4KbrxIpQgkM",
  "title": "Nothing Phone 3 Review: They Lied!",
  "description": "So you want to be a flagship?\n\nNothing Phone 3: https://geni.us/Lv2Fcd\n\nMKBHD Merch: http://shop.MKBHD.com\n\nPlaylist of MKBHD Intro music: https://goo.gl/B3AWV5\n\nPhone provided by Nothing for review.\n\n~\nhttp://twitter.com/MKBHD\nhttp://instagram.com/MKBHD\nhttp://facebook.com/MKBHD",
  "lengthInSeconds": 938,
  "uploadDatetime": "2025-07-08T13:00:59+00:00",
  "category": "Science & Technology",
  "tags": [
    "Nothing Phone",
    "Nothing Phone 3",
    "Nothing Phone 3 review",
    "MKBHD",
    "Nothing phone MKBHD",
    "Nothing Phone 3 review MKBHD",
    "Nothing 3",
    "Nothing 3 vs",
    "Nothing phone 3 vs",
    "flagship",
    "true flagship",
    "glyph",
    "glyph matrix"
  ],
  "thumbnail": "https://i.ytimg.com/vi/4KbrxIpQgkM/hqdefault.jpg",
  "channelID": "UCBJycsmduvYEL83R_U4JriQ",
  "channelURL": "http://www.youtube.com/@mkbhd",
  "channelUsername": "mkbhd",
  "channelDisplayName": "Marques Brownlee",
  "channelSubscribers": "20.1M",
  "viewCount": 2493676,
  "likeCount": 81093,
  "commentCount": 6785,
  "transcripts": [
    {
      "language": "English",
      "content": "(funky upbeat music) - This is the Nothing Phone 3 and yeah, it's ugly. (funky music) You know, people say beauty is in the eye of the beholder and stuff like that. I get it. You can find\nbeauty in anything, for sure. But also, you know,\nthere's a nice alignment to things that feels good. And you know, when things are, when things are where\nyou expect them to be, doesn't that feel better? Okay. All right, carcinisation. I learned about this recently..."
    }
  ]
}
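
To show how a result item like the one above might be consumed, the following Python sketch derives a few convenience values from it: a human-readable duration, a parsed upload time, and a like-to-view ratio. The function name `summarize` and the derived key names are hypothetical; only the input field names come from the example output, and the sample item is trimmed to the fields the function reads.

```python
from datetime import datetime


def summarize(item: dict) -> dict:
    """Derive a few convenience values from one result item (illustrative only)."""
    minutes, seconds = divmod(item["lengthInSeconds"], 60)
    views = item["viewCount"]
    return {
        "id": item["id"],
        # Minutes are not wrapped into hours, so a 75-minute video gives "75:00".
        "duration": f"{minutes}:{seconds:02d}",
        # uploadDatetime is ISO 8601 with a UTC offset, which fromisoformat parses.
        "uploaded": datetime.fromisoformat(item["uploadDatetime"]),
        "likeRate": item["likeCount"] / views if views else 0.0,
        "transcriptLanguages": [t["language"] for t in item.get("transcripts", [])],
    }


# Trimmed copy of the example item shown above.
item = {
    "id": "4KbrxIpQgkM",
    "lengthInSeconds": 938,
    "uploadDatetime": "2025-07-08T13:00:59+00:00",
    "viewCount": 2493676,
    "likeCount": 81093,
    "transcripts": [{"language": "English", "content": "(funky upbeat music) ..."}],
}

summary = summarize(item)
print(summary["duration"], f"{summary['likeRate']:.2%}")
```

Note that channelSubscribers arrives as an abbreviated string ("20.1M") rather than a number, so it is left out of the arithmetic here.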

Also, here is an excerpt of the transcript when transcriptFormat is set to timestamp:

{
  ...,
  "transcripts": [
    {
      "language": "English",
      "content": [
        {
          "startMs": 291,
          "endMs": 3374,
          "startTime": "0:00",
          "text": "(funky upbeat music)"
        },
        {
          "startMs": 18600,
          "endMs": 21390,
          "startTime": "0:18",
          "text": "- This is the Nothing Phone 3"
        },
        {
          "startMs": 21390,
          "endMs": 23974,
          "startTime": "0:21",
          "text": "and yeah, it's ugly."
        },
        ...
      ]
    }
  ]
}
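
The startMs and endMs fields make it straightforward to turn timestamped segments into other formats. As a rough sketch, the Python below converts the three segments from the excerpt into SubRip (SRT) subtitle blocks and also joins them back into plain text. The function names are hypothetical, and the joined text only approximates what the raw format returns.

```python
def ms_to_srt_time(ms: int) -> str:
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render transcript segments as numbered SRT blocks."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = ms_to_srt_time(seg["startMs"])
        end = ms_to_srt_time(seg["endMs"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text']}")
    return "\n\n".join(blocks) + "\n"


def segments_to_text(segments: list[dict]) -> str:
    """Join segment texts with spaces into one string."""
    return " ".join(seg["text"] for seg in segments)


# The three segments from the excerpt above.
segments = [
    {"startMs": 291, "endMs": 3374, "startTime": "0:00", "text": "(funky upbeat music)"},
    {"startMs": 18600, "endMs": 21390, "startTime": "0:18", "text": "- This is the Nothing Phone 3"},
    {"startMs": 21390, "endMs": 23974, "startTime": "0:21", "text": "and yeah, it's ugly."},
]

print(segments_to_srt(segments))
```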

💪 Good to Know 💪

  • If 10 URLs fail consecutively during processing, the Actor run will terminate automatically, and the remaining URLs will not be processed.
  • The run state is saved during ABORT (graceful) and MIGRATION events, so it's safe to resurrect the run afterward. However, this is not the case for TIMEOUT events (see the Limitations section).
    It's possible that a few URLs may be processed after the state has been saved but before the event actually occurs. These URLs will be re-processed when the run is resurrected.
  • You can find a run summary (RUN_SUMMARY) in the key-value store after the run completes. The URLs are divided into 4 categories:
    • succeeded - the URL was successfully processed, and the result is available in the dataset.
    • failed - the URL failed during processing.
    • unavailable - the URL was processed but the video page is unavailable (e.g. the video is private or no longer exists).
    • unprocessed - the URL was not processed because the run exited early after too many consecutive failures.
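
One way to use the run summary is to build the URL list for a follow-up run from the failed and unprocessed categories. The Python sketch below assumes that RUN_SUMMARY is a JSON object mapping each of the four category names to a list of URL strings; that shape is a guess, since the exact structure is not documented here. The function name and the placeholder URLs are hypothetical.

```python
def urls_to_retry(run_summary: dict) -> list[str]:
    """Collect failed and unprocessed URLs, keeping order and dropping duplicates.

    Assumption: each category maps to a list of URL strings.
    """
    retry: list[str] = []
    seen: set[str] = set()
    for category in ("failed", "unprocessed"):
        for url in run_summary.get(category, []):
            if url not in seen:
                seen.add(url)
                retry.append(url)
    return retry


# Hypothetical summary with placeholder video IDs.
run_summary = {
    "succeeded": ["https://www.youtube.com/watch?v=AAAAAAAAAAA"],
    "failed": ["https://www.youtube.com/watch?v=BBBBBBBBBBB"],
    "unavailable": ["https://www.youtube.com/watch?v=CCCCCCCCCCC"],
    "unprocessed": [
        "https://www.youtube.com/watch?v=DDDDDDDDDDD",
        "https://www.youtube.com/watch?v=BBBBBBBBBBB",
    ],
}

print(urls_to_retry(run_summary))
```

Unavailable URLs are deliberately left out, because a private or deleted video is unlikely to succeed on a second attempt.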

🚫 Limitations 🚫

  • The Actor will most likely fail if RESIDENTIAL proxies are not used.
  • Apify does not provide built-in support for saving state in the event of a timeout. However, the default 3-hour limit should be sufficient for any run. If a timeout does occur, do not attempt to resurrect the run, as it will restart from the beginning. Instead, start a new run with the URLs that were not processed.
    The ABORT (graceful) and MIGRATION events are handled properly, so it's safe to resurrect the run after these events.
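
After a timeout, the URLs for the new run can be worked out by comparing the original startURLs with the items already in the dataset. Because a Shorts URL and the watch URL in the output can differ for the same video, this Python sketch matches on the video ID instead of the raw URL. The helper names are hypothetical, the URL patterns handled (watch?v=, /shorts/, youtu.be) are an assumption about what the input contains, and the Shorts ID in the sample is a placeholder.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID for common YouTube URL shapes, or None if unrecognized."""
    parsed = urlparse(url)
    if parsed.netloc.lower() == "youtu.be":
        return parsed.path.strip("/") or None
    if parsed.path == "/watch":
        return parse_qs(parsed.query).get("v", [None])[0]
    if parsed.path.startswith("/shorts/"):
        return parsed.path.split("/")[2] or None
    return None


def remaining_urls(start_urls: list[str], dataset_items: list[dict]) -> list[str]:
    """Return the start URLs whose video ID is not yet present in the dataset."""
    done = {item["id"] for item in dataset_items}
    remaining = []
    for url in start_urls:
        video_id = extract_video_id(url)
        # Keep URLs whose ID could not be parsed so nothing is silently dropped.
        if video_id is None or video_id not in done:
            remaining.append(url)
    return remaining


start_urls = [
    "https://www.youtube.com/watch?v=4KbrxIpQgkM",
    "https://www.youtube.com/shorts/XXXXXXXXXXX",  # placeholder ID
]
dataset_items = [{"id": "4KbrxIpQgkM"}]  # items already saved before the timeout

print(remaining_urls(start_urls, dataset_items))
```

Videos that were found to be unavailable do not appear in the dataset, so this approach will include them in the new run as well.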

This scraper is under active development. If you have suggestions or feature requests, or encounter any issues, feel free to: