YouTube Scraper

Pricing

from $0.50 / 1,000 videos (listing only)

⚡ Every YouTube field in one Actor — videos, channels, playlists, search, Shorts, comments, subtitles, hashtags — at 200+ videos/sec. Chrome TLS fingerprint, rotating residential IPs, full channel metadata on every row. Zero blocks, zero CAPTCHAs.

Developer

VortexData

Maintained by Community

Actor stats

Bookmarked: 1
Total users: 5
Monthly active users: 2
Last modified: 21 hours ago

🎬 YouTube Data Collector

Paste a YouTube video, channel, playlist, search URL, hashtag, video ID, channel ID, playlist ID, @handle, or keyword. Choose the result package you want and a size. The Actor detects every input automatically and separates the output into clean datasets.

🚀 Quick Start

  1. Paste one or more items into 📥 Paste what you have.
  2. Keep 🎯 Choose result package on 🧲 Complete if you simply want everything from each input.
  3. Pick 📦 Choose size.
  4. Run the Actor.

Most users only need the three fields from steps 1 to 3. You never need to specify whether an input is a video, channel, playlist, or search: source detection is automatic.
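The three Quick Start fields correspond to the `targets`, `scenario`, and `runSize` keys used in the example inputs in this README. The following is a minimal sketch of assembling such an input in Python; the helper name `build_run_input` is hypothetical, and the allowed values are simply the ones that appear in this README, not an authoritative schema.

```python
# Scenario and size values as they appear in this README's examples and tables.
SCENARIOS = {
    "complete", "quick_metadata", "research_insights", "comments_monitoring",
    "transcripts_for_ai", "creator_leads", "media_audit",
}
RUN_SIZES = {"test", "small", "medium", "large", "unlimited"}


def build_run_input(targets, scenario="complete", run_size="small"):
    """Build an input object from the three Quick Start fields (hypothetical helper)."""
    # Drop empty or whitespace-only entries so a stray blank line is not sent.
    cleaned = [t.strip() for t in targets if t and t.strip()]
    if not cleaned:
        raise ValueError("at least one target is required")
    if scenario not in SCENARIOS:
        raise ValueError(f"unknown scenario: {scenario!r}")
    if run_size not in RUN_SIZES:
        raise ValueError(f"unknown runSize: {run_size!r}")
    return {"targets": cleaned, "scenario": scenario, "runSize": run_size}
```

Validating locally like this catches a misspelled package or size before a run is started.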

🎯 Result Packages

| Package | Use when | Collects |
| --- | --- | --- |
| 🧲 Complete | You want the full dataset from whatever you pasted | Videos, channels, playlists, Shorts, streams, posts, transcripts, comments, replies, contacts, insights, media, diagnostics |
| ⚡ Quick | You want a fast clean table first | Lightweight metadata only, without comments, transcripts, media expansion, or contact crawling |
| 🧠 Research | You compare topics, keywords, and engagement | Search across videos, channels, and playlists with content insights and enrichment |
| 💬 Comments | You monitor audience feedback | Flat comments and replies, newest/top sorting, continuation tokens, dedupe across scheduled runs |
| 📧 Leads | You build creator prospect lists | Public emails, websites, social links, contact links, external-page enrichment, and validation |
| 📝 Transcripts | You build AI/RAG datasets | Transcript text, timestamped segments, language fallback, match terms, and compact transcript output |
| 🎞️ Media | You audit technical/video extras | Media formats, chapters, SponsorBlock segments, related videos, live chat/replay when public |

📥 Accepted Inputs

You can mix input types in the same run.

| Input | Example |
| --- | --- |
| Video URL | https://www.youtube.com/watch?v=wwSzpaTHyS8 |
| Shorts URL | https://www.youtube.com/shorts/... |
| Video ID | dQw4w9WgXcQ |
| Channel handle | @NASA |
| Channel ID | UC... |
| Playlist URL | https://www.youtube.com/playlist?list=... |
| Search URL | https://www.youtube.com/results?search_query=... |
| Hashtag | https://www.youtube.com/hashtag/apify |
| Keyword | web scraping tutorial |
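To give a feel for what automatic source detection involves, here is an illustrative classifier for the input types listed above. It is not the Actor's actual detection logic: the function name, the returned labels, and the ID patterns (11-character video IDs, 24-character channel IDs starting with `UC`) are assumptions made for this sketch, and playlist IDs are left out.

```python
import re

# Assumed ID shapes: 11-character video IDs, "UC" plus 22 characters for channels.
VIDEO_ID = re.compile(r"^[A-Za-z0-9_-]{11}$")
CHANNEL_ID = re.compile(r"^UC[A-Za-z0-9_-]{22}$")


def classify_target(target: str) -> str:
    """Guess the input type of one target string (illustrative only)."""
    t = target.strip()
    if t.startswith(("http://", "https://")):
        # URL inputs are told apart by their path.
        if "/watch?" in t:
            return "video_url"
        if "/shorts/" in t:
            return "shorts_url"
        if "/playlist?" in t:
            return "playlist_url"
        if "/results?" in t:
            return "search_url"
        if "/hashtag/" in t:
            return "hashtag_url"
        return "other_url"
    if t.startswith("@"):
        return "channel_handle"
    if CHANNEL_ID.match(t):
        return "channel_id"
    if VIDEO_ID.match(t):
        return "video_id"
    # Anything else is treated as a search keyword.
    return "keyword"
```

A single-word keyword that happens to be exactly 11 characters long would be misread as a video ID by this sketch, which is one reason a real detector needs more than pattern matching.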

📦 Size Presets

| Size | Best for | Default limits |
| --- | --- | --- |
| test | Fast validation | Tiny sample, minimal comments and replies |
| small | First real run | Recommended first production run |
| medium | Research dataset | More items, comments, replies, and enrichment |
| large | Big export | Large but capped export |
| unlimited | Maximum collection | No global result-row cap |

Advanced API users can still override exact limits with hidden fields such as maxResults, maxItems, maxComments, and maxRepliesPerComment.
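As a sketch of how an API caller might layer exact limits on top of a size preset, the helper below merges the override fields named above into an input object. The helper name `with_limit_overrides` is hypothetical, and treating the limits as non-negative integers is an assumption; the README does not document their exact types or ranges.

```python
# Override fields named in this README; other hidden fields may exist.
OVERRIDE_FIELDS = ("maxResults", "maxItems", "maxComments", "maxRepliesPerComment")


def with_limit_overrides(run_input: dict, **limits: int) -> dict:
    """Return a copy of run_input with explicit limit overrides added."""
    merged = dict(run_input)  # leave the caller's dict untouched
    for name, value in limits.items():
        if name not in OVERRIDE_FIELDS:
            raise KeyError(f"not a known override field: {name}")
        # Assumption: limits are non-negative integers.
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")
        merged[name] = value
    return merged
```

For example, `with_limit_overrides({"targets": ["@NASA"], "runSize": "small"}, maxComments=50)` keeps the preset and adds a single explicit cap.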

📤 Output

The Actor writes separate datasets so the Apify Output tab stays readable. Videos is the first output and contains only video rows, so channel, playlist, post, comment, and diagnostic rows never show up as stray rows inside a video table.

| Dataset | Contains |
| --- | --- |
| Videos | Video, Shorts, stream, and movie rows only |
| Channels | Channel profiles from channel inputs and channel search results |
| Playlists / search | Playlist, show, and other non-video search entity rows |
| Posts | Community post rows from channel Community/Posts tabs |
| Transcripts | Rows with transcript text, subtitle status, or transcript keyword matches |
| Contacts | Rows with public emails, websites, socials, or contact links |
| Media | Rows with media formats, SponsorBlock, chapters, live chat, or related videos |
| Comments | One flat row per comment or reply |
| Diagnostics | Invalid inputs, unavailable videos, empty sources, and source failures |
| All records | Compatibility dataset with every non-comment, non-diagnostic main row |

The same clean row can appear in more than one focused dataset. For example, a video with transcript and media data appears in Videos, Transcripts, and Media; it is still one billed main result.
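Because one row can show up in several focused datasets, a consumer that downloads more than one of them may want to collapse the copies again. A minimal sketch, assuming the canonical `url` field uniquely identifies a main row (the README does not state this) and using a hypothetical helper name:

```python
def unique_main_rows(*datasets):
    """Merge rows from several focused datasets, keeping one row per url."""
    merged = {}
    for rows in datasets:
        for row in rows:
            key = row.get("url")
            if key is None:
                continue  # rows without a url cannot be matched up
            # Later copies add any fields the earlier copy lacked.
            merged.setdefault(key, {}).update(row)
    return list(merged.values())
```

Counting the result of `unique_main_rows(videos, transcripts, media)` gives the number of distinct main rows rather than the sum of the three dataset sizes.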

🧾 Canonical Fields

| Meaning | Field |
| --- | --- |
| Video/channel/search URL | url |
| Source page that produced the item | sourceUrl |
| Views | views |
| Likes | likes |
| Subscribers | subscribers |
| Video description | description |
| Comment text | text (in the Comments dataset) |
| Transcript text | transcriptText |
| Creator emails | emails |

Legacy raw aliases can still be returned with the hidden includeRawFields API option.
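The canonical names make it straightforward to project rows onto a fixed set of columns. The sketch below writes video rows to CSV using only canonical fields from the list above; the column selection and the helper name `rows_to_csv` are choices made for this example, and rows are assumed to be plain dictionaries.

```python
import csv
import io

# A subset of the canonical fields listed above, chosen for this example.
VIDEO_COLUMNS = ["url", "sourceUrl", "views", "likes", "description"]


def rows_to_csv(rows):
    """Render rows as CSV text, keeping only the canonical video columns."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=VIDEO_COLUMNS,
        extrasaction="ignore",  # silently drop non-canonical keys
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)  # missing keys are written as empty cells
    return buf.getvalue()
```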

🧩 Example Inputs

Complete mixed run

{
  "targets": [
    "https://www.youtube.com/watch?v=wwSzpaTHyS8",
    "@NASA",
    "web scraping tutorial"
  ],
  "scenario": "complete",
  "runSize": "test"
}

Everything from a video

{
  "targets": ["https://www.youtube.com/watch?v=wwSzpaTHyS8"],
  "scenario": "complete",
  "runSize": "small"
}

Everything from a channel

{
  "targets": ["@NASA"],
  "scenario": "complete",
  "runSize": "small"
}

Fast metadata only

{
  "targets": ["@NASA", "web scraping tutorial"],
  "scenario": "quick_metadata",
  "runSize": "small"
}

Research keywords and engagement

{
  "targets": ["web scraping tutorial"],
  "scenario": "research_insights",
  "runSize": "small"
}

Comment monitoring

{
  "targets": ["https://www.youtube.com/watch?v=wwSzpaTHyS8"],
  "scenario": "comments_monitoring",
  "runSize": "small",
  "stateKey": "my-video-comments"
}
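The `stateKey` ties scheduled runs together so comments are deduplicated across them. How the Actor stores that state is not documented here; the sketch below only illustrates the general idea with a set of already-seen IDs. The field name `commentId` and the helper name are hypothetical.

```python
def new_comments(rows, seen_ids, id_field="commentId"):
    """Return rows whose ID has not been seen before, updating seen_ids in place.

    id_field is a hypothetical field name; check the Comments dataset for the real one.
    """
    fresh = []
    for row in rows:
        comment_id = row.get(id_field)
        if comment_id is None or comment_id in seen_ids:
            continue  # skip rows without an ID and rows from earlier runs
        seen_ids.add(comment_id)
        fresh.append(row)
    return fresh
```

Persisting `seen_ids` between runs, for instance under a key such as `my-video-comments`, is what lets a later run report only comments it has not returned before.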

AI transcripts

{
  "targets": ["https://www.youtube.com/watch?v=wwSzpaTHyS8"],
  "scenario": "transcripts_for_ai",
  "runSize": "small",
  "subtitlesLanguages": ["en", "any"],
  "subtitlesFormat": "plaintext"
}

Creator leads

{
  "targets": ["@NASA", "ai automation agency"],
  "scenario": "creator_leads",
  "runSize": "small",
  "emailValidation": "syntax"
}

Media audit

{
  "targets": ["https://www.youtube.com/watch?v=wwSzpaTHyS8"],
  "scenario": "media_audit",
  "runSize": "small"
}
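Any of the input objects above can also be submitted from code. The following sketch assumes the `apify-client` Python package, whose `ActorClient.call` and `DatasetClient.iterate_items` methods are used here; the Actor ID and token are placeholders you must supply, and only the run's default dataset is read because this README does not say how the focused datasets are addressed through the API.

```python
def run_and_collect(client, actor_id, run_input):
    """Start the Actor, wait for it to finish, and return default-dataset rows.

    client is expected to be an apify_client.ApifyClient instance.
    """
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not finish")
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())


# Usage (placeholders, not real values):
#   from apify_client import ApifyClient
#   client = ApifyClient("<APIFY_TOKEN>")
#   rows = run_and_collect(
#       client,
#       "<username>/<actor-name>",
#       {"targets": ["@NASA"], "scenario": "quick_metadata", "runSize": "test"},
#   )
```

Passing the client in as an argument keeps the function easy to exercise with a stub before spending credits on a real run.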

🧠 Feature Coverage

| Feature | Status |
| --- | --- |
| Video metadata | Included |
| Channel about data | Included |
| Channel profile output | Included |
| Playlists | Included |
| Search results | Included |
| Shorts | Included |
| Streams | Included |
| Community posts | Included |
| Comments and replies | Included |
| Continuation tokens | Included |
| Dedupe across scheduled runs | Included |
| Transcripts and captions | Included |
| Language fallback | Included |
| Transcript search | Included |
| SponsorBlock segments | Included |
| Related videos | Included |
| Chapters | Included |
| Most-replayed heatmap | Included |
| Music credits | Included |
| Creator contacts | Included |
| Content insights | Included |
| Live chat and replay chat | Best effort when YouTube exposes public continuation |
| Media format summaries | Included |
| Entity-separated output datasets | Included |
| Large payload KVS fallback | Included |
| Diagnostics dataset | Included |

🛠️ API Compatibility

The old jobPreset field and legacy scenario values such as auto, video_full, channel_full, playlist_full, and search_full are still accepted for existing integrations. New Console runs should use the visible result packages in scenario.

Legacy inputs such as startUrls, searchQueries, youtubeHandles, videoIds, channelIds, playlistIds, and extraData still work.

✅ Reliability

The Actor uses Chrome-like TLS requests through curl_cffi, Apify Residential Proxy on Apify Cloud, per-request proxy sessions, retry handling for transient YouTube responses, KVS fallback for large payloads, and separate diagnostics output for invalid or unavailable sources.
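Retry handling for transient responses is commonly built as exponential backoff with jitter. The sketch below shows that general pattern, not the Actor's own implementation: `TransientError`, `fetch_with_retry`, and the default delays are all hypothetical, and `fetch` stands for any zero-argument callable, for example one wrapping a curl_cffi request that impersonates Chrome.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for responses that are worth retrying."""


def fetch_with_retry(fetch, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts are used up."""
    for attempt in range(attempts):
        try:
            return fetch()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff plus jitter so retries do not synchronize.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
```

Errors that are not `TransientError` propagate immediately, which keeps permanent failures such as an unavailable video from being retried pointlessly.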

💾 Export

Download Videos, Channels, Playlists / search, Posts, Transcripts, Contacts, Media, Comments, Diagnostics, or All records from Apify in JSON, CSV, Excel, XML, RSS, or HTML, or consume them through Dataset API endpoints.
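For the Dataset API route, the items of a dataset can be requested in a chosen format from the Apify API's dataset items endpoint. A minimal sketch of building that URL follows; the helper name is hypothetical and the dataset ID is a placeholder taken from your own run. Datasets that are not public also need an API token, which is not handled here.

```python
from urllib.parse import urlencode

# Format values accepted by the dataset items endpoint for the exports listed above.
EXPORT_FORMATS = {"json", "csv", "xlsx", "xml", "rss", "html"}


def dataset_items_url(dataset_id, fmt="json", limit=None):
    """Build a dataset items URL for the given export format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit  # cap the number of returned items
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"
```

Note that the Excel export corresponds to the `xlsx` format value.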