YouTube Scraper
Pricing: from $0.50 / 1,000 videos (listing only)
⚡ Every YouTube field in one Actor — videos, channels, playlists, search, Shorts, comments, subtitles, hashtags — at 200+ videos/sec. Chrome TLS fingerprint, rotating residential IPs, full channel metadata on every row. Zero blocks, zero CAPTCHAs.
Developer: VortexData (maintained by Community)
🎬 YouTube Data Collector
Paste a YouTube video, channel, playlist, search URL, hashtag, video ID, channel ID, playlist ID, @handle, or keyword. Choose the result package you want and a size. The Actor detects every input automatically and separates the output into clean datasets.
🚀 Quick Start
- Paste one or more items into 📥 Paste what you have.
- Keep 🎯 Choose result package on 🧲 Complete if you simply want everything from each input.
- Pick 📦 Choose size.
- Run the Actor.
Most users only need those three fields. You never need to choose video, channel, playlist, or search manually: source detection is automatic.
🎯 Result Packages
| Package | Use when | Collects |
|---|---|---|
| 🧲 Complete | You want the full dataset from whatever you pasted | Videos, channels, playlists, Shorts, streams, posts, transcripts, comments, replies, contacts, insights, media, diagnostics |
| ⚡ Quick | You want a fast clean table first | Lightweight metadata only, without comments, transcripts, media expansion, or contact crawling |
| 🧠 Research | You compare topics, keywords, and engagement | Search across videos, channels, and playlists with content insights and enrichment |
| 💬 Comments | You monitor audience feedback | Flat comments and replies, newest/top sorting, continuation tokens, dedupe across scheduled runs |
| 📧 Leads | You build creator prospect lists | Public emails, websites, social links, contact links, external-page enrichment, and validation |
| 📝 Transcripts | You build AI/RAG datasets | Transcript text, timestamped segments, language fallback, match terms, and compact transcript output |
| 🎞️ Media | You audit technical/video extras | Media formats, chapters, SponsorBlock segments, related videos, live chat/replay when public |
📥 Accepted Inputs
You can mix input types in the same run.
| Input | Example |
|---|---|
| Video URL | https://www.youtube.com/watch?v=wwSzpaTHyS8 |
| Shorts URL | https://www.youtube.com/shorts/... |
| Video ID | dQw4w9WgXcQ |
| Channel handle | @NASA |
| Channel ID | UC... |
| Playlist URL | https://www.youtube.com/playlist?list=... |
| Search URL | https://www.youtube.com/results?search_query=... |
| Hashtag | https://www.youtube.com/hashtag/apify |
| Keyword | web scraping tutorial |
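The Actor's own detection logic is not documented here. As a rough illustration of how the input shapes in the table can be told apart, here is a minimal sketch. The function name `classify_target` and the returned labels are hypothetical, and bare playlist IDs are deliberately left out because their prefixes vary.

```python
import re
from urllib.parse import urlparse, parse_qs

# YouTube video IDs are 11 URL-safe characters; channel IDs are "UC" + 22 characters.
VIDEO_ID = re.compile(r"^[A-Za-z0-9_-]{11}$")
CHANNEL_ID = re.compile(r"^UC[A-Za-z0-9_-]{22}$")


def classify_target(target: str) -> str:
    """Guess the input type of one pasted item (illustrative labels only)."""
    text = target.strip()
    if text.startswith(("http://", "https://")):
        parsed = urlparse(text)
        query = parse_qs(parsed.query)
        if parsed.path == "/watch" and "v" in query:
            return "video_url"
        if parsed.path.startswith("/shorts/"):
            return "shorts_url"
        if parsed.path == "/playlist" and "list" in query:
            return "playlist_url"
        if parsed.path == "/results" and "search_query" in query:
            return "search_url"
        if parsed.path.startswith("/hashtag/"):
            return "hashtag"
        return "unknown_url"
    if text.startswith("@"):
        return "channel_handle"
    if CHANNEL_ID.match(text):
        return "channel_id"
    # Caveat: an 11-letter keyword such as "programming" also matches this pattern.
    if VIDEO_ID.match(text):
        return "video_id"
    return "keyword"


for item in ["https://www.youtube.com/watch?v=wwSzpaTHyS8", "@NASA", "web scraping tutorial"]:
    print(item, "->", classify_target(item))
```

A pattern-only check like this cannot separate a video ID from an 11-character keyword, so a real implementation would presumably need to verify the guess against YouTube.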
📦 Size Presets
| Size | Best for | Default limits |
|---|---|---|
| test | Fast validation | Tiny sample, minimal comments and replies |
| small | First real run | Recommended first production run |
| medium | Research dataset | More items, comments, replies, and enrichment |
| large | Big export | Large but capped export |
| unlimited | Maximum collection | No global result-row cap |
Advanced API users can still override exact limits with hidden fields such as maxResults, maxItems, maxComments, and maxRepliesPerComment.
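A minimal sketch of adding those overrides to an API input. The four field names come from the sentence above; the helper `with_limit_overrides` and the numeric values are illustrative assumptions, not documented defaults.

```python
import json

# Hidden override fields named in the text above.
ALLOWED_OVERRIDES = {"maxResults", "maxItems", "maxComments", "maxRepliesPerComment"}


def with_limit_overrides(base_input: dict, **overrides: int) -> dict:
    """Return a copy of base_input with explicit limit overrides added."""
    unknown = set(overrides) - ALLOWED_OVERRIDES
    if unknown:
        raise ValueError(f"Unsupported override fields: {sorted(unknown)}")
    for name, value in overrides.items():
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer, got {value!r}")
    return {**base_input, **overrides}


base = {"targets": ["@NASA"], "scenario": "complete", "runSize": "small"}
# The values 200 and 50 are arbitrary examples.
print(json.dumps(with_limit_overrides(base, maxResults=200, maxComments=50), indent=2))
```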
📤 Output
The Actor writes separate datasets so the Apify Output tab stays readable. The first output is Videos, not a mixed table, so channel, playlist, post, comment, and diagnostic rows do not appear as odd rows inside a video table.
| Dataset | Contains |
|---|---|
| Videos | Video, Shorts, stream, and movie rows only |
| Channels | Channel profiles from channel inputs and channel search results |
| Playlists / search | Playlist, show, and other non-video search entity rows |
| Posts | Community post rows from channel Community/Posts tabs |
| Transcripts | Rows with transcript text, subtitle status, or transcript keyword matches |
| Contacts | Rows with public emails, websites, socials, or contact links |
| Media | Rows with media formats, SponsorBlock, chapters, live chat, or related videos |
| Comments | One flat row per comment or reply |
| Diagnostics | Invalid inputs, unavailable videos, empty sources, and source failures |
| All records | Compatibility dataset with every non-comment, non-diagnostic main row |
The same clean row can appear in more than one focused dataset. For example, a video with transcript and media data appears in Videos, Transcripts, and Media; it is still one billed main result.
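Because a row can appear in several focused datasets, a consumer that downloads more than one of them may want to merge the copies before counting or joining. The sketch below assumes that the canonical `url` field identifies a row, which the source does not guarantee. The helper name `merge_by_url` and the sample rows are made up for illustration.

```python
def merge_by_url(datasets: dict[str, list[dict]]) -> dict[str, dict]:
    """Merge rows from several focused datasets, keyed by their `url` field."""
    merged: dict[str, dict] = {}
    for dataset_name, rows in datasets.items():
        for row in rows:
            key = row.get("url")
            if key is None:
                continue  # skip rows without a URL; they cannot be matched
            entry = merged.setdefault(key, {"row": {}, "datasets": []})
            entry["row"].update(row)
            entry["datasets"].append(dataset_name)
    return merged


# Made-up sample: one video that appears in two focused datasets.
sample = {
    "Videos": [{"url": "https://www.youtube.com/watch?v=wwSzpaTHyS8", "views": 10}],
    "Transcripts": [{"url": "https://www.youtube.com/watch?v=wwSzpaTHyS8", "transcriptText": "hello"}],
}
merged = merge_by_url(sample)
print(len(merged), "unique row(s)")
```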
🧾 Canonical Fields
| Meaning | Field |
|---|---|
| Video/channel/search URL | url |
| Source page that produced the item | sourceUrl |
| Views | views |
| Likes | likes |
| Subscribers | subscribers |
| Video description | description |
| Comment text | text in the Comments dataset |
| Transcript text | transcriptText |
| Creator emails | emails |
Legacy raw aliases can still be returned with the hidden includeRawFields API option.
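As a small usage illustration, the canonical `views` and `likes` fields can be combined into a like rate per row. This sketch assumes both fields are numeric when present. The helper `like_rate` and the sample values are invented for the example.

```python
from typing import Optional


def like_rate(row: dict) -> Optional[float]:
    """Return likes / views, or None when either field is missing or views is zero."""
    views = row.get("views")
    likes = row.get("likes")
    if not isinstance(views, (int, float)) or not isinstance(likes, (int, float)):
        return None
    if views <= 0:
        return None
    return likes / views


# Made-up rows that use the canonical field names from the table above.
rows = [
    {"url": "https://www.youtube.com/watch?v=wwSzpaTHyS8", "views": 2000, "likes": 150},
    {"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ", "views": 0, "likes": 0},
]
for row in rows:
    print(row["url"], like_rate(row))
```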
🧩 Example Inputs
Complete mixed run
{"targets": ["https://www.youtube.com/watch?v=wwSzpaTHyS8","@NASA","web scraping tutorial"],"scenario": "complete","runSize": "test"}
Everything from a video
{"targets": ["https://www.youtube.com/watch?v=wwSzpaTHyS8"],"scenario": "complete","runSize": "small"}
Everything from a channel
{"targets": ["@NASA"],"scenario": "complete","runSize": "small"}
Fast metadata only
{"targets": ["@NASA", "web scraping tutorial"],"scenario": "quick_metadata","runSize": "small"}
Research keywords and engagement
{"targets": ["web scraping tutorial"],"scenario": "research_insights","runSize": "small"}
Comment monitoring
{"targets": ["https://www.youtube.com/watch?v=wwSzpaTHyS8"],"scenario": "comments_monitoring","runSize": "small","stateKey": "my-video-comments"}
AI transcripts
{"targets": ["https://www.youtube.com/watch?v=wwSzpaTHyS8"],"scenario": "transcripts_for_ai","runSize": "small","subtitlesLanguages": ["en", "any"],"subtitlesFormat": "plaintext"}
Creator leads
{"targets": ["@NASA", "ai automation agency"],"scenario": "creator_leads","runSize": "small","emailValidation": "syntax"}
Media audit
{"targets": ["https://www.youtube.com/watch?v=wwSzpaTHyS8"],"scenario": "media_audit","runSize": "small"}
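The inputs above share one shape, so they can be generated programmatically. In the sketch below, the scenario values are taken from the examples above. The `runSize` values other than test and small are an assumption based on the size preset names, and `build_input` is a hypothetical helper.

```python
import json

# Scenario values as they appear in the example inputs above.
SCENARIOS = {
    "complete", "quick_metadata", "research_insights", "comments_monitoring",
    "transcripts_for_ai", "creator_leads", "media_audit",
}
# Assumption: runSize accepts the preset names from the Size Presets table.
RUN_SIZES = {"test", "small", "medium", "large", "unlimited"}


def build_input(targets, scenario="complete", run_size="small", **extra) -> dict:
    """Build an Actor input dict, rejecting values not seen in this document."""
    targets = [t for t in targets if t and t.strip()]
    if not targets:
        raise ValueError("At least one target is required")
    if scenario not in SCENARIOS:
        raise ValueError(f"Unknown scenario: {scenario!r}")
    if run_size not in RUN_SIZES:
        raise ValueError(f"Unknown run size: {run_size!r}")
    return {"targets": targets, "scenario": scenario, "runSize": run_size, **extra}


actor_input = build_input(
    ["https://www.youtube.com/watch?v=wwSzpaTHyS8"],
    scenario="comments_monitoring",
    stateKey="my-video-comments",
)
print(json.dumps(actor_input, indent=2))
```

Validating against a fixed set catches typos locally, but it would also reject the legacy scenario values that the Actor still accepts.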
🧠 Feature Coverage
| Feature | Status |
|---|---|
| Video metadata | Included |
| Channel about data | Included |
| Channel profile output | Included |
| Playlists | Included |
| Search results | Included |
| Shorts | Included |
| Streams | Included |
| Community posts | Included |
| Comments and replies | Included |
| Continuation tokens | Included |
| Dedupe across scheduled runs | Included |
| Transcripts and captions | Included |
| Language fallback | Included |
| Transcript search | Included |
| SponsorBlock segments | Included |
| Related videos | Included |
| Chapters | Included |
| Most-replayed heatmap | Included |
| Music credits | Included |
| Creator contacts | Included |
| Content insights | Included |
| Live chat and replay chat | Best effort when YouTube exposes public continuation |
| Media format summaries | Included |
| Entity-separated output datasets | Included |
| Large payload KVS fallback | Included |
| Diagnostics dataset | Included |
🛠️ API Compatibility
The old jobPreset field and legacy scenario values such as auto, video_full, channel_full, playlist_full, and search_full are still accepted for existing integrations. New Console runs should use the visible result packages in scenario.
Legacy inputs such as startUrls, searchQueries, youtubeHandles, videoIds, channelIds, playlistIds, and extraData still work.
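For integrations that want to move to the single targets list, one possible migration is to flatten the legacy list fields. This is a sketch under assumptions: it supposes that `startUrls` entries are either plain strings or objects with a `url` key, and that legacy values can be passed through unchanged. The helper `legacy_to_targets` is hypothetical, and `extraData` is not handled.

```python
# Legacy list-style fields named in the text above (extraData is not handled here).
LEGACY_LIST_FIELDS = (
    "startUrls", "searchQueries", "youtubeHandles",
    "videoIds", "channelIds", "playlistIds",
)


def legacy_to_targets(legacy_input: dict) -> list[str]:
    """Flatten legacy list fields into one de-duplicated targets list."""
    targets: list[str] = []
    for field in LEGACY_LIST_FIELDS:
        for item in legacy_input.get(field) or []:
            # Assumption: URL entries may be {"url": "..."} objects or plain strings.
            value = item.get("url") if isinstance(item, dict) else item
            if value and value not in targets:
                targets.append(value)
    return targets


legacy = {
    "startUrls": [{"url": "https://www.youtube.com/watch?v=wwSzpaTHyS8"}],
    "youtubeHandles": ["@NASA"],
    "searchQueries": ["web scraping tutorial"],
}
print({"targets": legacy_to_targets(legacy), "scenario": "complete", "runSize": "small"})
```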
✅ Reliability
The Actor uses Chrome-like TLS requests through curl_cffi, Apify Residential Proxy on Apify Cloud, per-request proxy sessions, retry handling for transient YouTube responses, KVS fallback for large payloads, and separate diagnostics output for invalid or unavailable sources.
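The Actor's retry policy is not documented here. As a generic illustration of retry handling for transient responses, the sketch below uses exponential backoff with jitter. The set of status codes treated as transient, the attempt count, and the delays are all assumptions. The `fetch` callable is injected so the example stays independent of any HTTP library.

```python
import random
import time
from typing import Callable, Tuple

# Assumption: these HTTP statuses are treated as transient and worth retrying.
TRANSIENT_STATUSES = {429, 500, 502, 503, 504}


def fetch_with_retries(
    fetch: Callable[[str], Tuple[int, str]],
    url: str,
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(url) until it returns HTTP 200 or the attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = fetch(url)
        if status == 200:
            return body
        if status not in TRANSIENT_STATUSES or attempt == max_attempts:
            raise RuntimeError(f"Giving up on {url}: HTTP {status} after {attempt} attempt(s)")
        # Exponential backoff (0.5s, 1s, 2s, ...) plus random jitter.
        delay = base_delay * 2 ** (attempt - 1)
        sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("unreachable")


# Usage with a stand-in fetch function that fails once, then succeeds.
responses = iter([(503, ""), (200, "<html>ok</html>")])
print(fetch_with_retries(lambda u: next(responses), "https://www.youtube.com/", sleep=lambda s: None))
```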
💾 Export
Download Videos, Channels, Playlists / search, Posts, Transcripts, Contacts, Media, Comments, Diagnostics, or All records from Apify in JSON, CSV, Excel, XML, RSS, or HTML, or consume them through Dataset API endpoints.
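A minimal sketch of building a Dataset API download URL for one of these formats. It relies on the general Apify dataset items endpoint with its `format` and `limit` query parameters. The dataset ID shown is a placeholder, and the helper `dataset_items_url` is hypothetical. Private datasets also need an API token, which is omitted here.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Export formats listed above; "xlsx" is the API value for Excel.
FORMATS = {"json", "csv", "xlsx", "xml", "rss", "html"}


def dataset_items_url(dataset_id: str, fmt: str = "json", limit: Optional[int] = None) -> str:
    """Build the items URL for one Apify dataset in the requested format."""
    if fmt not in FORMATS:
        raise ValueError(f"Unsupported format: {fmt!r}")
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = str(limit)
    return f"https://api.apify.com/v2/datasets/{quote(dataset_id, safe='')}/items?{urlencode(params)}"


# "YOUR_DATASET_ID" is a placeholder, not a real dataset.
print(dataset_items_url("YOUR_DATASET_ID", "csv", limit=100))
```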