Youtube Scraper Pro

Pricing

Pay per event

Developed by

Delowar Munna

Maintained by Community

Full‑fidelity YouTube data extractor for videos, Shorts, live streams, and channels. Built on Apify + Puppeteer for reliable, scalable web scraping that returns complete metadata, engagement stats, hashtags/keywords, description links, and channel insights, fast.

Last modified

5 days ago

YouTube Scraper Pro (Apify Actor)

Full‑fidelity YouTube data extractor for videos, Shorts, live streams, and channels. Built with YouTube Data API v3 + Puppeteer for reliable, scalable hybrid extraction that returns complete metadata, engagement stats, hashtags/keywords, description links, and channel insights — fast and efficient.


At a Glance

  • Hybrid Architecture: YouTube Data API v3 + web scraping for 100% field coverage
  • Scrape by keywords, channel handles (e.g. @mkbhd), channel IDs, playlists, or direct video URLs
  • Returns: title, description, duration, publish date, views, likes, comments count, category, features (HD/4K/HDR), hashtags, keywords, description links, and rich channel data
  • Optimized: API for core fields + selective scraping for advanced fields → ~4–6s/video without proxy, all fields populated
  • Localization: region + language options
  • Optional Proxy: Residential proxy available for enhanced reliability

Best for: content research, SEO, competitive intelligence, trend analysis, brand monitoring, influencer discovery, and data‑driven content strategy.


🚀 New: Hybrid API + Scraping Architecture

Version 2.6 introduces a revolutionary hybrid approach:

| Method | Fields Covered | Speed |
|---|---|---|
| YouTube Data API v3 | 16 core fields (title, views, likes, duration, etc.) | Instant |
| Web Scraping | 11 advanced fields (hashtags, links, monetization, etc.) | ~4-6s/video |
| Residential Proxy (optional) | Enhanced reliability | ~10-15s/video |

Benefits:

  • 100% field coverage - All 27 fields populated
  • No proxy required - Works reliably without residential proxy
  • Dual fallback system - API → Scraping → API Description extraction
  • Production ready - Proven reliability with complete data extraction

Why this scraper?

  • Hybrid extraction: YouTube Data API v3 for core data + web scraping for advanced fields = best of both worlds
  • Complete fields: visits each video page to populate all available metadata (not just search snippets)
  • Scalable & reliable: automatic multi-project rotation for consistent performance
  • Dual fallback: API → Browser scraping → API description parsing ensures maximum field population
  • Stable & fast: ~4-6 seconds per video with 100% field coverage
  • Consistent schema: predictable JSON keys designed for analytics pipelines
  • Enterprise‑ready: optional residential proxies, localization, rate limiting, and error handling
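The fallback chain described above (API, then browser scraping, then API description parsing) can be pictured as a merge in which later sources only fill fields that are still empty. The following is a minimal sketch under that assumption; `merge_with_fallback`, `is_missing`, and the sample dictionaries are hypothetical names for illustration and are not taken from the Actor's code.

```python
from typing import Any, Iterable, Mapping


def is_missing(value: Any) -> bool:
    """Treat None, empty strings, and empty lists as 'not populated'."""
    return value is None or value == "" or value == []


def merge_with_fallback(sources: Iterable[Mapping[str, Any]]) -> dict:
    """Merge dicts in priority order; later sources only fill missing fields."""
    merged: dict = {}
    for source in sources:
        for key, value in source.items():
            if is_missing(merged.get(key)) and not is_missing(value):
                merged[key] = value           # first usable value wins
            else:
                merged.setdefault(key, value)  # keep the key even if still empty
    return merged


# Hypothetical partial results from the three stages.
api_fields = {"title": "Demo", "viewCount": 100, "hashtags": []}
scraped_fields = {"hashtags": ["demo"], "isMonetized": None}
description_fields = {"hashtags": ["ignored"], "descriptionLinks": []}

record = merge_with_fallback([api_fields, scraped_fields, description_fields])
```

In this sketch the scraped hashtags replace the empty API list, and the third source is ignored for that field because it is already populated.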

Key Features

  • Hybrid Data Extraction: YouTube Data API v3 (16 fields) + Web Scraping (11 advanced fields)
  • Intelligent API Management: Automatic multi-project rotation for optimal performance
  • Multiple Input Methods: keywords, channel handles/IDs, playlists, direct URLs, bulk URL upload via text file or remote file link
  • Content Coverage: standard videos, Shorts, live/live‑replay
  • Date Filtering: filter search results by publish date range (applies to search keywords only)
  • Comprehensive Metadata: title, description, duration, publish date, views, likes, comment count, category
  • Rich Media & SEO Signals: hashtags, keywords/tags, description outbound links
  • Channel Intelligence: id, name, handle, URL, subscribers, (optionally) totals and profile data
  • Performance: 3 concurrent page visits; resource/ads/fonts/video blocking for speed
  • Localization: regionCode + language
  • Optional Proxy: Residential proxy available for enhanced reliability (not required)

Quick Start (Apify)

  1. Create an Actor task and paste one of the JSON inputs below
  2. (Optional) Use a residential proxy for maximum reliability. This is controlled by a setting in the Actor code, not by an input field
  3. Run. Export JSON/CSV/Excel to your datastore, Google Sheets, or S3

Input Schema

The input form is organized into collapsible sections for better usability:

  • Search Settings: Configure search behavior, localization, and result limits
  • Direct URLs: Scrape specific YouTube URLs directly

Input Parameters

| # | Field | Key | Type | Required | Default | Description |
|---|---|---|---|---|---|---|
| 1 | Search Keywords | searchQueries | Array | No | [] | Search for keywords, video topics, or channels. Accepts channel handles (@name) or channel IDs (UC...) |
| 2 | Include Shorts | includeShorts | boolean | No | false | If true, include Shorts in search/results |
| 3 | Max videos per search term | maxResultsPerQuery | integer | No | 10 | Applies per search keyword and per list source. Min 1 |
| 4 | Country | regionCode | string | No | "US" | ISO-3166-1 alpha-2 code. Options: US, GB, CA, AU, IN, DE, FR, JP, BR, MX |
| 5 | Language | language | string | No | "en" | IETF BCP-47 code. Options: en, es, de, fr, pt, ja, hi, zh |
| 6 | From Date | dateFrom | string | No | "" | Filter videos published after this date (YYYY-MM-DD). Only applies to Search Keywords |
| 7 | To Date | dateTo | string | No | "" | Filter videos published before this date (YYYY-MM-DD). Only applies to Search Keywords |
| 8 | YouTube URLs | startUrls | Array<Object\|string> | No | [] | Accepts video/shorts/channel/playlist/search URLs. Supports manual entry, text file upload, or remote file link for bulk processing |

Important Notes:

  • Date Filtering: Date filters (dateFrom and dateTo) only apply to Search Keywords. Direct URLs are not filtered by date.
  • Bulk URL Upload: The YouTube URLs field supports uploading a text file (one URL per line) or linking to a remote text file for batch processing.
  • Residential Proxy: Configured via code (not user input) for testing purposes. Disabled by default for production.

Tip: Start with smaller maxResultsPerQuery to validate your setup, then scale up.
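To catch mistakes before starting a run, the input can be assembled and checked locally. The sketch below uses the keys, defaults, and option lists from the parameter table; `build_input` is a hypothetical helper, and treating the listed region and language options as exhaustive is an assumption.

```python
from datetime import datetime
from typing import Optional

# Assumption: the option lists in the parameter table are exhaustive.
ALLOWED_REGIONS = {"US", "GB", "CA", "AU", "IN", "DE", "FR", "JP", "BR", "MX"}
ALLOWED_LANGUAGES = {"en", "es", "de", "fr", "pt", "ja", "hi", "zh"}


def build_input(
    search_queries: list[str],
    max_results_per_query: int = 10,
    include_shorts: bool = False,
    region_code: str = "US",
    language: str = "en",
    date_from: Optional[str] = None,
    date_to: Optional[str] = None,
) -> dict:
    """Return an input dict keyed as in the table, raising ValueError on bad values."""
    if max_results_per_query < 1:
        raise ValueError("maxResultsPerQuery must be at least 1")
    if region_code not in ALLOWED_REGIONS:
        raise ValueError(f"unsupported regionCode: {region_code}")
    if language not in ALLOWED_LANGUAGES:
        raise ValueError(f"unsupported language: {language}")
    # strptime raises ValueError for anything that is not a real calendar date.
    start = datetime.strptime(date_from, "%Y-%m-%d") if date_from else None
    end = datetime.strptime(date_to, "%Y-%m-%d") if date_to else None
    if start and end and start > end:
        raise ValueError("dateFrom must not be later than dateTo")
    return {
        "searchQueries": list(search_queries),
        "maxResultsPerQuery": max_results_per_query,
        "includeShorts": include_shorts,
        "regionCode": region_code,
        "language": language,
        "dateFrom": date_from or "",
        "dateTo": date_to or "",
    }


example = build_input(
    ["AI tools", "@veritasium"],
    max_results_per_query=20,
    date_from="2025-01-01",
    date_to="2025-12-31",
)
```

The resulting dictionary can be serialized with `json.dumps` and pasted into the Actor input editor.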

Input Example - 01: Search with date filtering

{
  "searchQueries": ["AI tools", "machine learning", "@veritasium"],
  "maxResultsPerQuery": 20,
  "includeShorts": false,
  "dateFrom": "2025-01-01",
  "dateTo": "2025-12-31",
  "regionCode": "US",
  "language": "en"
}

Input Example - 02: Direct video URLs

{
  "searchQueries": [],
  "startUrls": [
    {"url": "https://www.youtube.com/watch?v=7Sx0o-41r2k"},
    {"url": "https://www.youtube.com/watch?v=5oAnKSCP4do"},
    {"url": "https://www.youtube.com/watch?v=QJBP2uy8LcU"},
    {"url": "https://www.youtube.com/watch?v=DOtJEwVsJic"}
  ],
  "includeShorts": true,
  "maxResultsPerQuery": 10,
  "regionCode": "US",
  "language": "en"
}

Input Example - 03: Bulk URL upload from remote file

{
  "searchQueries": [],
  "startUrls": [
    {
      "requestsFromUrl": "https://raw.githubusercontent.com/coregentdevspace/youtube-scraper-assets/main/youtube-scraper-pro-direct-url-text-file.txt"
    }
  ],
  "dateFrom": "2025-10-01",
  "dateTo": "2025-10-31",
  "includeShorts": true,
  "maxResultsPerQuery": 5,
  "regionCode": "US",
  "language": "en"
}
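
The text file used for bulk upload holds one URL per line. A minimal sketch of producing such a file and the matching input follows; `write_url_file` and `remote_file_input` are hypothetical helpers, and the `example.com` address stands in for wherever the file would actually be hosted.

```python
import tempfile
from pathlib import Path


def write_url_file(urls: list[str], path: Path) -> int:
    """Write unique, non-empty URLs one per line; return how many were written."""
    seen: set[str] = set()
    unique: list[str] = []
    for url in (u.strip() for u in urls):
        if url and url not in seen:
            seen.add(url)
            unique.append(url)
    path.write_text("\n".join(unique) + "\n", encoding="utf-8")
    return len(unique)


def remote_file_input(file_url: str, max_results_per_query: int = 5) -> dict:
    """Build an input whose startUrls entry points at a hosted text file."""
    return {
        "searchQueries": [],
        "startUrls": [{"requestsFromUrl": file_url}],
        "maxResultsPerQuery": max_results_per_query,
    }


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "urls.txt"
    written = write_url_file(
        [
            "https://www.youtube.com/watch?v=7Sx0o-41r2k",
            "https://www.youtube.com/watch?v=5oAnKSCP4do ",
            "https://www.youtube.com/watch?v=7Sx0o-41r2k",  # duplicate, skipped
        ],
        target,
    )
    contents = target.read_text(encoding="utf-8")

# Placeholder URL: replace with the real location of the uploaded file.
bulk_input = remote_file_input("https://example.com/urls.txt")
```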

Output Schema

Core Output Fields

| # | Field | Type | Description |
|---|---|---|---|
| 1 | type | String | One of video, shorts, live, stream |
| 2 | VideoId | String | YouTube video ID (e.g., 7Sx0o-41r2k) |
| 3 | PageURL | String | Canonical YouTube watch URL |
| 4 | title | String | Video title |
| 5 | thumbnailUrl | String \| null | Primary/hero thumbnail URL |
| 6 | publishDate | String (ISO) \| null | When the video was published |
| 7 | duration | String \| null | HH:MM:SS format (e.g., 00:22:43) |
| 8 | durationSeconds | Integer \| null | Duration in seconds |
| 9 | viewCount | Integer \| null | Total views |
| 10 | likeCount | Integer \| null | Total likes |
| 11 | commentCount | Integer \| null | Public comments count |
| 12 | category | String \| null | Video category |

Video Properties

| # | Field | Type | Description |
|---|---|---|---|
| 13 | isLive | Boolean | Whether the item is/was a live stream |
| 14 | isMembersOnly | Boolean | Members-only gated flag |
| 15 | isPrivate | Boolean | Private/unavailable to public |
| 16 | isFamilySafe | Boolean \| null | Family-safe flag if exposed |
| 17 | isMonetized | Boolean \| null | Monetization detectable flag |
| 18 | isRatingsAllowed | Boolean \| null | Whether likes/dislikes are enabled |
| 19 | commentsTurnedOff | Boolean \| null | Whether comments are disabled |
| 20 | commentsAllowed | Boolean \| null | Convenience mirror |

Content & Metadata

| # | Field | Type | Description |
|---|---|---|---|
| 21 | description | String \| null | Creator-written description |
| 22 | descriptionLinks | Array | Links parsed from description |
| 23 | keywords | Array | Tags/keywords |
| 24 | hashtags | Array | Hashtags from title/description |

Media Features

| # | Field | Type | Description |
|---|---|---|---|
| 25 | features | Object | Media feature flags |
| - | features.isHD | Boolean \| null | HD available |
| - | features.is4K | Boolean \| null | 4K available |
| - | features.isHDR | Boolean \| null | HDR available |
| - | features.isVR180 | Boolean \| null | VR180 available |
| - | features.is360 | Boolean \| null | 360° available |

Channel Information

| # | Field | Type | Description |
|---|---|---|---|
| 26 | channel | Object | Channel metadata |
| - | channel.id | String \| null | Channel ID |
| - | channel.name | String \| null | Display name |
| - | channel.handle | String \| null | @handle |
| - | channel.url | String \| null | Channel URL |
| - | channel.subscriberCount | Integer \| String \| null | Subscriber count |
| - | channel.totalViews | Integer \| String \| null | Lifetime channel views |
| - | channel.totalVideos | Integer \| null | Number of uploads |
| - | channel.country | String \| null | Channel country/region |
| - | channel.profileImage | String \| null | Avatar URL |
| - | channel.description | String \| null | About text |
| - | channel.links | Object<String,String> | Social/website links map |

Provenance

| # | Field | Type | Description |
|---|---|---|---|
| 27 | provenance | Object | Source/ordering metadata |
| - | provenance.order | Integer \| null | Position in results |
| - | provenance.source | String \| null | One of search, channel, playlist, trending, startUrl |
| - | provenance.collectedAt | String (ISO) | Timestamp when collected |
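
Because `channel.subscriberCount` and `channel.totalViews` are typed as Integer | String | null, downstream code may need to normalize display strings such as "4.42M subscribers" into numbers. The following is a rough sketch; `parse_count` is a hypothetical helper, and the K/M/B suffix convention is an assumption about how the strings are formatted.

```python
import re
from decimal import Decimal
from typing import Optional, Union

# Assumed suffix convention for abbreviated counts.
_SUFFIXES = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}
# A number, an optional K/M/B suffix, then no further letters, digits, or dots.
_COUNT_RE = re.compile(r"^\s*(\d[\d,]*(?:\.\d+)?)([KMB])?(?![\w.])", re.IGNORECASE)


def parse_count(value: Union[int, str, None]) -> Optional[int]:
    """Convert 4420000, "4.42M subscribers" or "1,234" to an int; None if unparseable."""
    if value is None or isinstance(value, int):
        return value
    match = _COUNT_RE.match(value)
    if not match:
        return None
    # Decimal avoids float rounding when scaling values like 4.42.
    number = Decimal(match.group(1).replace(",", ""))
    suffix = (match.group(2) or "").upper()
    return int(number * _SUFFIXES[suffix])
```

Abbreviated strings are rounded at the source, so the parsed value is an approximation of the true count.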

Sample Output

Output Example - Overview Fields (Key fields only)

[
  {
    "type": "live",
    "PageURL": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "title": "Rick Astley - Never Gonna Give You Up (Official Video) (4K Remaster)",
    "viewCount": 1707551805,
    "likeCount": 18606140,
    "duration": "00:03:33",
    "publishDate": "2009-10-24",
    "channel": {
      "id": "UCuAXFkgsw1L7xaCfnd5JJOw",
      "name": "Rick Astley",
      "handle": null,
      "url": "https://www.youtube.com/channel/UCuAXFkgsw1L7xaCfnd5JJOw",
      "subscriberCount": "4.42M subscribers"
    }
  }
]

Output Example - Complete Record (All 27 fields)

[
  {
    "type": "live",
    "VideoId": "dQw4w9WgXcQ",
    "PageURL": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "title": "Rick Astley - Never Gonna Give You Up (Official Video) (4K Remaster)",
    "thumbnailUrl": "https://i.ytimg.com/vi/dQw4w9WgXcQ/maxresdefault.jpg",
    "publishDate": "2009-10-24",
    "duration": "00:03:33",
    "durationSeconds": 213,
    "viewCount": 1707551805,
    "likeCount": 18606140,
    "commentCount": 2406204,
    "category": "Music",
    "isLive": false,
    "isMembersOnly": false,
    "isPrivate": false,
    "isFamilySafe": true,
    "isMonetized": true,
    "isRatingsAllowed": true,
    "commentsAllowed": true,
    "description": "The official video for \"Never Gonna Give You Up\" by Rick Astley...",
    "descriptionLinks": [
      {
        "url": "https://linktr.ee/rickastleynever",
        "text": "https://linktr.ee/rickastleynever"
      },
      {
        "url": "https://RickAstley.lnk.to/YTSubID",
        "text": "https://RickAstley.lnk.to/YTSubID"
      }
    ],
    "keywords": [
      "rick astley",
      "Never Gonna Give You Up",
      "rick rolled"
    ],
    "hashtags": [
      "RickAstleyNever",
      "RickAstley",
      "NeverGonnaGiveYouUp"
    ],
    "features": {
      "isHD": true,
      "is4K": false,
      "isHDR": false,
      "isVR180": null,
      "is360": null
    },
    "channel": {
      "id": "UCuAXFkgsw1L7xaCfnd5JJOw",
      "name": "Rick Astley",
      "handle": "@RickAstleyYT",
      "url": "https://www.youtube.com/channel/UCuAXFkgsw1L7xaCfnd5JJOw",
      "subscriberCount": "4.42M subscribers"
    },
    "provenance": {
      "collectedAt": "2025-10-30T12:00:00.000Z"
    }
  }
]
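
As a quick sanity check on records like the sample above, `duration` and `durationSeconds` should agree, and the view and like counts can be combined into a simple engagement ratio. This is an illustrative sketch; `duration_to_seconds` and `like_rate` are hypothetical helpers, and the values are copied from the sample record.

```python
from typing import Optional


def duration_to_seconds(duration: str) -> int:
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def like_rate(record: dict) -> Optional[float]:
    """Likes per view, or None when a count is missing or views are zero."""
    views, likes = record.get("viewCount"), record.get("likeCount")
    if not views or likes is None:
        return None
    return likes / views


# Values copied from the sample record.
sample = {
    "duration": "00:03:33",
    "durationSeconds": 213,
    "viewCount": 1707551805,
    "likeCount": 18606140,
}

durations_agree = duration_to_seconds(sample["duration"]) == sample["durationSeconds"]
rate = like_rate(sample)
```

For the sample, the two duration fields agree (3 × 60 + 33 = 213) and the like rate comes out at a little over 1% of views.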

Performance & Reliability

Without Residential Proxy (Default)

  • Speed: ~4–6 seconds per video with all fields populated
  • Parallelism: up to 3 concurrent video page visits
  • Throughput: ~10-15 videos/minute
  • Field Coverage: 100% (all 27 fields populated via hybrid API + scraping)

With Residential Proxy (Optional)

  • Speed: ~10–15 seconds per video
  • Reliability: Maximum (bypasses all detection)
  • Field Coverage: 100% (all fields guaranteed)

Architecture Benefits

  • YouTube Data API v3: Instant core data (duration, views, likes, category)
  • Smart Web Scraping: Advanced fields (hashtags, description links, monetization)
  • Dual Fallback System:
    1. Browser scraping from ytInitialPlayerResponse
    2. API description extraction for links/hashtags
  • Stability: smart retries, exponential backoff, resource blocking (video streams, ads, fonts)
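
The second fallback, extracting links and hashtags from the API-provided description, amounts to pattern matching over plain text. The sketch below shows one way this could look; the regular expressions and function names are assumptions for illustration, not the Actor's implementation. The link shape (`url` plus `text`) and the hashtags without a leading `#` follow the sample output.

```python
import re

URL_RE = re.compile(r"https?://[^\s<>\"']+")
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")


def extract_links(description: str) -> list[dict]:
    """Return unique links in order of appearance as {"url", "text"} objects."""
    links: list[dict] = []
    for raw in URL_RE.findall(description):
        url = raw.rstrip(".,;:!?)")  # drop trailing sentence punctuation
        if url and all(link["url"] != url for link in links):
            links.append({"url": url, "text": url})
    return links


def extract_hashtags(description: str) -> list[str]:
    """Return unique hashtags (without '#') in order of appearance."""
    # Blank out URLs first so fragments like /#top are not read as hashtags.
    text = URL_RE.sub(" ", description)
    tags: list[str] = []
    for tag in HASHTAG_RE.findall(text):
        if tag not in tags:
            tags.append(tag)
    return tags


text = (
    "Watch more: https://linktr.ee/rickastleynever. "
    "#RickAstley #NeverGonnaGiveYouUp #RickAstley"
)
links = extract_links(text)
hashtags = extract_hashtags(text)
```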

For large jobs, prefer batching by topic/channel and consider residential proxies for maximum reliability.
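
When sizing a large job, the quoted figures give a rough runtime range. The sketch below assumes the per-video times are effective end-to-end rates, which is consistent with ~4–6 seconds per video matching ~10–15 videos per minute; `estimate_minutes` is a hypothetical helper, and real runs will vary.

```python
def estimate_minutes(
    videos: int,
    seconds_per_video: tuple[float, float] = (4.0, 6.0),
) -> tuple[float, float]:
    """Return (fastest, slowest) estimated run time in minutes."""
    fastest, slowest = seconds_per_video
    return videos * fastest / 60, videos * slowest / 60


# Default mode, using the ~4-6 s/video figure quoted above.
no_proxy = estimate_minutes(1000)

# Residential proxy mode, using the ~10-15 s/video figure quoted above.
with_proxy = estimate_minutes(1000, seconds_per_video=(10.0, 15.0))
```

Under these assumptions, 1,000 videos take roughly 67 to 100 minutes without a proxy and roughly 167 to 250 minutes with one.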


Use Cases

  • SEO research: surface keywords, tags, hashtags, linking practices
  • Content strategy: analyze formats, titles, thumbnails, and posting cadence
  • Competitive intelligence: benchmark creators and track launches
  • Market/academic research: study trends by niche, region, or language
  • Brand monitoring: find mentions and categorize sentiment downstream
  • Influencer discovery: filter by views/engagement within your topics

FAQ

How does the hybrid approach work? The scraper uses YouTube Data API v3 for instant core data (duration, views, likes, category, keywords) and web scraping for advanced fields (hashtags, description links, monetization flags). This ensures 100% field coverage with optimal speed and reliability.

Do I need residential proxy? No! The hybrid architecture works reliably without residential proxy. All 27 fields populate correctly using the API + scraping approach. Residential proxy is optional and only recommended for maximum reliability in production environments with very high volumes.

Can I target a country or language? Yes — set regionCode and language for localization.

What about Shorts and live videos? includeShorts controls Shorts. Live/live‑replay is auto‑detected via the type and isLive flags.

Can I filter by date? Yes — use dateFrom and dateTo (YYYY-MM-DD format) to filter videos published within a date range. Note: Date filtering only applies to Search Keywords, not Direct URLs.
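
Since direct URLs are not filtered by date, results collected that way can be filtered after the run using `publishDate`. This is a small sketch; `within_range` is a hypothetical helper, and treating both bounds as inclusive is an assumption.

```python
from datetime import date


def within_range(record: dict, date_from: str, date_to: str) -> bool:
    """True if the record's publishDate falls within [date_from, date_to]."""
    published = record.get("publishDate")
    if not published:
        return False  # records without a publish date are excluded
    # The first 10 characters cover both "YYYY-MM-DD" and full ISO timestamps.
    day = date.fromisoformat(published[:10])
    return date.fromisoformat(date_from) <= day <= date.fromisoformat(date_to)


records = [
    {"VideoId": "a", "publishDate": "2025-10-15"},
    {"VideoId": "b", "publishDate": "2009-10-24"},
    {"VideoId": "c", "publishDate": None},
]
october = [r for r in records if within_range(r, "2025-10-01", "2025-10-31")]
```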

Are dislikes available? No. YouTube no longer exposes dislike counts publicly. Any dislike field in the output is returned as null; none is listed among the 27 documented fields.

Any limits or restrictions? The scraper respects YouTube's rate limits and uses intelligent throttling. Avoid abusive rates. Some content is age‑restricted or members‑only.


Best Practices

  1. Start small: validate with maxResultsPerQuery: 5–10
  2. Filter early: set includeShorts: false if not needed; use dateFrom/dateTo to narrow search results by publish date
  3. Bulk processing: use text file upload or remote file link for large URL lists
  4. Batch thoughtfully: group by topic/channel to improve cache locality
  5. Schema‑first: build downstream models against the stable keys listed above
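
For CSV or spreadsheet export, nested objects such as `features` and `channel` can be flattened into the dotted names used in the output schema tables (for example `features.isHD`). The following is a minimal sketch; `flatten` is a hypothetical helper, and lists such as `hashtags` are left untouched.

```python
def flatten(record: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into dotted keys; lists and scalars are kept as-is."""
    flat: dict = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict) and value:
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value  # empty dicts are kept so the column still exists
    return flat


row = flatten(
    {
        "title": "Demo",
        "hashtags": ["demo"],
        "features": {"isHD": True, "is4K": False},
        "channel": {"id": "UC123", "links": {}},
    }
)
```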

Technical Details

Runtime

  • Node.js: 18+
  • Puppeteer: Headless Chrome with stealth mode
  • APIs: YouTube Data API v3 for hybrid extraction
  • Architecture: Optimized for scalability and reliability

Compliance

  • Intended for legitimate research & business intelligence
  • Collects only public information
  • Respects YouTube Data API Terms of Service
  • Respect YouTube Terms of Service & applicable laws in your jurisdiction

Changelog

  • v2.7 (Current):
    • 🚀 Removed captions field (requires residential proxy, always null on datacenter IPs)
    • ✅ Improved performance by eliminating caption API calls
    • ✅ Cleaner output schema with 27 fields
    • ✅ Removed hasSubtitles/hasCC from features (dependent on captions)
  • v2.6:
    • 🚀 NEW: Hybrid YouTube Data API v3 + web scraping architecture
    • ✅ Intelligent multi-project API management for optimal performance
    • ✅ 100% field coverage without residential proxy
    • ✅ Dual fallback system for maximum reliability
    • ✅ ~4-6 seconds per video (without proxy)
    • ✅ Enhanced hashtags and description links extraction from API fallback
    • ✅ Improved duration extraction with 4 fallback methods
    • ✅ Residential proxy now optional
  • v2.5: Added date range filtering for search queries (dateFrom/dateTo), bulk URL upload support via text file or remote file link
  • v2.0: Performance tuning, richer channel fields, improved localization & proxy options

Get Help

  • Issues & feature requests → GitHub / Apify support
  • Need something custom? Open an issue describing your dataset needs