YouTube Comment Scraper That Won’t Quit

Pricing: from $1.00 / 1,000 results

Bulk-extract YouTube comments (and replies) from video URLs with resilient pagination. Rotates across alternative Invidious instances, with an optional yt-dlp fallback, to stay available when a mirror goes down. Saves video metadata and comment records to the dataset. No API key required.

Developer: Inus Grobler (Maintained by Community)

YouTube Comments Scraper — Extract Comments & Replies from Any YouTube Video

Scrape YouTube comments and threaded replies at scale from any public YouTube video URL. This Apify Actor extracts data through multiple Invidious mirror instances and automatically falls back to yt-dlp when the mirrors are unavailable, so a run can still return data when individual mirrors are down.

✅ No YouTube API key required  |  ✅ Handles pagination  |  ✅ Extracts replies  |  ✅ Structured JSON output


What Can This Actor Do?

  • Scrape all comments from one or more YouTube videos (up to 100,000 comments per video)
  • Include full reply threads with recursive continuation pagination
  • Auto-rotate across 10+ Invidious mirrors for maximum uptime
  • Fall back to yt-dlp when mirrors are blocked or rate-limited
  • Scrape multiple videos concurrently rather than one after another
  • Stream data progressively — results are saved as each video completes, so you never lose data on timeout
  • Route traffic through Apify Residential Proxies (US default) to bypass YouTube IP blocks

Use Cases

| Goal | How This Helps |
| --- | --- |
| Sentiment analysis | Collect comments at scale for NLP pipelines and toxicity scoring |
| Brand monitoring | Track what people say about your product or channel |
| Audience research | Understand viewer reactions, pain points, and questions |
| Competitive intelligence | Analyse comments on competitor videos |
| Content moderation | Build training datasets for comment classifiers |
| Academic research | Extract public conversation data for social science studies |

Quick Start

  1. Add video URLs — paste one or more YouTube URLs into videoUrls
  2. Set comment limits — use maxCommentsPerVideo: 0 for all comments, or set a specific number
  3. Enable replies — toggle includeReplies: true to include reply threads
  4. Run — the actor streams results to your dataset as each video completes
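
Before pasting many URLs into videoUrls, it can help to check them locally. The sketch below is a hypothetical client-side helper, not part of the actor: it assumes that valid inputs are either a bare 11-character ID or a youtube.com watch URL with a v parameter, and it makes no claim about which other URL shapes the actor itself accepts.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters from this alphabet.
VIDEO_ID_RE = re.compile(r"^[A-Za-z0-9_-]{11}$")


def extract_video_id(value: str) -> Optional[str]:
    """Return the video ID from a watch URL or a bare ID, else None."""
    value = value.strip()
    if VIDEO_ID_RE.match(value):
        return value  # already a bare ID
    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    if host != "youtube.com" and not host.endswith(".youtube.com"):
        return None  # not a YouTube host
    candidate = parse_qs(parsed.query).get("v", [""])[0]
    return candidate if VIDEO_ID_RE.match(candidate) else None


if __name__ == "__main__":
    print(extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ"))
```

Entries for which the helper returns None are worth reviewing by hand before the run starts.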

Minimal Input Example

{
  "videoUrls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"],
  "maxCommentsPerVideo": 500,
  "includeReplies": true
}

Full Input Example

{
  "videoUrls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://www.youtube.com/watch?v=T-HZHO_PQPY"
  ],
  "maxCommentsPerVideo": 0,
  "maxCommentPagesPerVideo": 100,
  "includeReplies": true,
  "fallbackToYtDlpComments": true,
  "timeoutSec": 20,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "apifyProxyCountry": "US"
  },
  "altInstances": [
    "https://iv.melmac.space",
    "https://yewtu.be",
    "https://inv.nadeko.net"
  ]
}

Input Reference

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| videoUrls | string[] |  | Required. YouTube watch URLs or 11-character video IDs |
| maxCommentsPerVideo | integer | 20 | Max comments to collect per video. 0 = unlimited (safety cap: 100,000) |
| maxCommentPagesPerVideo | integer | 2 | Max comment pages per video (safety cap: 1,000) |
| includeReplies | boolean | false | Include reply threads beneath each top-level comment |
| fallbackToYtDlpComments | boolean | true | Use yt-dlp to fetch comments if all Invidious mirrors fail |
| timeoutSec | integer | 20 | Per-request HTTP timeout in seconds (minimum: 5) |
| actorTimeoutSec | integer | 0 | Overall run budget in seconds. 0 = auto-detect from platform. The scraper stops 45 s early to flush data safely |
| proxyConfiguration | object | US Residential | Apify proxy settings. US Residential recommended to avoid YouTube IP blocks |
| altInstances | string[] | 10 mirrors | Ordered list of Invidious-compatible base URLs to try |
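
The defaults and limits in the table above can be mirrored in a small client-side helper, sketched below. The function name build_input is hypothetical, and clamping out-of-range values is an assumption made for illustration: the table documents the caps and the minimum, not how the actor reacts to values outside them.

```python
from typing import Any

# Defaults copied from the Input Reference table.
DEFAULTS: dict[str, Any] = {
    "maxCommentsPerVideo": 20,
    "maxCommentPagesPerVideo": 2,
    "includeReplies": False,
    "fallbackToYtDlpComments": True,
    "timeoutSec": 20,
    "actorTimeoutSec": 0,
}
# Fields passed through unchanged when the caller supplies them.
PASS_THROUGH = {"proxyConfiguration", "altInstances"}


def build_input(video_urls: list[str], **overrides: Any) -> dict[str, Any]:
    """Merge overrides into the documented defaults and apply the documented limits."""
    if not video_urls:
        raise ValueError("videoUrls is required and must not be empty")
    unknown = set(overrides) - set(DEFAULTS) - PASS_THROUGH
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    run_input = {"videoUrls": list(video_urls), **DEFAULTS, **overrides}
    for field in ("maxCommentsPerVideo", "maxCommentPagesPerVideo",
                  "timeoutSec", "actorTimeoutSec"):
        if run_input[field] < 0:
            raise ValueError(f"{field} must not be negative")
    # Assumption: clamp locally to the documented caps and minimum.
    run_input["maxCommentsPerVideo"] = min(run_input["maxCommentsPerVideo"], 100_000)
    run_input["maxCommentPagesPerVideo"] = min(run_input["maxCommentPagesPerVideo"], 1_000)
    run_input["timeoutSec"] = max(run_input["timeoutSec"], 5)
    return run_input
```

Calling build_input(["dQw4w9WgXcQ"], includeReplies=True) yields a complete input object that can be serialized with json.dumps and submitted as the run input.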

Output

The actor pushes two record types to the default dataset:

Video Record (record_type: "video")

One row per input video — scraped before comments are collected.

{
  "record_type": "video",
  "video_id": "dQw4w9WgXcQ",
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "title": "Rick Astley - Never Gonna Give You Up (Official Music Video)",
  "channel_id": "UCuAXFkgsw1L7xaCfnd5JJOw",
  "channel_name": "Rick Astley",
  "channel_handle": "RickAstleyYT",
  "channel_url": "https://www.youtube.com/channel/UCuAXFkgsw1L7xaCfnd5JJOw",
  "view_count": 1600000000,
  "like_count": 16000000,
  "comment_count": 2800000,
  "published_at": "1987-07-27T00:00:00Z",
  "scraped_at": "2026-04-20T18:00:00Z"
}
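
Because video and comment records share one dataset, a common first step is to separate them and compare how many comments were collected with the count the video record reports. The sketch below assumes the dataset items have already been loaded as a list of dicts with the field names shown in this section; coverage_by_video is a hypothetical helper name.

```python
from collections import Counter
from typing import Any


def coverage_by_video(items: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """For each video record, report collected comments against comment_count."""
    collected = Counter(
        item["video_id"] for item in items if item.get("record_type") == "comment"
    )
    report: dict[str, dict[str, Any]] = {}
    for item in items:
        if item.get("record_type") != "video":
            continue
        reported = item.get("comment_count")  # may be null on some mirrors
        got = collected[item["video_id"]]
        report[item["video_id"]] = {
            "collected": got,
            "reported": reported,
            "ratio": got / reported if reported else None,
        }
    return report
```

A low ratio is not necessarily a failure: the per-video limits in the input cap how many comments are fetched, and a run without includeReplies collects top-level comments only.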

Comment Record (record_type: "comment")

One row per comment or reply. Replies have depth: 1 and a non-empty parent_comment_id.

{
  "record_type": "comment",
  "video_id": "dQw4w9WgXcQ",
  "video_title": "Rick Astley - Never Gonna Give You Up (Official Music Video)",
  "comment_id": "UgxABC123",
  "parent_comment_id": "",
  "depth": 0,
  "author_name": "Jane Smith",
  "author_handle": "janesmith",
  "author_channel_id": "UCxxxxxx",
  "text": "This song never gets old!",
  "like_count": 4200,
  "reply_count": 31,
  "published_text": "3 years ago",
  "published_at": "2023-01-15T10:30:00Z",
  "is_pinned": false,
  "is_hearted": true,
  "author_is_channel_owner": false,
  "scraped_at": "2026-04-20T18:00:00Z"
}
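
Since replies are stored as flat rows that point at their parent through parent_comment_id, rebuilding threads is a grouping step. The sketch below is one possible way to do it, assuming the items are already loaded as dicts; build_threads is a hypothetical helper name. Replies whose parent was not collected (for example, because a limit cut the run short) are returned separately.

```python
from typing import Any


def build_threads(
    items: list[dict[str, Any]],
) -> tuple[dict[str, dict[str, Any]], list[dict[str, Any]]]:
    """Group comment records into {comment_id: {"comment", "replies"}} plus orphans."""
    comments = [i for i in items if i.get("record_type") == "comment"]
    threads: dict[str, dict[str, Any]] = {}
    # First pass: top-level comments have an empty parent_comment_id.
    for comment in comments:
        if not comment.get("parent_comment_id"):
            threads[comment["comment_id"]] = {"comment": comment, "replies": []}
    # Second pass: attach each reply to its parent when the parent is present.
    orphans: list[dict[str, Any]] = []
    for comment in comments:
        parent_id = comment.get("parent_comment_id")
        if not parent_id:
            continue
        if parent_id in threads:
            threads[parent_id]["replies"].append(comment)
        else:
            orphans.append(comment)
    return threads, orphans
```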

Key-Value Store

  • OUTPUT — run summary containing meta, input, totals, and warnings

How It Works

Input URLs
  ↓
Extract video metadata (Invidious API → yt-dlp fallback)
  ↓
Scrape comment pages with continuation tokens
  ├─ Try Invidious mirror 1 → mirror 2 → … mirror N
  │    (rotates until sufficient comments found)
  ├─ For each comment: collect inline replies
  │    + follow reply continuation pages
  └─ If all mirrors fail → yt-dlp fallback (direct YouTube)
  ↓
Stream results to dataset (per-video, progressive)

All videos are scraped concurrently. Data is pushed to your dataset as each video completes — a timeout will never erase already-collected results.
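
The rotation-and-fallback flow described above can be sketched as follows. This is an illustration of the general technique, not the actor's source code: scrape_comments, fetch_page, and fallback are hypothetical names, the page shape ({"comments": [...], "continuation": ...}) is an assumption, and the rule "skip a mirror that errors or returns nothing" is a simplification of "rotates until sufficient comments found".

```python
from typing import Any, Callable, Optional

Page = dict[str, Any]  # assumed shape: {"comments": [...], "continuation": str | None}


def scrape_comments(
    instances: list[str],
    fetch_page: Callable[[str, Optional[str]], Page],
    fallback: Optional[Callable[[], list[Any]]] = None,
    max_pages: int = 2,
    max_comments: int = 20,
) -> tuple[list[Any], Optional[str]]:
    """Try each mirror in order; use the fallback only if every mirror fails."""
    for base_url in instances:
        comments: list[Any] = []
        continuation: Optional[str] = None
        try:
            for _ in range(max_pages):
                page = fetch_page(base_url, continuation)
                comments.extend(page.get("comments", []))
                continuation = page.get("continuation")
                enough = max_comments and len(comments) >= max_comments
                if not continuation or enough:
                    break
        except Exception:
            continue  # broad catch for the sketch: any failure moves to the next mirror
        if comments:
            limited = comments[:max_comments] if max_comments else comments
            return limited, base_url
    if fallback is not None:
        return fallback(), "fallback"
    return [], None
```

Passing the fetcher in as a function keeps the rotation logic testable without network access; in a real client, fetch_page would wrap an HTTP call with the per-request timeout.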


Proxy Configuration

YouTube aggressively blocks requests from datacenter IP ranges, which Apify's default infrastructure uses. This causes empty results or "Sign in to confirm you're not a bot" errors.

Solution: US Residential Proxies (the default prefill) route requests through real home IP addresses, making the scraper appear as a regular user.

| Proxy Type | Reliability | Cost |
| --- | --- | --- |
| Residential (US) ✅ recommended | High: household IPs, rarely blocked | Higher |
| Datacenter | Low: often blocked by YouTube | Lower |
| None | Very low on Apify IPs | Free |

The proxy applies to all requests: Invidious API calls, yt-dlp, and comment page fetches.


Trigger via API

You can start a run programmatically from any language or CI/CD pipeline:

curl -X POST \
  "https://api.apify.com/v2/acts/<USERNAME>~youtube-comments-scraper-alt/runs?token=<APIFY_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "videoUrls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"],
    "maxCommentsPerVideo": 1000,
    "includeReplies": true,
    "fallbackToYtDlpComments": true
  }'

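Then read the results from the run's default dataset. The run response includes the dataset ID (defaultDatasetId), and the items are available at https://api.apify.com/v2/datasets/<DATASET_ID>/items.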
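
The same call can be made from Python with only the standard library. The sketch below builds the request shown in the curl command and the URL for reading dataset items; the helper names are hypothetical, and the response shape (a data object carrying defaultDatasetId) and the items endpoint are assumptions based on Apify's public API rather than on this page. The username stays a placeholder that you must supply.

```python
import json
import urllib.request
from typing import Any
from urllib.parse import quote, urlencode

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict[str, Any]) -> urllib.request.Request:
    """Build the POST request that starts a run; actor_id looks like '<USERNAME>~actor-name'."""
    url = f"{API_BASE}/acts/{quote(actor_id, safe='~')}/runs?{urlencode({'token': token})}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def dataset_items_url(dataset_id: str, token: str) -> str:
    """URL for reading a dataset's items as JSON (assumed endpoint shape)."""
    query = urlencode({"token": token, "format": "json"})
    return f"{API_BASE}/datasets/{quote(dataset_id)}/items?{query}"


def start_run(request: urllib.request.Request) -> dict[str, Any]:
    """Send the request and return the run object (assumed to sit under 'data')."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["data"]
```

A typical flow is to call start_run(build_run_request(...)), wait for the run to finish, and then fetch dataset_items_url(run["defaultDatasetId"], token).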


Troubleshooting

No comments returned / empty titles

The most common cause is YouTube blocking datacenter IPs. Fix: enable Residential Proxies in proxyConfiguration.

comments status ... unavailable

All Invidious mirrors failed and yt-dlp is either disabled or also blocked. Try:

  • Enable fallbackToYtDlpComments: true
  • Add more working mirrors to altInstances
  • Enable Residential Proxies in proxyConfiguration

Fewer comments than expected

  • Increase maxCommentPagesPerVideo (default is low for test runs)
  • Set maxCommentsPerVideo: 0 for no limit
  • Check the warnings field in OUTPUT for debug errors

Run timed out

  • Increase the run timeout under Input → Run options. If you leave it unset, it defaults to the 1 hour configured in actor.json
  • The scraper automatically stops 45 s before the deadline and saves everything collected so far
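
The "stop 45 s before the deadline" behaviour is a standard time-budget pattern, sketched below for illustration. RunBudget is a hypothetical class, not the actor's implementation, and it assumes the overall timeout has already been resolved to a positive number of seconds (the auto-detect case for actorTimeoutSec: 0 is not modelled).

```python
import time
from typing import Callable

FLUSH_MARGIN_SEC = 45  # safety margin taken from the documentation above


class RunBudget:
    """Tracks how much scraping time is left before the flush margin begins."""

    def __init__(self, actor_timeout_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # Work must stop this many seconds after start, leaving time to save data.
        self._deadline = clock() + actor_timeout_sec - FLUSH_MARGIN_SEC

    def remaining(self) -> float:
        """Seconds left before scraping should stop (may be negative)."""
        return self._deadline - self._clock()

    def should_stop(self) -> bool:
        return self.remaining() <= 0
```

A scraping loop would check budget.should_stop() before requesting each new page and push whatever it has collected once the check returns True.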

altInstances rejected

All provided URLs were filtered out. Ensure they are public http:// or https:// hosts, not localhost, 192.168.x.x, or other private ranges.
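
To check a list of mirrors before a run, a filter along the following lines can approximate the rule described above. It is a sketch under assumptions: is_public_instance is a hypothetical name, the actor's real filter may differ, and DNS names are accepted without being resolved, so a hostname that points at a private address would still pass.

```python
import ipaddress
from urllib.parse import urlparse


def is_public_instance(url: str) -> bool:
    """Return True for http(s) URLs whose host is not localhost or a non-public IP."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    host = parsed.hostname.lower()
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name; this sketch does not resolve it
    return address.is_global  # rejects private, loopback, and link-local ranges


def filter_instances(urls: list[str]) -> list[str]:
    """Keep only the URLs that pass the public-host check, preserving order."""
    return [u for u in urls if is_public_instance(u)]
```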


Limitations

  • Not affiliated with or endorsed by YouTube or Google
  • Returned fields depend on what each Invidious mirror exposes — some fields may be null
  • Comments on age-restricted or private videos cannot be retrieved
  • Very large videos (millions of comments) may hit the 100,000 per-video safety cap

FAQ

Does this require a YouTube API key? No. It uses public Invidious mirror APIs and yt-dlp, neither of which requires credentials.

Can I scrape multiple videos at once? Yes — add multiple URLs to videoUrls. They run in parallel.

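How do I get replies? Set includeReplies: true. The scraper follows reply continuation tokens to collect each thread in full. Replies are stored as comment records with depth: 1 and the parent's ID in parent_comment_id.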

What's the difference between timeoutSec and actorTimeoutSec? timeoutSec is the HTTP timeout per individual network request (default 20 s). actorTimeoutSec is the total wall-clock budget for the entire run — when set to 0, it auto-detects the Apify platform timeout.

Is the data saved if the actor times out? Yes. Data is pushed to your dataset as each video completes. If the run is killed mid-way, all fully-processed videos are already in the dataset.