YouTube Comment Scraper That Won’t Quit

Pricing

from $0.75 / 1,000 results

Bulk-extract YouTube comments (and replies) from video URLs with resilient pagination. Supports rotating Invidious mirror instances and an optional yt-dlp fallback for maximum uptime. Saves video metadata and comment records to the dataset output. No API key required.

Rating: 0.0 (0)

Developer: Inus Grobler (Maintained by Community)

Actor stats: 1 bookmarked, 9 total users, 4 monthly active users, last modified 14 days ago

YouTube Comments Scraper — Extract Comments & Replies from Any YouTube Video

Scrape YouTube comments and threaded replies at scale from one or more YouTube video URLs or video IDs. This Apify Actor uses multiple Invidious mirror instances for resilient extraction and can automatically fall back to yt-dlp when mirrors are unavailable.

✅ No YouTube API key required  |  ✅ Handles pagination  |  ✅ Extracts replies  |  ✅ Structured JSON output


What Can This Actor Do?

  • Scrape all comments from one or more YouTube videos (up to 100,000 comments per video)
  • Include full reply threads with recursive continuation pagination
  • Auto-rotate across multiple Invidious mirrors for maximum uptime
  • Fall back to yt-dlp when mirrors are blocked or rate-limited
  • Run videos in parallel — concurrency is configurable to balance speed, RAM, and proxy/API pressure
  • Stream data progressively — each completed video is saved before the actor moves on
  • Route traffic through Apify Residential Proxies (US default) to bypass YouTube IP blocks

Use Cases

| Goal | How This Helps |
| --- | --- |
| Sentiment analysis | Collect comments at scale for NLP pipelines and toxicity scoring |
| Brand monitoring | Track what people say about your product or channel |
| Audience research | Understand viewer reactions, pain points, and questions |
| Competitive intelligence | Analyse comments on competitor videos |
| Content moderation | Build training datasets for comment classifiers |
| Academic research | Extract public conversation data for social science studies |

Quick Start

  1. Add video URLs or IDs: paste YouTube watch URLs, Shorts URLs, youtu.be links, embed links, or raw 11-character video IDs into videoUrls
  2. Set comment limits: use maxCommentsPerVideo: 0 to collect as many comments as possible (up to the 100,000 safety cap), or set a specific number. Raise maxCommentPagesPerVideo as well, since its default of 2 pages also limits how many comments are returned
  3. Enable replies: set includeReplies: true to include reply threads
  4. Run: the actor streams results to your dataset as each video completes
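
The input formats listed in step 1 all reduce to the same 11-character video ID. The sketch below shows one way to normalise them before building videoUrls, for example to de-duplicate a list. It is an illustration only: extract_video_id is a hypothetical helper, not the Actor's own parser, and the set of recognised hosts is an assumption.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# YouTube video IDs are 11 characters from this alphabet.
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(value: str) -> Optional[str]:
    """Return the video ID from a URL or raw ID, or None if none is found."""
    value = value.strip()
    if _VIDEO_ID.fullmatch(value):
        return value  # already a raw ID

    parsed = urlparse(value if "://" in value else f"https://{value}")
    host = (parsed.hostname or "").lower()
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]
    parts = [p for p in parsed.path.split("/") if p]

    candidate = None
    if host == "youtu.be" and parts:
        candidate = parts[0]  # https://youtu.be/<id>
    elif host in {"youtube.com", "youtube-nocookie.com"}:
        if parts and parts[0] == "watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif len(parts) >= 2 and parts[0] in {"shorts", "embed"}:
            candidate = parts[1]  # /shorts/<id> or /embed/<id>

    if candidate and _VIDEO_ID.fullmatch(candidate):
        return candidate
    return None
```

Passing a mixed list through this helper and keeping the unique non-None results avoids scraping the same video twice when it appears once as a URL and once as a raw ID.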

Minimal Input Example

{
  "videoUrls": ["dQw4w9WgXcQ"],
  "maxCommentsPerVideo": 500,
  "includeReplies": true
}

Full Input Example

{
  "videoUrls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://youtu.be/T-HZHO_PQPY",
    "dQw4w9WgXcQ"
  ],
  "maxCommentsPerVideo": 0,
  "maxCommentPagesPerVideo": 100,
  "includeReplies": true,
  "maxVideosInParallel": 2,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "apifyProxyCountry": "US"
  }
}

Input Reference

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| videoUrls | string[] | | Required. YouTube watch URLs, Shorts URLs, youtu.be links, embed links, or raw 11-character video IDs |
| maxCommentsPerVideo | integer | 20 | Max comments to collect per video. 0 = unlimited (safety cap: 100,000) |
| maxCommentPagesPerVideo | integer | 2 | Max comment pages per video (safety cap: 1,000) |
| includeReplies | boolean | false | Include reply threads beneath each top-level comment |
| maxVideosInParallel | integer | 2 | Max videos to scrape concurrently. Lower values reduce RAM usage and proxy/API pressure |
| proxyConfiguration | object | US Residential | Apify proxy settings. US Residential recommended to avoid YouTube IP blocks |
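
To make the interaction between defaults and safety caps concrete, here is a minimal sketch of how a caller might pre-validate an input object against the table above. The defaults and caps come from the table. The normalize_input helper itself, and the choice to clamp rather than reject out-of-range values, are assumptions for illustration and may not match what the Actor does internally.

```python
from typing import Any, Dict

# Defaults and caps as documented in the input reference table.
DEFAULTS = {
    "maxCommentsPerVideo": 20,
    "maxCommentPagesPerVideo": 2,
    "includeReplies": False,
    "maxVideosInParallel": 2,
}
MAX_COMMENTS_CAP = 100_000
MAX_PAGES_CAP = 1_000


def normalize_input(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Apply documented defaults and clamp limits to the documented caps."""
    urls = raw.get("videoUrls")
    if not isinstance(urls, list) or not urls:
        raise ValueError("videoUrls is required and must be a non-empty list")

    cfg = {**DEFAULTS, **raw}

    # 0 means "unlimited", which is still bounded by the safety cap.
    max_comments = int(cfg["maxCommentsPerVideo"])
    if max_comments <= 0:
        max_comments = MAX_COMMENTS_CAP
    cfg["maxCommentsPerVideo"] = min(max_comments, MAX_COMMENTS_CAP)

    # Assumption: at least one page and one worker are always needed.
    pages = int(cfg["maxCommentPagesPerVideo"])
    cfg["maxCommentPagesPerVideo"] = max(1, min(pages, MAX_PAGES_CAP))
    cfg["maxVideosInParallel"] = max(1, int(cfg["maxVideosInParallel"]))
    return cfg
```

Note that with the defaults alone, a run is limited to 20 comments and 2 pages per video, so both limits usually need raising for a full extraction.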

Output

The actor pushes two record types to the default dataset:

Video Record (record_type: "video")

One row per input video — scraped before comments are collected.

{
  "record_type": "video",
  "video_id": "dQw4w9WgXcQ",
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "title": "Rick Astley - Never Gonna Give You Up (Official Music Video)",
  "channel_id": "UCuAXFkgsw1L7xaCfnd5JJOw",
  "channel_name": "Rick Astley",
  "channel_handle": "RickAstleyYT",
  "channel_url": "https://www.youtube.com/channel/UCuAXFkgsw1L7xaCfnd5JJOw",
  "view_count": 1600000000,
  "like_count": 16000000,
  "comment_count": 2800000,
  "published_at": "1987-07-27T00:00:00Z",
  "scraped_at": "2026-04-20T18:00:00Z"
}

Comment Record (record_type: "comment")

One row per comment or reply. Replies have depth: 1 and a non-empty parent_comment_id.

{
  "record_type": "comment",
  "video_id": "dQw4w9WgXcQ",
  "video_title": "Rick Astley - Never Gonna Give You Up",
  "comment_id": "UgxABC123",
  "parent_comment_id": "",
  "depth": 0,
  "author_name": "Jane Smith",
  "author_handle": "janesmith",
  "author_channel_id": "UCxxxxxx",
  "text": "This song never gets old!",
  "like_count": 4200,
  "reply_count": 31,
  "published_text": "3 years ago",
  "published_at": "2023-01-15T10:30:00Z",
  "is_pinned": false,
  "is_hearted": true,
  "author_is_channel_owner": false,
  "scraped_at": "2026-04-20T18:00:00Z"
}
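
Because comments and replies arrive as flat rows, downstream code usually needs to regroup them. The following sketch rebuilds threads from the record_type, parent_comment_id, comment_id, and video_id fields shown above. build_threads is a hypothetical helper, and it assumes replies always reference a top-level comment from the same dataset.

```python
from collections import defaultdict
from typing import Any, Dict, List

Row = Dict[str, Any]


def build_threads(items: List[Row]) -> Dict[str, List[Row]]:
    """Group flat dataset rows into {video_id: [comment + 'replies' list]}."""
    replies_by_parent: Dict[str, List[Row]] = defaultdict(list)
    top_level: Dict[str, List[Row]] = defaultdict(list)

    for item in items:
        if item.get("record_type") != "comment":
            continue  # skip video rows
        parent = item.get("parent_comment_id") or ""
        if parent:
            replies_by_parent[parent].append(item)
        else:
            top_level[item["video_id"]].append(item)

    # Attach each top-level comment's replies without mutating the input rows.
    return {
        video_id: [
            {**comment, "replies": replies_by_parent.get(comment["comment_id"], [])}
            for comment in comments
        ]
        for video_id, comments in top_level.items()
    }
```

Replies whose parent was not collected (for example, because a page limit cut the run short) are simply left out of the result, which is a deliberate simplification here.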

Key-Value Store

  • OUTPUT — run summary containing meta, input, totals, and warnings
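
The run summary can be read with the same Python client used for the dataset. This is a minimal sketch, assuming client is an ApifyClient instance and run is the dictionary returned by a finished run. read_run_summary is a hypothetical helper, and the exact shape of the warnings value is not documented here, so the code only returns the stored value as-is.

```python
from typing import Any, Dict, Optional


def read_run_summary(client: Any, run: Dict[str, Any]) -> Optional[Any]:
    """Return the value stored under the OUTPUT key of the run's default store."""
    store_id = run["defaultKeyValueStoreId"]
    record = client.key_value_store(store_id).get_record("OUTPUT")
    if record is None:
        return None  # the run did not write a summary
    return record["value"]
```

A typical use is to inspect the summary's warnings entry after a run that returned fewer rows than expected.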

How It Works

Input URLs
  ↓
Extract video metadata (Invidious API → yt-dlp fallback)
  ↓
Scrape comment pages with continuation tokens
  ├─ Try Invidious mirror 1 → mirror 2 → … mirror N
  │    (rotates until sufficient comments found)
  ├─ For each comment: collect inline replies
  │    + follow reply continuation pages
  └─ If all mirrors fail → yt-dlp fallback (direct YouTube)
  ↓
Stream results to dataset (per-video, progressive)

Videos are scraped concurrently up to maxVideosInParallel. Data is pushed to your dataset as each video completes, so completed videos remain available even if a later video runs into a timeout.
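
The rotation-then-fallback flow and the per-video concurrency can be sketched generically as follows. This is not the Actor's source code: the source callables are placeholders supplied by the caller, "sufficient comments" is simplified to "any comments at all", and a thread pool stands in for whatever concurrency mechanism the Actor really uses.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Sequence, Tuple

Fetcher = Callable[[str], list]
Source = Tuple[str, Fetcher]  # (name, function that returns comments for a video ID)


def fetch_with_fallback(video_id: str, sources: Sequence[Source]) -> Tuple[str, list]:
    """Try each source in order and return (source_name, comments) from the first success."""
    errors: List[str] = []
    for name, fetch in sources:
        try:
            comments = fetch(video_id)
        except Exception as exc:  # a blocked or rate-limited source
            errors.append(f"{name}: {exc}")
            continue
        if comments:
            return name, comments
        errors.append(f"{name}: no comments returned")
    raise RuntimeError(f"all sources failed for {video_id}: {errors}")


def scrape_videos(
    video_ids: Sequence[str],
    sources: Sequence[Source],
    max_videos_in_parallel: int = 2,
) -> Tuple[Dict[str, Tuple[str, list]], List[str]]:
    """Scrape videos concurrently, keeping each result as soon as it completes."""
    results: Dict[str, Tuple[str, list]] = {}
    warnings: List[str] = []
    with ThreadPoolExecutor(max_workers=max_videos_in_parallel) as pool:
        futures = {pool.submit(fetch_with_fallback, v, sources): v for v in video_ids}
        for future in as_completed(futures):
            video_id = futures[future]
            try:
                # A real run would push rows to the dataset at this point.
                results[video_id] = future.result()
            except RuntimeError as exc:
                warnings.append(str(exc))
    return results, warnings
```

In this arrangement the mirrors would come first in sources and the yt-dlp based fetcher last, so the direct route is only used once every mirror has failed or returned nothing.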


Proxy Configuration

YouTube aggressively blocks requests from datacenter IP ranges (which Apify's default infrastructure uses). This causes empty results or "Sign in to confirm you're not a bot" errors.

Solution: US Residential Proxies (the default prefill) route requests through real home IP addresses, making the scraper appear as a regular user.

| Proxy Type | Reliability | Cost |
| --- | --- | --- |
| Residential (US) ✅ recommended | High: household IPs, rarely blocked | Higher |
| Datacenter | Low: often blocked by YouTube | Lower |
| None | Very low on Apify IPs | Free |

The proxy applies to all requests: Invidious API calls, yt-dlp, and comment page fetches.


Use With The Apify API

You can run the Actor from Python with the official Apify API client and read the dataset items after the run finishes.

$ pip install apify-client

import os

from apify_client import ApifyClient

# Read the API token from the environment rather than hard-coding it.
APIFY_TOKEN = os.environ["APIFY_TOKEN"]
ACTOR_ID = "thescrapelab/Apify-YouTube-Comment-Scraper-2-0"

client = ApifyClient(APIFY_TOKEN)

run_input = {
    "videoUrls": [
        "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
        "https://youtu.be/T-HZHO_PQPY",
        "dQw4w9WgXcQ",
    ],
    "maxCommentsPerVideo": 500,
    "maxCommentPagesPerVideo": 25,
    "includeReplies": True,
    "maxVideosInParallel": 2,
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
        "apifyProxyCountry": "US",
    },
}

# Start the Actor and wait for the run to finish.
run = client.actor(ACTOR_ID).call(run_input=run_input)
if run is None:
    raise RuntimeError("Actor run failed")

# Read the items from the run's default dataset.
dataset_id = run["defaultDatasetId"]
items = client.dataset(dataset_id).list_items().items

# The dataset mixes video rows and comment rows; split them by record_type.
videos = [item for item in items if item.get("record_type") == "video"]
comments = [item for item in items if item.get("record_type") == "comment"]

print(f"Videos: {len(videos)}")
print(f"Comments and replies: {len(comments)}")
for comment in comments[:5]:
    print(comment["video_id"], comment.get("author_name"), comment.get("text"))

You can also start a run with a direct HTTP request:

curl -X POST \
  "https://api.apify.com/v2/acts/thescrapelab~Apify-YouTube-Comment-Scraper-2-0/runs?waitForFinish=60&token=<APIFY_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "videoUrls": ["dQw4w9WgXcQ"],
    "maxCommentsPerVideo": 1000,
    "includeReplies": true
  }'

After the run succeeds, read items from the defaultDatasetId returned in the run response. The waitForFinish parameter on this endpoint is capped at 60 seconds, so for longer runs poll the run until its status is SUCCEEDED before reading the dataset.
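
Reading the dataset over plain HTTP can be sketched with the standard library alone. The example assumes the Apify dataset items endpoint (/v2/datasets/{datasetId}/items) with its format, clean, offset, and limit query parameters, and passes the token as a query parameter as in the curl call above. The function names and the page size are illustrative choices.

```python
import json
from typing import Callable, List, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, token: str, offset: int = 0, limit: int = 1000) -> str:
    """Build the URL for one page of dataset items."""
    query = urlencode(
        {"format": "json", "clean": "true", "offset": offset, "limit": limit, "token": token}
    )
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


def _http_get_json(url: str) -> list:
    with urlopen(url) as response:
        return json.load(response)


def fetch_all_items(
    dataset_id: str,
    token: str,
    page_size: int = 1000,
    fetch: Optional[Callable[[str], list]] = None,
) -> List[dict]:
    """Page through the dataset until a short page signals the end."""
    fetch = fetch or _http_get_json
    items: List[dict] = []
    offset = 0
    while True:
        page = fetch(dataset_items_url(dataset_id, token, offset, page_size))
        items.extend(page)
        if len(page) < page_size:
            return items
        offset += page_size
```

The fetch parameter exists only so the paging logic can be exercised without network access. In normal use it is left unset.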


Troubleshooting

No comments returned / empty titles

The most common cause is YouTube blocking datacenter IPs. Fix: enable Residential Proxies in proxyConfiguration.

comments status ... unavailable

All built-in comment sources were unavailable or blocked. Try:

  • Enable residential proxy

Fewer comments than expected

  • Increase maxCommentPagesPerVideo (default is low for test runs)
  • Set maxCommentsPerVideo: 0 to use the hard safety cap of 100,000 comments per video
  • Check the warnings field in OUTPUT for debug errors
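
To see which videos came back short, the collected rows can be compared with the comment_count on each video record. The sketch below is a rough diagnostic only. comment_coverage is a hypothetical helper, and it assumes comment_count is comparable to the number of comment rows, which may not hold if that count includes replies while includeReplies was off.

```python
from collections import Counter
from typing import Any, Dict, List


def comment_coverage(items: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Report collected vs. reported comment counts per video."""
    collected = Counter(
        item["video_id"] for item in items if item.get("record_type") == "comment"
    )
    report: Dict[str, Dict[str, Any]] = {}
    for item in items:
        if item.get("record_type") != "video":
            continue
        reported = item.get("comment_count")
        got = collected.get(item["video_id"], 0)
        report[item["video_id"]] = {
            "collected": got,
            "reported": reported,
            # None when the source did not expose a usable count.
            "ratio": (got / reported) if reported else None,
        }
    return report
```

A low ratio on a video with many comments usually points to maxCommentPagesPerVideo or maxCommentsPerVideo rather than to blocking.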

Run timed out

  • Increase the run timeout in Run options. The default run timeout is 1 hour
  • The scraper automatically stops 45 s before the deadline and saves everything collected so far

Limitations

  • Not affiliated with or endorsed by YouTube or Google
  • Returned fields depend on what each Invidious mirror exposes — some fields may be null
  • Comments on age-restricted or private videos cannot be retrieved
  • Very large videos (millions of comments) may hit the 100,000 per-video safety cap

FAQ

Does this require a YouTube API key? No. It uses public Invidious mirror APIs and yt-dlp, neither of which requires credentials.

Can I scrape multiple videos at once? Yes — add multiple URLs or video IDs to videoUrls. They run in parallel up to maxVideosInParallel.

How do I get replies? Set includeReplies: true. The scraper follows reply continuation tokens for full thread depth.

Is the data saved if the actor times out? Yes. Data is pushed to your dataset as each video completes. If the run is killed mid-way, all fully-processed videos are already in the dataset.