YouTube Comment Scraper - Export to CSV/JSON
Pricing
from $0.40 / 1,000 comments
[$0.5/1000] ✨YouTube comments scraper to scrape, extract, export YouTube comments to CSV/JSON. YouTube comment extractor, downloader, API alternative for scraping video comments, replies, likes, author data. Bulk YouTube comments download, dataset generator. Best YouTube comment scraper without API.
Rating
5.0
(1)
Developer
Epic Scrapers
Maintained by Community
YouTube Comments Scraper ⭐

From $0.50 / 1,000 comments — Scrape every comment, reply, like count, and author profile from any YouTube video at scale. The easiest way to extract structured YouTube comment data without an API key or login.
Built for youtube.com — the world's largest video platform with over 2.7 billion monthly active users and billions of comments generated daily.
Provide video URLs, sort by top or newest, and control volume with a maximum number of comments per video. Returns structured data with 17 fields per comment including text, likes, author profile, timestamps, and video metadata. There is no cap on comments per run. No login required. No API key needed.
🚀 Features
- 🎯 Video URL input — Provide one or hundreds of YouTube video URLs. The actor processes them all in a single run, no batching needed.
- ⬆️⬇️ Sort by Top or Newest — Choose between the highest-engaged comments or the most recent discussion. Critical for sentiment analysis vs. real-time monitoring use cases.
- 🎛️ Configurable comment limit — Set a maximum per video (e.g. 100, 500, 1000) or pass `0` to scrape every single comment, even on videos with millions.
- 🔄 Auto-pagination — The actor automatically follows pagination across all comment threads and reply chains. No manual page clicking or cursor handling required.
- 🛡️ Residential proxy support — Uses Apify's residential proxy network to avoid rate limits and IP blocking during large-scale extractions.
- 📋 17-field structured output — Every comment returns text, like count, author ID, author name, author thumbnail, author URL, verified status, uploader badge, favorited status, timestamp (Unix + human-readable), parent comment ID, comment ID, video ID, video URL, and video title.
- 🔗 Reply threading preserved — Top-level comments and replies are linked through the `parent` field so you can reconstruct full conversation threads.
- 📦 Multiple videos, one run — Pass an array of URLs in `startUrls`. The actor sequentially scrapes each video and aggregates everything into a single clean dataset.
- ⚡ No API key needed — Works entirely through yt-dlp's extractors on public YouTube data. No YouTube Data API v3 quota, no OAuth setup, no API costs.
📋 What You Get
Every scraped comment returns 17 fields of structured data:
| Field | Type | Description | Example |
|---|---|---|---|
| `id` | string | Unique YouTube comment identifier | UgzNF6YV0mlS2GWceOZ4AaABAg |
| `parent` | string | Parent comment ID; "root" for top-level comments | root |
| `text` | string | Full text content of the comment, including emojis and line breaks | "I didn't get rickrolled today, I just really enjoy this song" |
| `like_count` | integer | Number of likes the comment has received | 363000 |
| `author_id` | string | Unique YouTube channel ID of the comment author | UCA1oEnRYKBmFDAcoKso2pAA |
| `author` | string | Display name of the YouTube channel | @CinematicCaptures |
| `author_thumbnail` | string | URL of the author's profile picture (88×88 pixels) | https://yt3.ggpht.com/... |
| `author_is_uploader` | boolean | Whether the author is the video uploader | false |
| `author_is_verified` | boolean | Whether the author's channel is verified by YouTube | true |
| `author_url` | string | Direct URL to the comment author's YouTube channel | https://www.youtube.com/@CinematicCaptures |
| `is_favorited` | boolean | Whether the comment was marked as a favorite by the uploader | false |
| `_time_text` | string | Human-readable relative time string | 5 years ago |
| `timestamp` | `integer \| null` | Unix timestamp (seconds) when the comment was posted | 1620950400 |
| `video_id` | string | YouTube video ID the comment belongs to | dQw4w9WgXcQ |
| `video_url` | string | Full URL of the YouTube video | https://www.youtube.com/watch?v=dQw4w9WgXcQ |
| `video_title` | string | Title of the YouTube video | "Rick Astley - Never Gonna Give You Up (Official Video) (4K Remaster)" |
💰 Pricing
$0.50 per 1,000 comments scraped. You only pay for the compute units consumed during the run.
| Tier | Comments Scraped | Estimated Compute Units | Estimated Cost |
|---|---|---|---|
| Small extraction | 500 | ~0.3 CU | ~$0.15 |
| Medium extraction | 5,000 | ~2.0 CU | ~$1.00 |
| Large extraction | 50,000 | ~15.0 CU | ~$7.50 |
| Massive extraction | 500,000 | ~120.0 CU | ~$60.00 |
Note: Actual compute unit consumption depends on video length, comment thread depth, and network conditions. The Apify platform charges per compute unit, and unused platform credits from your plan apply.
📥 Input
| Input | Type | Required | Default | Description |
|---|---|---|---|---|
| `startUrls` | array | ✅ Yes | — | One or more YouTube video URLs to scrape comments from |
| `sortCommentsBy` | string | ❌ No | `"top"` | Sort order: `"top"` for highest-engaged first, `"new"` for most recent first |
| `maxCommentsPerUrl` | integer | ❌ No | `100` | Maximum comments to scrape per video URL. Set to `0` for unlimited |
Example input:

    {
      "maxCommentsPerUrl": 100,
      "sortCommentsBy": "top",
      "startUrls": [
        "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
        "https://www.youtube.com/watch?v=9bZkp7q19f0"
      ]
    }

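The input object can also be assembled and sanity-checked in code before a run is started. The following is a minimal sketch in Python; `build_input` and `ALLOWED_SORTS` are hypothetical helper names, not part of the actor, and the checks only mirror the constraints described in the input table above.

```python
import json
from typing import Any

# Sort orders listed in the input table.
ALLOWED_SORTS = {"top", "new"}


def build_input(start_urls: list, sort_by: str = "top", max_per_url: int = 100) -> dict:
    """Return an input object shaped like the example above, or raise ValueError."""
    if not start_urls:
        raise ValueError("startUrls must contain at least one video URL")
    if sort_by not in ALLOWED_SORTS:
        raise ValueError(f"sortCommentsBy must be one of {sorted(ALLOWED_SORTS)}")
    if max_per_url < 0:
        raise ValueError("maxCommentsPerUrl must be 0 (unlimited) or a positive integer")
    payload: dict[str, Any] = {
        "startUrls": list(start_urls),
        "sortCommentsBy": sort_by,
        "maxCommentsPerUrl": max_per_url,
    }
    return payload


if __name__ == "__main__":
    print(json.dumps(build_input(["https://www.youtube.com/watch?v=dQw4w9WgXcQ"]), indent=2))
```

Validating locally like this is optional; it simply catches a typo in the sort order or a negative limit before any run is started.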
Example output (single comment):

    {
      "id": "UgzNF6YV0mlS2GWceOZ4AaABAg",
      "parent": "root",
      "text": "I didn't get rickrolled today, I just really enjoy this song",
      "like_count": 363000,
      "author_id": "UCA1oEnRYKBmFDAcoKso2pAA",
      "author": "@CinematicCaptures",
      "author_thumbnail": "https://yt3.ggpht.com/ytc/AIdro_m2p2NbssAK79AX2uQzEvNPnnEIAIlB-PdN9Q4GSiqP7k8=s88-c-k-c0x00ffffff-no-rj",
      "author_is_uploader": false,
      "author_is_verified": true,
      "author_url": "https://www.youtube.com/@CinematicCaptures",
      "is_favorited": false,
      "_time_text": "5 years ago",
      "timestamp": 1620950400,
      "video_id": "dQw4w9WgXcQ",
      "video_url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
      "video_title": "Rick Astley - Never Gonna Give You Up (Official Video) (4K Remaster)"
    }

💡 Use Cases
📊 Audience Sentiment Analysis for Content Creators
A YouTube creator with a channel of 500+ videos wants to understand how audience sentiment has shifted over the last year. Manually reading through comment sections is impossible at scale. Using this actor, the creator passes their entire video library as startUrls and sets maxCommentsPerUrl to 0 for unlimited extraction across all videos.
The resulting dataset contains every comment with its text, like_count, and timestamp. By feeding this into a sentiment classification pipeline (or even a spreadsheet with keyword scoring), the creator can track positive vs. negative sentiment per video, per month, or per content category. The like_count field weights comments by engagement, so a highly-upvoted critical comment carries more signal than an obscure one.
The outcome: data-driven content strategy. The creator knows exactly which topics, formats, and styles generate the most positive audience reaction — and which ones frustrate viewers.
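The keyword-scoring idea can be sketched in a few lines of Python. This is an illustration only: the `POSITIVE` and `NEGATIVE` word lists are hypothetical placeholders, and the weighting (`1 + like_count`) is one possible assumption about how to let engagement amplify a comment without silencing comments that have zero likes.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical keyword lists; a real pipeline would use a proper classifier.
POSITIVE = {"love", "great", "amazing"}
NEGATIVE = {"boring", "hate", "worst"}


def keyword_score(text: str) -> int:
    """Positive keywords minus negative keywords found in the comment."""
    words = set(text.lower().split())
    return len(words & POSITIVE) - len(words & NEGATIVE)


def monthly_weighted_sentiment(comments: list) -> dict:
    """Like-weighted average keyword score per calendar month (UTC)."""
    totals = defaultdict(float)
    weights = defaultdict(float)
    for c in comments:
        if c.get("timestamp") is None:  # timestamp may be null in the output
            continue
        month = datetime.fromtimestamp(c["timestamp"], tz=timezone.utc).strftime("%Y-%m")
        weight = 1 + (c.get("like_count") or 0)  # every comment counts at least once
        totals[month] += weight * keyword_score(c["text"])
        weights[month] += weight
    return {month: totals[month] / weights[month] for month in totals}
```

A result near +1 for a month means liked comments were mostly positive by this crude measure; a negative value suggests the most-liked comments were critical.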
🏢 Competitive Intelligence for Brand Managers
A brand manager at a consumer electronics company needs to track what people are saying about their latest product launch on YouTube. They also want to compare sentiment against two competitor launches from the same month.
The manager collects video URLs for their own product review videos, plus the competitors' top review videos. They run the actor across all URLs with sortCommentsBy: "new" to capture the freshest discussion. Each comment comes with text, author, author_is_verified, and like_count. Verified reviewers' opinions can be weighted more heavily. The timestamp field enables day-by-day sentiment trend lines.
Beyond simple sentiment, the manager searches comment text for specific feature mentions ("battery life", "camera quality", "price"). The structured export makes this trivially filterable in a spreadsheet or BI tool. The outcome: a competitive intelligence report built in hours, not weeks, with actual audience voice — not curated review summaries.
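Outside a spreadsheet, the same feature-mention filter might look like the sketch below. The `FEATURES` list and the `feature_mentions` helper are hypothetical; matching is a plain case-insensitive substring test, so "price" would also match "priceless".

```python
from collections import Counter

# Hypothetical feature phrases taken from the scenario above.
FEATURES = ["battery life", "camera quality", "price"]


def feature_mentions(comments: list, features: list = FEATURES) -> Counter:
    """Count how many comments mention each feature phrase at least once."""
    counts = Counter()
    for c in comments:
        text = c["text"].lower()
        for feature in features:
            if feature in text:  # simple substring match, case-insensitive
                counts[feature] += 1
    return counts
```

Running this separately on the comments for each brand's videos gives directly comparable mention counts.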
🎓 Academic Research on Online Discourse
A computational social science researcher is studying how political discourse differs across comment sections of news channels vs. independent creators. They need a large, representative sample of comments from at least 50 channels across the ideological spectrum.
The researcher compiles 200+ video URLs from relevant channels and runs the actor with maxCommentsPerUrl: 500 to get a balanced sample per video. The resulting dataset includes parent (for thread structure), author_id (for author-level analysis while preserving pseudonymity), timestamp, and like_count. Because _time_text is only a relative string such as "5 years ago", the timestamp field is the one to use for identifying viral spikes and brigading events.
With 100,000+ structured comments in a single dataset, the researcher can run network analysis on reply threads, measure engagement inequality across channels, and test hypotheses about echo chamber effects — all without scraping a single page manually or hitting YouTube's API quota limits.
📢 Community Management & Moderation at Scale
A social media agency managing 20+ YouTube channels for different clients needs a daily feed of new comments across all channels to catch emerging issues, harassment, or PR crises early.
They set up a recurring scheduled run of the actor (via Apify's scheduler or API) targeting their client videos with sortCommentsBy: "new" and maxCommentsPerUrl: 200. Each run captures the most recent comments, and the author_is_uploader flag helps distinguish creator replies from audience comments. The author_url field provides a direct link to investigate suspicious accounts.
The agency integrates the output with a moderation dashboard using Apify's webhook or API features. When a comment with high like_count contains negative keywords, it triggers an alert. The outcome: proactive community management that catches problems before they snowball, without a team of moderators refreshing comment pages all day.
🎯 Talent Scouting in Niche Communities
A music A&R executive believes the next breakout artist will be discovered through YouTube comment sections of niche music gear review channels. Guitarists, producers, and vocalists frequently share their work in comments, and eagle-eyed scouts can spot raw talent months before traditional discovery channels.
The executive targets 50 popular gear review and tutorial channels, collecting URLs for their most-watched videos. Running the actor with maxCommentsPerUrl: 1000 yields tens of thousands of comments. The author_url field becomes the key signal — each commenter has a direct link to their own YouTube channel. By filtering for comments with high like_count from non-verified accounts, the executive surfaces creators whose channels have organic engagement.
The author_id field enables deduplication to find the same commenter appearing across multiple videos — a strong signal of genuine passion in the space. The outcome: a talent discovery pipeline powered by actual community engagement data, not algorithm recommendations.
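The cross-video lookup described here amounts to grouping by `author_id`. A minimal sketch, assuming the dataset has been loaded as a list of dictionaries with the documented field names (`cross_video_commenters` is a hypothetical helper):

```python
from collections import defaultdict


def cross_video_commenters(comments: list, min_videos: int = 2) -> dict:
    """Map author_id -> sorted video IDs, for authors seen on at least min_videos videos."""
    videos_by_author = defaultdict(set)
    for c in comments:
        videos_by_author[c["author_id"]].add(c["video_id"])
    return {
        author: sorted(videos)
        for author, videos in videos_by_author.items()
        if len(videos) >= min_videos
    }
```

Using `author_id` rather than `author` keeps the grouping stable when a commenter renames their channel.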
🧠 Machine Learning Dataset Construction
A machine learning engineer is building a toxicity classifier for social media comments and needs a diverse, labeled dataset of YouTube comments spanning multiple languages, topics, and engagement levels.
The engineer selects 1,000 videos across categories (gaming, news, music, education, sports) and runs the actor with maxCommentsPerUrl: 0 to capture everything. The resulting dataset yields millions of comments with text, like_count, author, and timestamp. The author_is_verified and author_is_uploader flags help stratify the dataset by user type. The parent field identifies replies, which can be paired with their parent comments for conversational context.
By sampling across like_count percentiles, the engineer ensures the dataset includes both high-engagement (controversial or popular) and low-engagement comments. The structured JSON output is ready for integration with the Hugging Face datasets library, PyTorch DataLoader pipelines, or export to CSV for lightweight labeling projects. The outcome: a production-grade training dataset built in days instead of months, for a fraction of the cost of YouTube Data API v3 quota.
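Sampling across `like_count` percentiles could be done roughly as follows. This is a sketch using only the Python standard library; `stratified_sample`, the choice of four buckets, and the fixed seed are assumptions for illustration rather than anything the actor provides.

```python
import random
from statistics import quantiles


def stratified_sample(comments: list, per_bucket: int, buckets: int = 4, seed: int = 0) -> list:
    """Draw up to per_bucket comments from each like_count quantile bucket."""
    likes = [c["like_count"] for c in comments]
    cuts = quantiles(likes, n=buckets)  # buckets - 1 cut points; needs >= 2 comments
    groups = [[] for _ in range(buckets)]
    for c in comments:
        # Bucket index = number of cut points the like count exceeds.
        index = sum(c["like_count"] > cut for cut in cuts)
        groups[index].append(c)
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    sample = []
    for group in groups:
        sample.extend(rng.sample(group, min(per_bucket, len(group))))
    return sample
```

On real comment data most like counts are zero, so the lower buckets can collapse into one; inspecting the cut points before sampling is a sensible precaution.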
❓ Frequently Asked Questions
How do I specify which videos to scrape?
Pass an array of full YouTube video URLs in the startUrls field. Each URL should be a standard watch URL (https://www.youtube.com/watch?v=VIDEO_ID) or a short URL (https://youtu.be/VIDEO_ID). You can include any number of URLs — the actor processes them sequentially in a single run and outputs all comments into one unified dataset.
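When URLs come from mixed sources, it can help to normalize them first. The sketch below extracts the video ID from the two URL shapes mentioned above using Python's standard library; `extract_video_id` is a hypothetical helper, and how the actor itself parses URLs is not documented here.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

WATCH_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a watch URL or youtu.be short URL, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    if host in WATCH_HOSTS and parsed.path == "/watch":
        return parse_qs(parsed.query).get("v", [None])[0]
    return None


def to_watch_url(url: str) -> Optional[str]:
    """Rewrite either URL shape as a standard watch URL."""
    video_id = extract_video_id(url)
    return f"https://www.youtube.com/watch?v={video_id}" if video_id else None
```

Normalizing to one shape also makes it easy to drop duplicate videos from `startUrls` before paying to scrape them twice.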
What's the maximum number of comments I can scrape per video?
Unlimited. Set maxCommentsPerUrl to 0, and the actor will attempt to scrape every comment on the video, including all replies. For videos with millions of comments, the run will consume more compute units proportionally. For most use cases, we recommend starting with a limit of 500–1,000 to balance data quality with cost.
Do I need a YouTube API key or to log in?
No. The actor accesses publicly available YouTube comment data using yt-dlp's extractors, so no OAuth, API key, or login is required. You never have to worry about YouTube Data API v3 quota limits (10,000 quota units per day by default) or manage authentication tokens.
Does this scrape replies to comments, or just top-level comments?
Both. The actor scrapes top-level comments and their replies. Each reply includes a parent field containing the ID of the parent comment. Top-level comments have parent set to "root". This makes it straightforward to reconstruct full conversation threads or filter for just top-level comments.
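Thread reconstruction from the `parent` field can be sketched like this, assuming the dataset has been loaded as a list of dictionaries (`build_threads` is a hypothetical helper, not part of the actor):

```python
from collections import defaultdict


def build_threads(comments: list) -> list:
    """Group replies under their top-level comment using the parent field."""
    replies = defaultdict(list)
    roots = []
    for c in comments:
        if c["parent"] == "root":
            roots.append(c)
        else:
            replies[c["parent"]].append(c)
    # One entry per top-level comment, in dataset order.
    return [{"comment": root, "replies": replies.get(root["id"], [])} for root in roots]


def top_level_only(comments: list) -> list:
    """Keep only comments whose parent is "root"."""
    return [c for c in comments if c["parent"] == "root"]
```

Replies whose parent is missing from the loaded data (for example after filtering) are simply left out of the result.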
What geographic regions does the proxy cover?
The actor uses Apify's residential proxy network with US-based exit nodes by default. If you need proxies from a different country (e.g., to access region-locked comment sections or avoid geo-specific rate limits), the proxy configuration can be adjusted in the actor source. For most YouTube videos, comment data is globally accessible regardless of proxy location.
How fresh is the data?
The actor pulls live data from YouTube every time it runs. Comments are scraped in real-time — there is no cached or stale dataset. If you run the actor on the same video one week apart, you will see the new comments posted in the interim. For recurring monitoring, schedule the actor to run daily or weekly via the Apify Console scheduler.
What export formats are available?
Apify datasets can be exported in multiple formats: JSON, CSV, Excel (XLSX), XML, HTML table, and RSS. You can download directly from the Apify Console, stream via the Apify API, or push data to 200+ integrations including Google Sheets, Airtable, BigQuery, S3, PostgreSQL, and Slack.
How is this different from using the YouTube Data API directly?
The YouTube Data API v3 has significant limitations: a default daily quota of 10,000 units (a single commentThreads.list call costs 1 unit, and every additional page is another call), a maximum of 100 comments per page, no easy way to extract all replies without additional calls, and no built-in proxy rotation. This actor handles pagination, reply threading, proxy management, and retry logic automatically, while costing a fraction of what you'd spend engineering and maintaining your own YouTube API integration.
📚 Technical Details
How It Works
The actor uses yt-dlp, the popular YouTube extraction library, operating through Apify's residential proxy network. On each run, it takes your input configuration, iterates through each video URL, and extracts comment data using yt-dlp's getcomments mode. The extraction happens server-side on Apify's infrastructure — no browser rendering needed, resulting in fast, lightweight runs. Each comment is pushed as a structured record to the Apify dataset, from which you can export, stream, or integrate the data.
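For readers curious what a yt-dlp based extraction might look like, here is a rough sketch. It is not the actor's source code: the mapping from `sortCommentsBy` and `maxCommentsPerUrl` to yt-dlp's `comment_sort` and `max_comments` extractor arguments is an assumption, `proxy_url` is a hypothetical parameter, and `fetch_comments` needs the third-party `yt-dlp` package plus network access.

```python
from typing import Any, Optional


def build_ydl_options(sort_by: str = "top", max_comments: int = 100,
                      proxy_url: Optional[str] = None) -> dict:
    """Build a yt-dlp options dict for comment extraction (assumed mapping)."""
    limit = "all" if max_comments == 0 else str(max_comments)  # 0 means unlimited
    options: dict[str, Any] = {
        "getcomments": True,      # ask yt-dlp to collect comments
        "skip_download": True,    # metadata only, no video download
        "quiet": True,
        "extractor_retries": 3,   # retry count mentioned under Error Handling
        "extractor_args": {
            "youtube": {"comment_sort": [sort_by], "max_comments": [limit]},
        },
    }
    if proxy_url:
        options["proxy"] = proxy_url
    return options


def fetch_comments(video_url: str, options: dict) -> list:
    """Run the extraction; requires yt-dlp to be installed."""
    import yt_dlp  # imported lazily so the options helper works without it

    with yt_dlp.YoutubeDL(options) as ydl:
        info = ydl.extract_info(video_url, download=False)
    return info.get("comments") or []
```

The comment dictionaries yt-dlp returns already use keys such as `id`, `parent`, `text`, and `like_count`, which lines up with the output fields documented above; the video-level fields would have to be added from the surrounding `info` object.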
Error Handling
- Network errors and timeouts — Retried automatically up to 3 times per video via yt-dlp's `extractor_retries` setting before failing gracefully.
- Invalid or private video URLs — The actor logs a warning and continues to the next URL in the `startUrls` array. No data is pushed for inaccessible videos.
- Proxy failures — If the residential proxy is unavailable, the actor raises a clear error before attempting any extraction. Check your Apify proxy settings and region availability.
- Rate limiting — Apify's residential proxy rotation handles YouTube's rate limiting transparently. For extremely large extractions (500K+ comments), the proxy pool absorbs the load.
- Empty comment sections — Videos with comments disabled or zero comments are logged with a "No comments found" message and skipped cleanly.
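The skip-and-continue behaviour in the list above can be pictured as a loop like the one below. This is a generic sketch, not the actor's code: `scrape_all` is a hypothetical name, and `fetch` stands in for whatever function retrieves the comments for one URL.

```python
import logging
from typing import Callable

logger = logging.getLogger("comment_scraper")


def scrape_all(urls: list, fetch: Callable[[str], list]) -> list:
    """Collect comments for every URL, skipping failures and empty sections."""
    results = []
    for url in urls:
        try:
            comments = fetch(url)
        except Exception as exc:  # invalid, private, or unreachable video
            logger.warning("Skipping %s: %s", url, exc)
            continue
        if not comments:
            logger.info("No comments found for %s", url)
            continue
        results.extend(comments)
    return results
```

One failing URL therefore never aborts the run; it only produces a log line and no dataset rows.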
Data Integrity
- No duplicate comments — Each comment's `id` is a YouTube-assigned unique identifier. Deduplication is trivial when combining multiple runs.
- Original formatting preserved — Comment `text` includes emojis, line breaks, and special characters as they appear on YouTube. No sanitization or truncation.
- Reply threading guaranteed — Every comment record includes a `parent` field. Top-level comments always have `parent: "root"`. Replies always contain a valid parent comment ID present in the same dataset.
- Timestamp precision — The `timestamp` field is a Unix epoch in seconds. The `_time_text` field is a human-readable relative string for display; always use `timestamp` for programmatic sorting and filtering.
- Author permanence — `author_id` is YouTube's permanent channel identifier and will not change even if the user changes their display name (`author`).
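Combining several runs while relying on these guarantees might look like the following sketch. `merge_runs` is a hypothetical helper; it deduplicates on `id`, lets later runs overwrite earlier ones so like counts stay current, and sorts on `timestamp` with null timestamps placed last.

```python
def merge_runs(*runs: list) -> list:
    """Merge comment lists from several runs, deduplicated by id and sorted by time."""
    by_id = {}
    for run in runs:
        for comment in run:
            by_id[comment["id"]] = comment  # later runs win, keeping fresher like_count
    return sorted(
        by_id.values(),
        # False sorts before True, so comments without a timestamp go last.
        key=lambda c: (c["timestamp"] is None, c["timestamp"] or 0),
    )
```

Sorting on `timestamp` instead of `_time_text` avoids comparing strings like "5 years ago" and "2 weeks ago".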
⚠️ Disclaimer
This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by YouTube, Google LLC, or any of their subsidiaries. All trademarks are the property of their respective owners.
This Actor accesses only publicly available comment data on youtube.com. You are solely responsible for ensuring your use complies with YouTube's Terms of Service and applicable laws.
