Youtube Comment Scraper - Export to CSV/JSON

Pricing

from $0.40 / 1,000 comments

[$0.5/1000] ✨YouTube comments scraper to scrape, extract, export YouTube comments to CSV/JSON. YouTube comment extractor, downloader, API alternative for scraping video comments, replies, likes, author data. Bulk YouTube comments download, dataset generator. Best YouTube comment scraper without API.

Rating

5.0

(1)

Developer

Epic Scrapers

Maintained by Community

Actor stats

Bookmarked: 2
Total users: 5
Monthly active users: 1
Last modified: 6 days ago

YouTube Comments Scraper ⭐

From $0.50 / 1,000 comments — Scrape every comment, reply, like count, and author profile from any YouTube video at scale. The easiest way to extract structured YouTube comment data without an API key or login.

Built for youtube.com — the world's largest video platform with over 2.7 billion monthly active users and billions of comments generated daily.

Provide one or more video URLs, sort by top or newest, and cap volume with a maximum number of comments per video. Each comment is returned as structured data with 16 fields, including text, likes, author profile, timestamps, and video metadata. There is no cap on comments per run. No login required. No API key needed.

🚀 Features

  • 🎯 Video URL input — Provide one or hundreds of YouTube video URLs. The actor processes them all in a single run, no batching needed.
  • ⬆️⬇️ Sort by Top or Newest — Choose between the highest-engaged comments or the most recent discussion. Critical for sentiment analysis vs. real-time monitoring use cases.
  • 🎛️ Configurable comment limit — Set a maximum per video (e.g. 100, 500, 1000) or pass 0 to scrape every single comment, even on videos with millions.
  • 🔄 Auto-pagination — The actor automatically follows pagination across all comment threads and reply chains. No manual page clicking or cursor handling required.
  • 🛡️ Residential proxy support — Uses Apify's residential proxy network to avoid rate limits and IP blocking during large-scale extractions.
  • 📋 16-field structured output: Every comment returns text, like count, author ID, author name, author thumbnail, author URL, verified status, uploader badge, favorited status, timestamp (Unix and human-readable), parent comment ID, comment ID, video ID, video URL, and video title.
  • 🔗 Reply threading preserved — Top-level comments and replies are linked through the parent field so you can reconstruct full conversation threads.
  • 📦 Multiple videos, one run — Pass an array of URLs in startUrls. The actor sequentially scrapes each video and aggregates everything into a single clean dataset.
  • ⚡ No API key needed — Works entirely through yt-dlp's extractors on public YouTube data. No YouTube Data API v3 quota, no OAuth setup, no API costs.

📋 What You Get

Every scraped comment returns 16 fields of structured data:

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| id | string | Unique YouTube comment identifier | UgzNF6YV0mlS2GWceOZ4AaABAg |
| parent | string | Parent comment ID; "root" for top-level comments | root |
| text | string | Full text content of the comment, including emojis and line breaks | "I didn't get rickrolled today, I just really enjoy this song" |
| like_count | integer | Number of likes the comment has received | 363000 |
| author_id | string | Unique YouTube channel ID of the comment author | UCA1oEnRYKBmFDAcoKso2pAA |
| author | string | Display name of the YouTube channel | @CinematicCaptures |
| author_thumbnail | string | URL of the author's profile picture (88×88 pixels) | https://yt3.ggpht.com/... |
| author_is_uploader | boolean | Whether the author is the video uploader | false |
| author_is_verified | boolean | Whether the author's channel is verified by YouTube | true |
| author_url | string | Direct URL to the comment author's YouTube channel | https://www.youtube.com/@CinematicCaptures |
| is_favorited | boolean | Whether the comment was marked as a favorite by the uploader | false |
| _time_text | string | Human-readable relative time string | 5 years ago |
| timestamp | integer or null | Unix timestamp (seconds) when the comment was posted | 1620950400 |
| video_id | string | YouTube video ID the comment belongs to | dQw4w9WgXcQ |
| video_url | string | Full URL of the YouTube video | https://www.youtube.com/watch?v=dQw4w9WgXcQ |
| video_title | string | Title of the YouTube video | "Rick Astley - Never Gonna Give You Up (Official Video) (4K Remaster)" |

💰 Pricing

$0.50 per 1,000 comments scraped. You only pay for the compute units consumed during the run.

| Tier | Comments Scraped | Estimated Compute Units | Estimated Cost |
| --- | --- | --- | --- |
| Small extraction | 500 | ~0.3 CU | ~$0.15 |
| Medium extraction | 5,000 | ~2.0 CU | ~$1.00 |
| Large extraction | 50,000 | ~15.0 CU | ~$7.50 |
| Massive extraction | 500,000 | ~120.0 CU | ~$60.00 |

Note: Actual compute unit consumption depends on video length, comment thread depth, and network conditions. The Apify platform charges per compute unit, and unused platform credits from your plan apply.

📥 Input

| Input | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| startUrls | array | ✅ Yes | none | One or more YouTube video URLs to scrape comments from |
| sortCommentsBy | string | ❌ No | "top" | Sort order: "top" for highest-engaged first, "new" for most recent first |
| maxCommentsPerUrl | integer | ❌ No | 100 | Maximum comments to scrape per video URL. Set to 0 for unlimited |

Example input:

{
  "maxCommentsPerUrl": 100,
  "sortCommentsBy": "top",
  "startUrls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://www.youtube.com/watch?v=9bZkp7q19f0"
  ]
}
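
As a rough illustration, the input object can be assembled and sanity-checked in Python before a run is started. This is a minimal sketch: the helper name build_actor_input is hypothetical, and the validation rules are inferred from the input table (required startUrls, "top" or "new" sorting, 0 meaning unlimited), not taken from the actor's source.

```python
import json
from typing import Any

# Sort orders listed in the input table.
ALLOWED_SORTS = {"top", "new"}


def build_actor_input(
    urls: list[str], sort_by: str = "top", max_comments: int = 100
) -> dict[str, Any]:
    """Assemble an input object from the three documented fields."""
    if not urls:
        raise ValueError("startUrls is required and must not be empty")
    if sort_by not in ALLOWED_SORTS:
        raise ValueError(f"sortCommentsBy must be one of {sorted(ALLOWED_SORTS)}")
    if max_comments < 0:
        raise ValueError("maxCommentsPerUrl must be 0 (unlimited) or positive")
    return {
        "startUrls": list(urls),
        "sortCommentsBy": sort_by,
        "maxCommentsPerUrl": max_comments,
    }


if __name__ == "__main__":
    payload = build_actor_input(["https://www.youtube.com/watch?v=dQw4w9WgXcQ"])
    print(json.dumps(payload, indent=2))
```

The resulting dictionary can be pasted into the Apify Console input editor as JSON or passed to whichever API client you use.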

Example output (single comment):

{
  "id": "UgzNF6YV0mlS2GWceOZ4AaABAg",
  "parent": "root",
  "text": "I didn't get rickrolled today, I just really enjoy this song",
  "like_count": 363000,
  "author_id": "UCA1oEnRYKBmFDAcoKso2pAA",
  "author": "@CinematicCaptures",
  "author_thumbnail": "https://yt3.ggpht.com/ytc/AIdro_m2p2NbssAK79AX2uQzEvNPnnEIAIlB-PdN9Q4GSiqP7k8=s88-c-k-c0x00ffffff-no-rj",
  "author_is_uploader": false,
  "author_is_verified": true,
  "author_url": "https://www.youtube.com/@CinematicCaptures",
  "is_favorited": false,
  "_time_text": "5 years ago",
  "timestamp": 1620950400,
  "video_id": "dQw4w9WgXcQ",
  "video_url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "video_title": "Rick Astley - Never Gonna Give You Up (Official Video) (4K Remaster)"
}
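
A record like the one above can be consumed with two small helpers. This is only a sketch: the function names are illustrative, and it assumes that timestamp may be null, as the field table indicates.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def is_reply(comment: dict[str, Any]) -> bool:
    """Top-level comments carry parent == "root"; anything else is a reply."""
    return comment.get("parent", "root") != "root"


def posted_at(comment: dict[str, Any]) -> Optional[datetime]:
    """Convert the Unix timestamp (seconds) to an aware UTC datetime."""
    ts = comment.get("timestamp")
    if ts is None:  # the field is documented as integer or null
        return None
    return datetime.fromtimestamp(ts, tz=timezone.utc)


if __name__ == "__main__":
    sample = {"parent": "root", "timestamp": 1620950400}
    print(is_reply(sample), posted_at(sample))
```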

💡 Use Cases

📊 Audience Sentiment Analysis for Content Creators

A YouTube creator with a channel of 500+ videos wants to understand how audience sentiment has shifted over the last year. Manually reading through comment sections is impossible at scale. Using this actor, the creator passes their entire video library as startUrls and sets maxCommentsPerUrl to 0 for unlimited extraction across all videos.

The resulting dataset contains every comment with its text, like_count, and timestamp. By feeding this into a sentiment classification pipeline (or even a spreadsheet with keyword scoring), the creator can track positive vs. negative sentiment per video, per month, or per content category. The like_count field weights comments by engagement, so a highly-upvoted critical comment carries more signal than an obscure one.

The outcome: data-driven content strategy. The creator knows exactly which topics, formats, and styles generate the most positive audience reaction — and which ones frustrate viewers.
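
The keyword-scoring idea can be sketched in a few lines. Everything here is an assumption for illustration: the word lists are placeholders, and weighting each comment by 1 + like_count is one possible scheme, not something the actor does for you.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Placeholder word lists; a real pipeline would use a proper sentiment model.
POSITIVE = {"love", "great", "amazing"}
NEGATIVE = {"boring", "hate", "worst"}


def keyword_score(text: str) -> int:
    """+1 per distinct positive keyword present, -1 per distinct negative one."""
    words = {w.strip(".,!?\"'").lower() for w in text.split()}
    return len(words & POSITIVE) - len(words & NEGATIVE)


def monthly_weighted_sentiment(comments: list[dict]) -> dict[str, float]:
    """Like-weighted average keyword score per calendar month (UTC)."""
    totals: dict[str, float] = defaultdict(float)
    weights: dict[str, float] = defaultdict(float)
    for c in comments:
        ts = c.get("timestamp")
        if ts is None:
            continue  # cannot place the comment on a timeline
        month = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m")
        weight = 1 + (c.get("like_count") or 0)
        totals[month] += weight * keyword_score(c.get("text", ""))
        weights[month] += weight
    return {m: totals[m] / weights[m] for m in totals}
```

Because of the weighting, a critical comment with thousands of likes pulls the monthly score down far more than a critical comment nobody upvoted.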

🏢 Competitive Intelligence for Brand Managers

A brand manager at a consumer electronics company needs to track what people are saying about their latest product launch on YouTube. They also want to compare sentiment against two competitor launches from the same month.

The manager collects video URLs for their own product review videos, plus the competitors' top review videos. They run the actor across all URLs with sortCommentsBy: "new" to capture the freshest discussion. Each comment comes with text, author, author_is_verified, and like_count. Verified reviewers' opinions can be weighted more heavily. The timestamp field enables day-by-day sentiment trend lines.

Beyond simple sentiment, the manager searches comment text for specific feature mentions ("battery life", "camera quality", "price"). The structured export makes this trivially filterable in a spreadsheet or BI tool. The outcome: a competitive intelligence report built in hours, not weeks, with actual audience voice — not curated review summaries.
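
A feature-mention count over the exported records might look like the following sketch. The function name and the plain substring matching are assumptions; real analyses usually need stemming or synonyms.

```python
def count_feature_mentions(comments: list[dict], features: list[str]) -> dict[str, int]:
    """Count how many comments mention each feature phrase (case-insensitive)."""
    counts = {feature: 0 for feature in features}
    for comment in comments:
        text = comment.get("text", "").lower()
        for feature in features:
            if feature.lower() in text:
                counts[feature] += 1
    return counts


if __name__ == "__main__":
    rows = [
        {"text": "Battery life is awful"},
        {"text": "Great camera quality and battery life"},
    ]
    print(count_feature_mentions(rows, ["battery life", "camera quality", "price"]))
```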

🎓 Academic Research on Online Discourse

A computational social science researcher is studying how political discourse differs across comment sections of news channels vs. independent creators. They need a large, representative sample of comments from at least 50 channels across the ideological spectrum.

The researcher compiles 200+ video URLs from relevant channels and runs the actor with maxCommentsPerUrl: 500 to get a balanced sample per video. The resulting dataset includes parent (for thread structure), author_id (for author-level analysis while preserving pseudonymity), timestamp, and like_count. Bucketing comments by timestamp helps identify viral spikes and brigading events; _time_text is only a relative display string and is too coarse for this.

With 100,000+ structured comments in a single dataset, the researcher can run network analysis on reply threads, measure engagement inequality across channels, and test hypotheses about echo chamber effects — all without scraping a single page manually or hitting YouTube's API quota limits.

📢 Community Management & Moderation at Scale

A social media agency managing 20+ YouTube channels for different clients needs a daily feed of new comments across all channels to catch emerging issues, harassment, or PR crises early.

They set up a recurring scheduled run of the actor (via Apify's scheduler or API) targeting their client videos with sortCommentsBy: "new" and maxCommentsPerUrl: 200. Each run captures the most recent comments, and the author_is_uploader flag helps distinguish creator replies from audience comments. The author_url field provides a direct link to investigate suspicious accounts.

The agency integrates the output with a moderation dashboard using Apify's webhook or API features. When a comment with high like_count contains negative keywords, it triggers an alert. The outcome: proactive community management that catches problems before they snowball, without a team of moderators refreshing comment pages all day.
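
The alert rule described above could be expressed as a simple predicate. This is a sketch under assumptions: the 50-like threshold and the keyword list are arbitrary placeholders, and skipping uploader comments is a design choice, not actor behavior.

```python
def needs_alert(comment: dict, negative_keywords: list[str], min_likes: int = 50) -> bool:
    """Flag audience comments that are both well-liked and contain a negative keyword."""
    if comment.get("author_is_uploader"):
        return False  # creator replies are not audience complaints
    text = comment.get("text", "").lower()
    popular = (comment.get("like_count") or 0) >= min_likes
    return popular and any(keyword.lower() in text for keyword in negative_keywords)


if __name__ == "__main__":
    comment = {"text": "This is a scam", "like_count": 120, "author_is_uploader": False}
    print(needs_alert(comment, ["scam", "refund"]))
```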

🎯 Talent Scouting in Niche Communities

A music A&R executive believes the next breakout artist will be discovered through YouTube comment sections of niche music gear review channels. Guitarists, producers, and vocalists frequently share their work in comments, and eagle-eyed scouts can spot raw talent months before traditional discovery channels.

The executive targets 50 popular gear review and tutorial channels, collecting URLs for their most-watched videos. Running the actor with maxCommentsPerUrl: 1000 yields tens of thousands of comments. The author_url field becomes the key signal — each commenter has a direct link to their own YouTube channel. By filtering for comments with high like_count from non-verified accounts, the executive surfaces creators whose channels have organic engagement.

The author_id field enables deduplication to find the same commenter appearing across multiple videos — a strong signal of genuine passion in the space. The outcome: a talent discovery pipeline powered by actual community engagement data, not algorithm recommendations.
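
Finding commenters who show up under several videos is a small grouping exercise. A minimal sketch, assuming every record carries author_id and video_id as in the field table; the function name and the default threshold of two videos are illustrative.

```python
from collections import defaultdict


def recurring_authors(comments: list[dict], min_videos: int = 2) -> dict[str, int]:
    """Map author_id to the number of distinct videos that author commented on."""
    videos_by_author: dict[str, set] = defaultdict(set)
    for comment in comments:
        videos_by_author[comment["author_id"]].add(comment["video_id"])
    return {
        author: len(videos)
        for author, videos in videos_by_author.items()
        if len(videos) >= min_videos
    }
```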

🧠 Machine Learning Dataset Construction

A machine learning engineer is building a toxicity classifier for social media comments and needs a diverse, labeled dataset of YouTube comments spanning multiple languages, topics, and engagement levels.

The engineer selects 1,000 videos across categories (gaming, news, music, education, sports) and runs the actor with maxCommentsPerUrl: 0 to capture everything. The resulting dataset yields millions of comments with text, like_count, author, and timestamp. The author_is_verified and author_is_uploader flags help stratify the dataset by user type. The parent field identifies replies, which can be paired with their parent comments for conversational context.

By sampling across like_count percentiles, the engineer ensures the dataset includes both high-engagement (controversial or popular) and low-engagement comments. The structured JSON output can be loaded with the Hugging Face datasets library, fed into PyTorch DataLoader pipelines, or exported to CSV for lightweight labeling projects. The outcome: a production-grade training dataset built in days instead of months, without being constrained by YouTube Data API v3 quota.
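
Sampling across like_count percentiles can be approximated by sorting and slicing into equal-sized buckets. This is a sketch, not the engineer's actual method: the bucket count, per-bucket sample size, and fixed seed are all assumptions.

```python
import random


def stratified_sample(
    comments: list[dict], n_buckets: int = 4, per_bucket: int = 100, seed: int = 0
) -> list[dict]:
    """Sample up to per_bucket comments from each like_count quantile bucket."""
    if not comments:
        return []
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    ordered = sorted(comments, key=lambda c: c.get("like_count") or 0)
    size = -(-len(ordered) // n_buckets)  # ceiling division
    sample: list[dict] = []
    for start in range(0, len(ordered), size):
        bucket = ordered[start:start + size]
        sample.extend(rng.sample(bucket, min(per_bucket, len(bucket))))
    return sample
```

When the number of comments is not divisible by n_buckets, the last bucket is smaller and there may be fewer than n_buckets buckets in total.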

❓ Frequently Asked Questions

How do I specify which videos to scrape?

Pass an array of full YouTube video URLs in the startUrls field. Each URL should be a standard watch URL (https://www.youtube.com/watch?v=VIDEO_ID) or a short URL (https://youtu.be/VIDEO_ID). You can include any number of URLs — the actor processes them sequentially in a single run and outputs all comments into one unified dataset.
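
If your URL list comes from mixed sources, it can help to normalize it first. The sketch below pulls the video ID out of the two URL shapes named above; the helper name and the set of accepted hostnames are assumptions, and other shapes (Shorts, embeds) are deliberately rejected.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

WATCH_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID for a watch URL or a youtu.be short URL, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0] or None
    if host in WATCH_HOSTS and parsed.path == "/watch":
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None


if __name__ == "__main__":
    print(extract_video_id("https://youtu.be/dQw4w9WgXcQ"))
```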

What's the maximum number of comments I can scrape per video?

Unlimited. Set maxCommentsPerUrl to 0, and the actor will attempt to scrape every comment on the video, including all replies. For videos with millions of comments, the run will consume more compute units proportionally. For most use cases, we recommend starting with a limit of 500–1,000 to balance data quality with cost.

Do I need a YouTube API key or to log in?

No. The actor accesses publicly available YouTube comment data using yt-dlp's extractors, so no OAuth, API key, or login is required. This means you never have to worry about YouTube Data API v3 quota limits (10,000 quota units per day by default) or manage authentication tokens.

Does this scrape replies to comments, or just top-level comments?

Both. The actor scrapes top-level comments and their replies. Each reply includes a parent field containing the ID of the parent comment. Top-level comments have parent set to "root". This makes it straightforward to reconstruct full conversation threads or filter for just top-level comments.
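
Rebuilding threads from the parent field takes one pass over the dataset. A minimal sketch, assuming each record has id and parent as documented; the output shape (a list of dictionaries with comment and replies keys) is an illustrative choice.

```python
from collections import defaultdict


def build_threads(comments: list[dict]) -> list[dict]:
    """Group replies under their top-level comment using the parent field."""
    replies: dict[str, list] = defaultdict(list)
    top_level: list[dict] = []
    for comment in comments:
        if comment["parent"] == "root":
            top_level.append(comment)
        else:
            replies[comment["parent"]].append(comment)
    # Preserve the dataset order of top-level comments.
    return [{"comment": c, "replies": replies.get(c["id"], [])} for c in top_level]
```

Filtering for top-level comments only is the same check in isolation: keep records whose parent equals "root".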

What geographic regions does the proxy cover?

The actor uses Apify's residential proxy network with US-based exit nodes by default. If you need proxies from a different country (e.g., to access region-locked comment sections or avoid geo-specific rate limits), the proxy configuration can be adjusted in the actor source. For most YouTube videos, comment data is globally accessible regardless of proxy location.

How fresh is the data?

The actor pulls live data from YouTube every time it runs. Comments are scraped in real-time — there is no cached or stale dataset. If you run the actor on the same video one week apart, you will see the new comments posted in the interim. For recurring monitoring, schedule the actor to run daily or weekly via the Apify Console scheduler.

What export formats are available?

Apify datasets can be exported in multiple formats: JSON, CSV, Excel (XLSX), XML, HTML table, and RSS. You can download directly from the Apify Console, stream via the Apify API, or push data to 200+ integrations including Google Sheets, Airtable, BigQuery, S3, PostgreSQL, and Slack.

How is this different from using the YouTube Data API directly?

The YouTube Data API v3 has significant limitations: a default daily quota of 10,000 units (each commentThreads.list call costs 1 unit and returns a single page, so every additional page is another call), a maximum of 100 comments per page, no easy way to fetch complete reply chains without additional calls, and no built-in proxy rotation. This actor handles pagination, reply threading, proxy management, and retry logic automatically, while costing a fraction of what you'd spend engineering and maintaining your own YouTube API integration.
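
To make the quota arithmetic concrete, here is a rough calculation. It assumes the default 10,000-unit daily quota, 1 unit per list call, and 100 comments per page, and it counts only top-level pages; fetching long reply chains would need extra calls that are not modeled here.

```python
import math

DAILY_QUOTA_UNITS = 10_000   # default project quota
MAX_RESULTS_PER_PAGE = 100   # page size ceiling for comment listing
UNITS_PER_LIST_CALL = 1


def list_calls_needed(top_level_comments: int) -> int:
    """Pages (and therefore list calls) needed to read the top-level comments."""
    return math.ceil(top_level_comments / MAX_RESULTS_PER_PAGE)


def quota_fraction(top_level_comments: int) -> float:
    """Share of the daily quota consumed by those calls."""
    return list_calls_needed(top_level_comments) * UNITS_PER_LIST_CALL / DAILY_QUOTA_UNITS


if __name__ == "__main__":
    print(list_calls_needed(50_000), quota_fraction(50_000))
```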

📚 Technical Details

How It Works

The actor uses yt-dlp, the popular YouTube extraction library, operating through Apify's residential proxy network. On each run, it takes your input configuration, iterates through each video URL, and extracts comment data using yt-dlp's getcomments mode. The extraction happens server-side on Apify's infrastructure — no browser rendering needed, resulting in fast, lightweight runs. Each comment is pushed as a structured record to the Apify dataset, from which you can export, stream, or integrate the data.
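
For readers curious what a yt-dlp comment extraction looks like, the sketch below shows one plausible configuration. The exact options the actor passes are not documented here, so the mapping from sortCommentsBy and maxCommentsPerUrl to yt-dlp's comment_sort and max_comments extractor arguments is an assumption, as are the function names. Proxy setup is omitted.

```python
from typing import Any


def build_ydl_opts(sort_by: str = "top", max_comments: int = 100) -> dict[str, Any]:
    """Build yt-dlp options for a comments-only extraction (assumed mapping)."""
    limit = "all" if max_comments == 0 else str(max_comments)
    return {
        "skip_download": True,    # metadata only, no video download
        "getcomments": True,      # ask the extractor to fetch comments
        "extractor_retries": 3,   # retry count named in the error-handling notes
        "quiet": True,
        "extractor_args": {
            "youtube": {"comment_sort": [sort_by], "max_comments": [limit]},
        },
    }


def fetch_comments(url: str, sort_by: str = "top", max_comments: int = 100) -> list[dict]:
    """Run yt-dlp against one video URL and return its comment records."""
    import yt_dlp  # third-party: pip install yt-dlp

    with yt_dlp.YoutubeDL(build_ydl_opts(sort_by, max_comments)) as ydl:
        info = ydl.extract_info(url, download=False)
    return info.get("comments") or []
```

Calling fetch_comments requires network access and may be rate limited without a proxy, which is the part the hosted actor takes care of.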

Error Handling

  • Network errors and timeouts — Retried automatically up to 3 times per video via yt-dlp's extractor_retries setting before failing gracefully.
  • Invalid or private video URLs — The actor logs a warning and continues to the next URL in the startUrls array. No data is pushed for inaccessible videos.
  • Proxy failures — If the residential proxy is unavailable, the actor raises a clear error before attempting any extraction. Check your Apify proxy settings and region availability.
  • Rate limiting — Apify's residential proxy rotation handles YouTube's rate limiting transparently. For extremely large extractions (500K+ comments), the proxy pool absorbs the load.
  • Empty comment sections — Videos with comments disabled or zero comments are logged with a "No comments found" message and skipped cleanly.

Data Integrity

  • No duplicate comments — Each comment's id is a YouTube-assigned unique identifier. Deduplication is trivial when combining multiple runs.
  • Original formatting preserved — Comment text includes emojis, line breaks, and special characters as they appear on YouTube. No sanitization or truncation.
  • Reply threading guaranteed — Every comment record includes a parent field. Top-level comments always have parent: "root". Replies always contain a valid parent comment ID present in the same dataset.
  • Timestamp precision — The timestamp field is a Unix epoch in seconds. The _time_text field is a human-readable relative string for display; always use timestamp for programmatic sorting and filtering.
  • Author permanence: author_id is YouTube's permanent channel identifier and will not change even if the user changes their display name (author).
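
Combining several runs while relying on these properties can be sketched as follows. The function name is illustrative, and letting later runs overwrite earlier records (to keep the freshest like_count) is a design assumption.

```python
def merge_runs(*runs: list[dict]) -> list[dict]:
    """Merge dataset exports, deduplicating by comment id and sorting by timestamp."""
    merged: dict[str, dict] = {}
    for run in runs:
        for comment in run:
            merged[comment["id"]] = comment  # later runs win on conflict
    # Records with a null timestamp sort first.
    return sorted(merged.values(), key=lambda c: c.get("timestamp") or 0)
```

Sorting uses timestamp, not _time_text, in line with the guidance above.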

⚠️ Disclaimer

This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by YouTube, Google LLC, or any of their subsidiaries. All trademarks are the property of their respective owners.

This Actor accesses only publicly available comment data on youtube.com. You are solely responsible for ensuring your use complies with YouTube's Terms of Service and applicable laws.