Youtube Comment Scraper

Pricing

$8.50/month + usage


Developed by

Neuro Scraper


Maintained by Community

🎯 **Instant YouTube Comment Insights!** Fetch thousands of comments in seconds, perfect for research, analytics, marketing & content strategy. 🚀 Zero setup, privacy-safe, enterprise-ready. 🔒 Discover what audiences *really think*: run now and get data instantly! ⚡📊


Last modified

2 days ago

YouTube Comments Scraper

🌟 YouTube Comments Scraper: instant comment export for research, moderation, and analytics

📖 Summary

A production-ready Apify Actor that fetches public comments from one or more YouTube video URLs and pushes structured results to an Apify Dataset. Plug-and-play in Apify Console: get organized comment data in seconds.


💡 Use cases / When to use

  • Export YouTube comments for sentiment analysis, moderation, user research, or training datasets.
  • Rapidly collect top / newest / oldest comments for reporting or manual review.
  • Integrate comment exports into larger Apify workflows or pipelines.

⚡ Quick Start (Console, one-click)

  1. Open this Actor in Apify Console.
  2. Paste one or more YouTube video URLs into the Input field (see Inputs below).
  3. Click Run; results land in the default Dataset within seconds.

βš™οΈ Quick Start (CLI + API)

apify-cli (note: apify run executes the Actor locally; use apify call to run it on the Apify platform):

# Run with a JSON input file
apify run --input input.example.json

apify-client (Python): minimal snippet

from apify_client import ApifyClient

client = ApifyClient('<APIFY_TOKEN>')

# .call() starts the run and waits for it to finish
run = client.actor('your-username/your-actor-name').call(run_input={
    'startUrls': ['https://www.youtube.com/watch?v=VIDEO_ID'],
    'maxComments': 50,
    'getAllComments': False,
    'sortBy': 'top',
})
print('Run finished:', run['id'])

# Fetch the scraped comments from the run's default Dataset
for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(item['video_id'], item['scrapedCount'])

πŸ“ Inputs (fields & schema)

Provide a JSON object to the Actor. The most common field to set in Console is startUrls.

{
  "startUrls": ["https://www.youtube.com/watch?v=VIDEO_ID"],
  "maxComments": 100,
  "getAllComments": false,
  "sortBy": "top"
}

Field summary

  • startUrls (array|string, required): one or more YouTube video URLs.
  • maxComments (integer, optional): maximum number of comments to return per video when getAllComments is false.
  • getAllComments (boolean, optional): if true, the Actor attempts to collect all available public comments for each video.
  • sortBy (string, optional): top | newest | oldest. Defaults to top.
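As a sketch of how these fields fit together, here is a small, hypothetical normalizer (not part of the Actor's actual code) that applies the rules above: a bare string is treated as a single-URL list, startUrls is mandatory, and sortBy must be one of the three listed values:

```python
# Hypothetical helper, illustrating the input rules described above.
def normalize_input(raw: dict) -> dict:
    start_urls = raw.get("startUrls")
    if isinstance(start_urls, str):  # a single URL string is accepted
        start_urls = [start_urls]
    if not start_urls:
        raise ValueError("startUrls is required")
    sort_by = raw.get("sortBy", "top")
    if sort_by not in ("top", "newest", "oldest"):
        raise ValueError(f"unsupported sortBy: {sort_by}")
    return {
        "startUrls": start_urls,
        "maxComments": raw.get("maxComments"),  # None: no per-video cap set
        "getAllComments": bool(raw.get("getAllComments", False)),
        "sortBy": sort_by,
    }
```

This mirrors the defaults in the field summary; the actual Actor may validate inputs differently.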

βš™οΈ Configuration

| 🔑 Name | 📝 Type | ❓ Required | ⚙️ Default | 📌 Example | 🧠 Notes |
|---|---|---|---|---|---|
| startUrls | array | ✅ Yes | None | ["https://...v=VIDEOID"] | One or more YouTube video URLs |
| maxComments | integer | ⚙️ Optional | null | 100 | Limit comments when getAllComments=false |
| getAllComments | boolean | ⚙️ Optional | false | true | Set to true to attempt a full harvest (may take longer) |
| sortBy | string | ⚙️ Optional | "top" | "newest" | top / newest / oldest |
| proxyConfig | object | ⚙️ Optional | {} | {"useApifyProxy": true} | Proxy settings (see Proxy Configuration) |

Example Console setup: Paste https://www.youtube.com/watch?v=VIDEO_ID into the Input field under startUrls, adjust maxComments if you like, then click Run Actor.


📄 Outputs (Dataset / KV examples)

This Actor pushes one JSON object per video to the default Dataset. Each object includes video metadata and an array of comment objects.

Example output (single Dataset item)

{
  "video_url": "https://www.youtube.com/watch?v=VIDEO_ID",
  "video_id": "VIDEO_ID",
  "totalComments": 1234,
  "scrapedCount": 100,
  "scrapeInfo": "100 (top 100 comments)",
  "comments": [
    {
      "cid": "Ugz...",
      "author": "SomeUser",
      "text": "Great video!",
      "votes": 42,
      "time_text": "2 hours ago",
      "time_unix": 1620000000,
      "video_id": "VIDEO_ID",
      "video_url": "https://..."
    }
  ]
}
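One common way to consume a Dataset item shaped like the example above is to flatten the nested comments array into CSV rows for analytics. This is an illustrative post-processing sketch, not part of the Actor itself:

```python
import csv
import io

# Hypothetical post-processing: flatten one Dataset item into CSV rows.
def comments_to_csv(item: dict) -> str:
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["video_id", "cid", "author", "votes", "text"])
    for c in item.get("comments", []):
        writer.writerow([
            item["video_id"],
            c["cid"],
            c["author"],
            c.get("votes", 0),
            c["text"],
        ])
    return buf.getvalue()
```

Pair this with the apify-client snippet above to turn each run's Dataset items into files your analytics pipeline can ingest.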

Results are ready to export or feed directly into your analytics pipeline.


🔑 Environment Variables (placeholders)

  • APIFY_TOKEN (required for API / CLI runs): use <APIFY_TOKEN> in examples.
  • APIFY_PROXY or HTTP_PROXY / HTTPS_PROXY (optional): set when using custom proxies. Use placeholders like <PROXY_USER:PASS@HOST:PORT>.

TODO: If your org requires additional secrets, store them in Apify Console's Secrets and reference them, not in plaintext.


▶️ How to Run

Console

  1. Open the Actor page in Apify Console.
  2. Paste a startUrls array into the Input editor.
  3. Click Run.

CLI

$ apify run --input input.example.json

API (apify-client)

See the Quick Start snippet above. Use client.actor('owner/actor-name').call(run_input=...) to start a run programmatically, then fetch the Dataset once the run finishes.


⏰ Scheduling & Webhooks

  • Use Apify Console Scheduling to run this Actor on a cron (e.g., daily) for continuous monitoring.
  • Configure a webhook in Console to notify your service when a run finishes; then pull the Dataset for post-processing.
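The webhook step above can be sketched as a webhook definition. This is a hedged example of the shape Apify webhooks use (the eventTypes value and requestUrl field follow Apify's webhook conventions; the URL is a placeholder):

```json
{
  "eventTypes": ["ACTOR.RUN.SUCCEEDED"],
  "requestUrl": "https://example.com/hooks/youtube-comments"
}
```

When a run succeeds, your endpoint receives a payload that includes the run ID, from which you can fetch the run's default Dataset for post-processing.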

πŸ•ΎοΈ Logs & Troubleshooting

  • Check the run's logs in Apify Console for per-video progress and warnings.

  • Common issues:

    • No startUrls provided: ensure startUrls is set; the Actor exits with an error if missing.
    • Incomplete results: if getAllComments is true, large videos may take longer; consider maxComments for predictable runs.
    • Proxy/network failures: see Proxy Configuration below.

🔒 Permissions & Storage Notes

  • This Actor collects public comments only. Ensure you comply with YouTube's Terms of Service and your organization's data policies.
  • Store credentials as Console Secrets. Do not place tokens or proxy credentials directly in the input.
  • The Actor writes results to the default Dataset and exits cleanly.

🔟 Changelog / Versioning

  • 1.0.0: initial release with stable, production-ready scraping of public comments, sorting, and optional full harvesting.

🖌 Notes / TODOs

  • TODO: Add rate-limit/backoff configuration for very large runs (reason: avoid temporary network blocks).
  • TODO: Consider proxy rotation for large-scale scraping (reason: scale & reliability).

🌍 Proxy Configuration

If you run this Actor at scale or behind network restrictions, you can enable Apify Proxy or provide custom proxies.

  • Apify Console (one quick step): On the Actor run page, enable Use Apify Proxy.
  • Custom proxy (example): set proxyConfig in input or use environment variables:
{"proxyConfig": {"useApifyProxy": false, "customProxyUrl": "<PROXY_USER:PASS@HOST:PORT>"}}

Or set standard environment variables for processes run via CLI/API:

export HTTP_PROXY="http://<PROXY_USER:PASS@HOST:PORT>"
export HTTPS_PROXY="http://<PROXY_USER:PASS@HOST:PORT>"

Security note: Always store proxy credentials as Console Secrets; do not include them in plaintext input files.



📚 References

  • Apify Actors documentation (README & best practices)
  • Apify Input/Output schema guidance
  • Apify CLI & API usage docs

🤔 What I inferred from main.py

  • The Actor accepts startUrls (one or many YouTube video URLs) and iterates through them.
  • It supports maxComments (limit), getAllComments (boolean) and sortBy (top|newest|oldest).
  • Output is pushed to an Apify Dataset as JSON objects with video_id, scrapedCount, and a comments array.
  • The Actor performs network requests to collect public YouTube comments; use proxy settings at scale.

input.example.json

{
  "startUrls": [
    "https://www.youtube.com/watch?v=VIDEO_ID"
  ],
  "maxComments": 100,
  "getAllComments": false,
  "sortBy": "top",
  "proxyConfig": {"useApifyProxy": true}
}

CONFIG.md (optional)

Quick tips

  • For stable runs on large channels, enable useApifyProxy and schedule smaller batches.
  • Use maxComments for predictable run durations.
  • Store APIFY_TOKEN and any PROXY credentials as Console Secrets.

Contact

For help integrating this Actor into pipelines, contact your platform administrator or the Actor maintainer.