Youtube Comment Scraper
Pricing
$8.50/month + usage
**Instant YouTube Comment Insights!** Fetch thousands of comments in seconds, perfect for research, analytics, marketing & content strategy. Zero setup, privacy-safe, enterprise-ready. Discover what audiences *really think*: run now & get data instantly!
Last modified
2 days ago
YouTube Comments Scraper
YouTube Comments Scraper: instant comment export for research, moderation, and analytics
Summary
A production-ready Apify Actor that fetches public comments from one or more YouTube video URLs and pushes structured results to an Apify Dataset. Plug-and-play in Apify Console: get organized comment data in seconds.
Use cases / When to use
- Export YouTube comments for sentiment analysis, moderation, user research, or training datasets.
- Rapidly collect top / newest / oldest comments for reporting or manual review.
- Integrate comment exports into larger Apify workflows or pipelines.
Quick Start (Console, one-click)
- Open this Actor in Apify Console.
- Paste one or more YouTube video URLs into the Input field (see Inputs below).
- Click Run; results land in the default Dataset within seconds.
Quick Start (CLI + API)
apify-cli (`apify call` runs the Actor on the Apify platform; `apify run` runs it locally):

    # Run with a JSON input file
    apify call --input-file input.example.json

apify-client (Python), minimal snippet:

    from apify_client import ApifyClient

    client = ApifyClient('<APIFY_TOKEN>')

    # call() starts the run and waits for it to finish.
    run = client.actor('your-username/your-actor-name').call(run_input={
        'startUrls': ['https://www.youtube.com/watch?v=VIDEO_ID'],
        'maxComments': 50,
        'getAllComments': False,
        'sortBy': 'top',
    })
    print('Run finished:', run['id'])

Inputs (fields & schema)
Provide a JSON object to the Actor. The most common field to set in Console is `startUrls`.

    {
      "startUrls": ["https://www.youtube.com/watch?v=VIDEO_ID"],
      "maxComments": 100,
      "getAllComments": false,
      "sortBy": "top"
    }

Field summary
- `startUrls`: array|string. Required. One or more YouTube video URLs.
- `maxComments`: integer. Optional. Maximum number of comments to return per video when `getAllComments` is false.
- `getAllComments`: boolean. Optional. If `true`, the Actor attempts to collect all available public comments for each video.
- `sortBy`: string. Optional. `top` | `newest` | `oldest`. Defaults to `top`.
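
The field rules above can be checked before a run is started. The following is a minimal sketch, not the Actor's own validation: `normalize_input` is a hypothetical helper, and the rule that `maxComments` must be a positive integer is an assumption that the field summary does not state.

```python
from typing import Any

ALLOWED_SORTS = {"top", "newest", "oldest"}


def normalize_input(raw: dict[str, Any]) -> dict[str, Any]:
    """Validate an input object and fill in the documented defaults."""
    start_urls = raw.get("startUrls")
    # The field summary lists startUrls as array|string, so accept both.
    if isinstance(start_urls, str):
        start_urls = [start_urls]
    if (
        not isinstance(start_urls, list)
        or not start_urls
        or not all(isinstance(u, str) and u for u in start_urls)
    ):
        raise ValueError("startUrls must contain at least one YouTube video URL")

    sort_by = raw.get("sortBy", "top")
    if sort_by not in ALLOWED_SORTS:
        raise ValueError(f"sortBy must be one of {sorted(ALLOWED_SORTS)}")

    # Assumption: a limit, when given, must be a positive integer.
    max_comments = raw.get("maxComments")
    if max_comments is not None and (
        isinstance(max_comments, bool)
        or not isinstance(max_comments, int)
        or max_comments < 1
    ):
        raise ValueError("maxComments must be a positive integer")

    return {
        "startUrls": start_urls,
        "maxComments": max_comments,
        "getAllComments": bool(raw.get("getAllComments", False)),
        "sortBy": sort_by,
    }
```

Calling `normalize_input({"startUrls": "https://www.youtube.com/watch?v=VIDEO_ID"})` wraps the single URL in a list and fills in `sortBy: "top"` and `getAllComments: False`.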
Configuration
| Name | Type | Required | Default | Example | Notes |
|---|---|---|---|---|---|
| startUrls | array | Yes | None | ["https://...v=VIDEOID"] | One or more YouTube video URLs |
| maxComments | integer | Optional | null | 100 | Limit comments when getAllComments=false |
| getAllComments | boolean | Optional | false | true | Set to true to attempt full harvest (may take longer) |
| sortBy | string | Optional | "top" | "newest" | top / newest / oldest |
| proxyConfig | object | Optional | {} | {"useApifyProxy": true} | Proxy settings; see Proxy Configuration |
Example Console setup:
Paste https://www.youtube.com/watch?v=VIDEO_ID into the Input field under startUrls, adjust maxComments if you like, then click Run Actor.
Outputs (Dataset / KV examples)
This Actor pushes one JSON object per video run to the default Dataset. Each object includes metadata and an array of comment objects.
Example output (single Dataset item)

    {
      "video_url": "https://www.youtube.com/watch?v=VIDEO_ID",
      "video_id": "VIDEO_ID",
      "totalComments": 1234,
      "scrapedCount": 100,
      "scrapeInfo": "100 (top 100 comments)",
      "comments": [
        {
          "cid": "Ugz...",
          "author": "SomeUser",
          "text": "Great video!",
          "votes": 42,
          "time_text": "2 hours ago",
          "time_unix": 1620000000,
          "video_id": "VIDEO_ID",
          "video_url": "https://..."
        }
      ]
    }

Results are ready to export or feed directly into your analytics pipeline.
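
Because each Dataset item nests its comments in an array, a spreadsheet or analytics tool usually needs one row per comment. The sketch below shows one way to flatten items shaped like the example output into CSV text. The helper names `comment_rows` and `to_csv` and the chosen column list are illustrative assumptions, not part of the Actor.

```python
import csv
import io

# Columns taken from the comment fields shown in the example output.
COLUMNS = ["video_id", "cid", "author", "text", "votes", "time_text", "time_unix"]


def comment_rows(item: dict) -> list[dict]:
    """Turn one Dataset item into one flat row per comment."""
    rows = []
    for comment in item.get("comments", []):
        row = {col: comment.get(col) for col in COLUMNS}
        # Fall back to the parent item's video_id if the comment lacks one.
        row["video_id"] = row["video_id"] or item.get("video_id")
        rows.append(row)
    return rows


def to_csv(items: list[dict]) -> str:
    """Serialize the comments of several Dataset items as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerows(comment_rows(item))
    return buffer.getvalue()
```

Passing a list of Dataset items to `to_csv` returns a header line followed by one line per comment.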
Environment Variables (placeholders)
- `APIFY_TOKEN`: required for API / CLI runs. Use `<APIFY_TOKEN>` in examples.
- `APIFY_PROXY` or `HTTP_PROXY` / `HTTPS_PROXY`: optional. Set when using custom proxies. Use placeholders like `<PROXY_USER:PASS@HOST:PORT>`.
TODO: If your org requires additional secrets, store them in Apify Console's Secrets and reference them, not in plaintext.
How to Run
Console
- Open the Actor page in Apify Console.
- Paste a `startUrls` array into the Input editor.
- Click Run.
CLI

    apify run --input-file input.example.json

API (apify-client)
See the Quick Start snippet above. Use client.actor('owner/actor-name').call(run_input=...) to start a run programmatically and fetch the dataset once finished.
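
To illustrate "fetch the dataset once finished", the sketch below wraps the two steps in one function. It assumes the `apify-client` Python package, where `call()` returns the run object as a dict and `dataset(...).iterate_items()` yields the stored items. `run_and_collect` is a hypothetical helper name, and the client is passed in so the function can be exercised without network access.

```python
def run_and_collect(client, actor_id: str, run_input: dict) -> list[dict]:
    """Start a run, wait for it to finish, and return its Dataset items."""
    # call() blocks until the run finishes and returns the run object.
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None or run.get("status") != "SUCCEEDED":
        status = run.get("status") if run else None
        raise RuntimeError(f"Run did not succeed (status: {status})")
    # Each run object carries the ID of its default Dataset.
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())


# Usage (requires the apify-client package and a valid token):
#   from apify_client import ApifyClient
#   client = ApifyClient("<APIFY_TOKEN>")
#   items = run_and_collect(client, "owner/actor-name", {
#       "startUrls": ["https://www.youtube.com/watch?v=VIDEO_ID"],
#       "maxComments": 50,
#   })
```

Raising on a non-successful status keeps a failed or timed-out run from being mistaken for a video with no comments.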
Scheduling & Webhooks
- Use Apify Console Scheduling to run this Actor on a cron (e.g., daily) for continuous monitoring.
- Configure a webhook in Console to notify your service when a run finishes; then pull the Dataset for post-processing.
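
On the receiving side, a service needs to find out which Dataset to pull once the webhook fires. The sketch below assumes the default Apify webhook payload, in which `eventType` names the event and `resource` holds the run object. If you customize the payload template, adjust the field names. `dataset_id_from_webhook` is a hypothetical helper name.

```python
from typing import Optional


def dataset_id_from_webhook(payload: dict) -> Optional[str]:
    """Return the default Dataset ID for a successful run, else None."""
    # Assumption: default payload template with eventType and resource keys.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")
```

Returning `None` for other event types lets the handler ignore failed or aborted runs without raising.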
Logs & Troubleshooting
- Check the run's logs in Apify Console for per-video progress and warnings.
- Common issues:
  - No `startUrls` provided: ensure `startUrls` is set; the Actor exits with an error if missing.
  - Incomplete results: if `getAllComments` is `true`, large videos may take longer; consider `maxComments` for predictable runs.
  - Proxy/network failures: see Proxy Configuration below.
Permissions & Storage Notes
- This Actor collects public comments only. Ensure you comply with YouTube's Terms of Service and your organization's data policies.
- Store credentials as Console Secrets. Do not place tokens or proxy credentials directly in the input.
- The Actor writes results to the default Dataset and exits cleanly.
Changelog / Versioning
- 1.0.0: Initial release. Stable, production-ready scraping of public comments with sorting and optional full harvesting.
Notes / TODOs
- TODO: Add rate-limit/backoff configuration for very large runs (reason: avoid temporary network blocks).
- TODO: Consider proxy rotation for large-scale scraping (reason: scale & reliability).
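
The backoff TODO could be approached with a small retry wrapper on the caller's side. The following is a sketch of exponential backoff with full jitter, not something the Actor currently implements. `with_backoff`, its parameters, and the choice to retry only on `ConnectionError` and `TimeoutError` are all assumptions.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying on transient errors with growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error.
            # The delay ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random time between 0 and the ceiling.
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the wrapper can be tested without real waiting.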
Proxy Configuration
If you run this Actor at scale or behind network restrictions, you can enable Apify Proxy or provide custom proxies.
- Apify Console (one quick step): On the Actor run page, enable Use Apify Proxy.
- Custom proxy (example): set `proxyConfig` in input or use environment variables:

      {"proxyConfig": {"useApifyProxy": false, "customProxyUrl": "<PROXY_USER:PASS@HOST:PORT>"}}

Or set standard environment variables for processes run via CLI/API:

    export HTTP_PROXY="http://<PROXY_USER:PASS@HOST:PORT>"
    export HTTPS_PROXY="http://<PROXY_USER:PASS@HOST:PORT>"

Security note: Always store proxy credentials as Console Secrets; do not include them in plaintext input files.
TODO: Consider proxy rotation for large-scale scraping to reduce the chance of temporary blocks.
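
One way to keep credentials out of input files is to build `proxyConfig` from the environment at run time. This sketch assumes the `customProxyUrl` field shown in the example above and the standard `HTTPS_PROXY` / `HTTP_PROXY` variables. Both helper names are hypothetical.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlsplit


def proxy_config_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build a proxyConfig object from proxy environment variables."""
    env = os.environ if env is None else env
    proxy_url = env.get("HTTPS_PROXY") or env.get("HTTP_PROXY")
    if proxy_url:
        return {"useApifyProxy": False, "customProxyUrl": proxy_url}
    # No custom proxy configured: fall back to the documented default ({}).
    return {}


def redact_proxy_url(proxy_url: str) -> str:
    """Mask the user:password part of a proxy URL before logging it."""
    parts = urlsplit(proxy_url)
    if parts.username is None:
        return proxy_url
    port = f":{parts.port}" if parts.port else ""
    return f"{parts.scheme}://***@{parts.hostname or ''}{port}"
```

Logging only the redacted form avoids leaking proxy credentials into run logs.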
What I inferred from main.py
- The Actor accepts `startUrls` (one or many YouTube video URLs) and iterates through them.
- It supports `maxComments` (limit), `getAllComments` (boolean) and `sortBy` (top|newest|oldest).
- Output is pushed to an Apify Dataset as JSON objects with `video_id`, `scrapedCount`, and a `comments` array.
- The Actor performs network requests to collect public YouTube comments; use proxy settings for scale.
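
To make the interplay of `sortBy`, `maxComments`, and `getAllComments` concrete, the sketch below applies the same options to comment objects that have already been downloaded. It illustrates the documented semantics and is not the code in main.py. `select_comments` is a hypothetical helper, and it assumes `votes` and `time_unix` are numeric, as in the example output.

```python
from typing import Optional


def select_comments(
    comments: list[dict],
    sort_by: str = "top",
    max_comments: Optional[int] = None,
    get_all: bool = False,
) -> list[dict]:
    """Order comments by sort_by and apply the max_comments limit."""
    sort_keys = {
        "top": lambda c: -c.get("votes", 0),          # most votes first
        "newest": lambda c: -c.get("time_unix", 0),   # latest first
        "oldest": lambda c: c.get("time_unix", 0),    # earliest first
    }
    if sort_by not in sort_keys:
        raise ValueError(f"Unknown sortBy value: {sort_by!r}")
    ordered = sorted(comments, key=sort_keys[sort_by])
    # getAllComments overrides the limit; otherwise keep the first N.
    if get_all or max_comments is None:
        return ordered
    return ordered[:max_comments]
```

The same function is handy for re-sorting an exported Dataset item locally without starting a new run.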
input.example.json

    {
      "startUrls": ["https://www.youtube.com/watch?v=VIDEO_ID"],
      "maxComments": 100,
      "getAllComments": false,
      "sortBy": "top",
      "proxyConfig": {"useApifyProxy": true}
    }

CONFIG.md (optional)
Quick tips
- For stable runs on large channels, enable `useApifyProxy` and schedule smaller batches.
- Use `maxComments` for predictable run durations.
- Store `APIFY_TOKEN` and any `PROXY` credentials as Console Secrets.
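
The "smaller batches" tip can be sketched as a helper that splits a long URL list into several input objects, one per run. `batch_inputs` and its parameters are hypothetical. The shared settings in `base_input` use the field names from input.example.json.

```python
from typing import Optional


def batch_inputs(
    urls: list[str],
    batch_size: int,
    base_input: Optional[dict] = None,
) -> list[dict]:
    """Split urls into input objects with at most batch_size URLs each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    base = dict(base_input or {})
    return [
        # Every batch shares the base settings and gets its own URL slice.
        {**base, "startUrls": urls[i:i + batch_size]}
        for i in range(0, len(urls), batch_size)
    ]
```

Each returned object can be passed as the input of a separate run, for example one per scheduled slot.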
Contact
For help integrating this Actor into pipelines, contact your platform administrator or the actor maintainer.
