Reddit Comments Scraper
Pricing
Pay per usage
Extract detailed comments and discussion threads from Reddit instantly. Perfect for sentiment analysis, market research, and community monitoring. Get structured data from any post URL efficiently. Residential proxies are recommended for high-volume scraping stability.
Rating: 5.0 (2)
Developer: Shahid Irfan
Maintained by Community
Actor stats
Bookmarked: 1
Total users: 30
Monthly active users: 4
Last modified: a day ago
Extract Reddit post comments into a clean dataset with author, scoring, threading, flair, moderation, and post context fields. It is designed for research, monitoring, moderation analysis, and discussion intelligence on public Reddit threads.
Features
- Thread-wide comment capture — Collect top-level comments and nested replies from public Reddit posts
- Rich comment context — Save author, score, timestamps, thread depth, flair, moderation flags, and post metadata
- Duplicate-safe output — Merge repeated comment records so each comment ID appears once in the final dataset
- Cleaned missing values — Handle deleted authors, removed bodies, and sparse Reddit fields gracefully
- Configurable collection size — Stop after the number of comments defined by results_wanted
Use Cases
Community Research
Study how people respond to prompts, news, or product discussions. Build datasets for qualitative analysis or trend tracking.
Moderation Analysis
Review stickied comments, locked discussions, controversial replies, and other moderation-related signals in one place.
NLP and Sentiment Work
Collect structured discussion text with reply depth, timestamps, and score data for downstream language analysis.
Competitive Monitoring
Track how Reddit communities talk about brands, topics, launches, or public events over time.
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| startUrl | String | Yes | — | Reddit thread URL to collect comments from |
| results_wanted | Integer | No | 20 | Maximum number of unique comments to save |
| proxyConfiguration | Object | No | Apify Residential Proxy | Proxy settings for more reliable collection |
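As a minimal sketch, the input object can be assembled in Python before it is passed to a run. The helper name `build_run_input` and its `proxy_groups` argument are hypothetical; only the keys `startUrl`, `results_wanted`, and `proxyConfiguration` come from the table above, and the validation rules are assumptions rather than the actor's documented behavior.

```python
from typing import Any, Dict, List, Optional


def build_run_input(
    start_url: str,
    results_wanted: int = 20,
    proxy_groups: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Assemble an input object using the parameter names from the table."""
    if not start_url:
        raise ValueError("startUrl is required")
    if results_wanted < 1:
        # Assumed rule: a non-positive limit makes no sense for a maximum.
        raise ValueError("results_wanted must be a positive integer")

    run_input: Dict[str, Any] = {
        "startUrl": start_url,
        "results_wanted": results_wanted,
    }
    # Only set proxyConfiguration when groups are requested, so the
    # documented default applies otherwise.
    if proxy_groups:
        run_input["proxyConfiguration"] = {
            "useApifyProxy": True,
            "apifyProxyGroups": list(proxy_groups),
        }
    return run_input
```

Calling `build_run_input(url, 50, ["RESIDENTIAL"])` yields the same shape as the proxy example under Usage Examples.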
Output Data
Each dataset item contains comment content plus thread and author context.
| Field | Type | Description |
|---|---|---|
| id | String | Reddit comment ID |
| comment_fullname | String | Full Reddit thing name for the comment |
| author | String | Comment author or [deleted] when unavailable |
| author_fullname | String | Reddit fullname for the author when available |
| body | String | Comment text or [removed] when unavailable |
| body_html | String | Comment body in Reddit HTML format when available |
| score | Number | Comment score |
| ups | Number | Upvote count reported by Reddit |
| downs | Number | Downvote count reported by Reddit |
| created_utc | Number | Unix timestamp from Reddit |
| created_at | String | ISO timestamp derived from created_utc |
| edited | Boolean or Number | Edit flag or edit timestamp |
| depth | Number | Reply depth within the thread |
| parent_id | String | Parent thing ID |
| parent_comment_id | String | Parent comment ID when the parent is another comment |
| parent_type | String | Parent Reddit thing type such as t1 or t3 |
| link_id | String | Fullname of the parent post |
| permalink | String | Absolute Reddit URL for the comment |
| subreddit | String | Subreddit name |
| subreddit_id | String | Subreddit ID |
| subreddit_name_prefixed | String | Prefixed subreddit name such as r/AskReddit |
| subreddit_type | String | Subreddit visibility type |
| post_id | String | Parent post ID |
| post_title | String | Parent post title |
| post_author | String | Parent post author |
| post_permalink | String | Parent post URL |
| is_submitter | Boolean | Whether the comment author created the post |
| distinguished | String | Moderator or admin distinction when present |
| stickied | Boolean | Whether the comment is stickied |
| locked | Boolean | Whether the comment is locked |
| archived | Boolean | Whether the comment is archived |
| collapsed | Boolean | Whether the comment is collapsed |
| controversiality | Number | Reddit controversiality score |
| score_hidden | Boolean | Whether Reddit hides the score |
| gilded | Number | Legacy gild count |
| total_awards_received | Number | Total awards on the comment |
| all_awardings_count | Number | Number of award entries present |
| author_premium | Boolean | Reddit premium flag for the author |
| author_is_blocked | Boolean | Whether the author is blocked |
| author_flair_text | String | Author flair text |
| author_flair_type | String | Flair type |
| author_flair_text_color | String | Flair text color |
| author_flair_background_color | String | Flair background color |
| comment_type | String | Comment classification when provided |
| treatment_tags | Array | Reddit treatment tags when present |
| retrieval_source | String | Whether the record came from the main listing or deferred expansion |
| source_url | String | Final verified thread URL used for collection |
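The fullname fields above (comment_fullname, parent_id, link_id, subreddit_id) combine a Reddit type prefix with an ID, for example t1_ for a comment and t3_ for a post. The following sketch shows one way to split such a value on the consumer side; `split_fullname` and `parent_comment_id_of` are hypothetical helper names, not part of the actor.

```python
from typing import Optional, Tuple

# Reddit type prefixes used in fullnames (prefix + "_" + ID).
THING_TYPES = {
    "t1": "comment",
    "t2": "account",
    "t3": "link",  # a post
    "t4": "message",
    "t5": "subreddit",
    "t6": "award",
}


def split_fullname(fullname: str) -> Tuple[str, str]:
    """Split a fullname such as 't1_nuuw9it' into ('t1', 'nuuw9it')."""
    prefix, sep, thing_id = fullname.partition("_")
    if not sep or prefix not in THING_TYPES or not thing_id:
        raise ValueError(f"not a Reddit fullname: {fullname!r}")
    return prefix, thing_id


def parent_comment_id_of(record: dict) -> Optional[str]:
    """Return the parent comment ID, or None when the parent is the post."""
    prefix, thing_id = split_fullname(record["parent_id"])
    return thing_id if prefix == "t1" else None
```

For a top-level comment whose parent_id is "t3_1pqgcx9", `parent_comment_id_of` returns None, which is consistent with parent_type being t3.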
Usage Examples
Basic Thread Extraction
{
  "startUrl": "https://www.reddit.com/r/AskReddit/comments/1pqgcx9/whats_the_most_unexpected_way_someone_you_know/",
  "results_wanted": 20
}
Larger Collection
{
  "startUrl": "https://www.reddit.com/r/webscraping/comments/1qs66k0/couldnt_find_proxy_directory_with_filters_so/",
  "results_wanted": 100
}
With Proxy Configuration
{
  "startUrl": "https://www.reddit.com/r/AskReddit/comments/1pqgcx9/whats_the_most_unexpected_way_someone_you_know/",
  "results_wanted": 50,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
Sample Output
{
  "id": "nuuw9it",
  "comment_fullname": "t1_nuuw9it",
  "author": "Agua_Frecuentemente",
  "author_fullname": "t2_mobpn32j",
  "body": "The guy who invented Smartfood (popcorn) lives in my town and we have some mutual friends.",
  "score": 2851,
  "ups": 2851,
  "downs": 0,
  "created_utc": 1766150326,
  "created_at": "2025-12-19T18:25:26.000Z",
  "depth": 0,
  "parent_id": "t3_1pqgcx9",
  "parent_type": "t3",
  "link_id": "t3_1pqgcx9",
  "permalink": "https://www.reddit.com/r/AskReddit/comments/1pqgcx9/whats_the_most_unexpected_way_someone_you_know/nuuw9it/",
  "subreddit": "AskReddit",
  "subreddit_id": "t5_2qh1i",
  "subreddit_name_prefixed": "r/AskReddit",
  "post_id": "1pqgcx9",
  "post_title": "What’s the most unexpected way someone you know became wealthy?",
  "post_author": "xFaith",
  "post_permalink": "https://www.reddit.com/r/AskReddit/comments/1pqgcx9/whats_the_most_unexpected_way_someone_you_know/",
  "is_submitter": false,
  "stickied": false,
  "locked": false,
  "archived": false,
  "collapsed": false,
  "controversiality": 0,
  "score_hidden": false,
  "gilded": 0,
  "total_awards_received": 0,
  "author_premium": false,
  "author_is_blocked": false,
  "author_flair_type": "text",
  "retrieval_source": "listing",
  "source_url": "https://www.reddit.com/r/AskReddit/comments/1pqgcx9/whats_the_most_unexpected_way_someone_you_know/?solution=..."
}
Tips for Best Results
Use Working Reddit Thread URLs
- Use direct thread URLs rather than subreddit feeds or search pages
- Prefer public threads with active discussions
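A quick pre-flight check can catch subreddit feeds or search pages before a run is started. This is a sketch under the assumption that thread URLs follow the /r/<subreddit>/comments/<post_id>/ shape seen in the usage examples; `looks_like_thread_url` is a hypothetical helper and the actor may accept other URL forms.

```python
import re
from urllib.parse import urlparse

# Assumed thread path shape: /r/<subreddit>/comments/<post_id>/...
_THREAD_PATH = re.compile(r"^/r/[^/]+/comments/[0-9a-z]+(/|$)", re.IGNORECASE)


def looks_like_thread_url(url: str) -> bool:
    """Return True when the URL resembles a direct Reddit thread link."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    # Accept reddit.com and its subdomains (www, old, and so on).
    if host != "reddit.com" and not host.endswith(".reddit.com"):
        return False
    return bool(_THREAD_PATH.match(parsed.path))
```

A subreddit feed such as https://www.reddit.com/r/AskReddit/ fails the check because its path has no comments segment.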
Start Small
- Begin with results_wanted: 20 for quick validation
- Increase the limit after confirming the thread loads correctly
Use Proxies for Reliability
- Residential proxies can help maintain stable runs
- If a thread is region-sensitive or intermittently blocked, rerun with proxy support enabled
Integrations
Connect your dataset with:
- Google Sheets — Review discussion metrics in spreadsheets
- Airtable — Build searchable discussion databases
- Slack — Send comment updates into team workflows
- Make — Automate processing pipelines
- Zapier — Trigger downstream business actions
Export Formats
- JSON — For application workflows and data pipelines
- CSV — For spreadsheets and quick reviews
- Excel — For business reporting
- XML — For system integrations
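If only a few columns are needed, a JSON export can also be trimmed to CSV locally. The sketch below uses only the Python standard library; `items_to_csv` is a hypothetical helper, and serializing array fields such as treatment_tags as JSON text is an assumed convention, not necessarily what the platform's own CSV export does.

```python
import csv
import io
import json
from typing import Iterable, List


def items_to_csv(items: Iterable[dict], columns: List[str]) -> str:
    """Render the selected columns of dataset items as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for item in items:
        row = {}
        for column in columns:
            value = item.get(column, "")
            # Nested values have no natural CSV form, so store them as JSON.
            if isinstance(value, (list, dict)):
                value = json.dumps(value)
            row[column] = value
        writer.writerow(row)
    return buffer.getvalue()
```

Typical usage would load the exported file with `json.load` and pass a short column list such as ["id", "author", "score", "depth"].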
Frequently Asked Questions
How many comments can I collect?
You can collect up to the number defined by results_wanted, as long as the thread contains that many unique comments.
Are nested replies included?
Yes. Replies are collected along with top-level comments, and their position in the thread is preserved through depth and parent fields.
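As an illustration of how the parent fields can be used downstream, the sketch below rebuilds reading order from a flat list of records. It assumes parent_id holds a fullname (t3_ plus the post ID for top-level comments, t1_ plus the comment ID for replies), as in the sample output; `build_reply_index` and `walk_thread` are hypothetical helpers.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple


def build_reply_index(records: List[dict]) -> Dict[str, List[str]]:
    """Map each parent fullname to the IDs of its direct replies."""
    children: Dict[str, List[str]] = defaultdict(list)
    for record in records:
        children[record["parent_id"]].append(record["id"])
    return dict(children)


def walk_thread(records: List[dict], post_fullname: str) -> Iterator[Tuple[int, str]]:
    """Yield (depth, comment_id) pairs in depth-first reading order."""
    children = build_reply_index(records)
    # Reverse before pushing so the stack pops siblings in original order.
    stack = [(0, cid) for cid in reversed(children.get(post_fullname, []))]
    while stack:
        depth, comment_id = stack.pop()
        yield depth, comment_id
        for child in reversed(children.get("t1_" + comment_id, [])):
            stack.append((depth + 1, child))
```

Replies whose parent was not collected, for example because results_wanted cut the run short, are not reachable from the post and are skipped by this walk.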
Are duplicate comments removed?
Yes. Records are keyed by comment ID, and duplicate appearances are merged into one final dataset item.
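The idea of keying records by comment ID can be sketched as follows. The merge policy shown here (keep the first non-empty value seen for each field) is an assumption for illustration; the actor's actual merge rules are not documented in this README, and `merge_by_comment_id` is a hypothetical name.

```python
from typing import Dict, Iterable, List

_EMPTY = (None, "")


def merge_by_comment_id(records: Iterable[dict]) -> List[dict]:
    """Collapse repeated records so each comment ID appears once."""
    merged: Dict[str, dict] = {}
    for record in records:
        existing = merged.setdefault(record["id"], {})
        for key, value in record.items():
            if existing.get(key) in _EMPTY and value not in _EMPTY:
                # Fill a gap left by an earlier, sparser copy.
                existing[key] = value
            elif key not in existing:
                # Remember the field even if its value is empty so far.
                existing[key] = value
    # Dicts preserve insertion order, so first-seen order is kept.
    return list(merged.values())
```

Values such as 0 and false count as present under this policy, so a score of 0 is never overwritten by a later copy.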
What happens when a field is missing?
Deleted authors are reported as [deleted] and removed bodies as [removed], so the dataset stays usable. Other sparse Reddit fields are handled gracefully.
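The placeholder convention from the Output Data table ([deleted] for authors, [removed] for bodies) can be mirrored on the consumer side, for instance to filter records before text analysis. This is a sketch of that convention, not the actor's internal code; `normalize_comment` and `has_usable_text` are hypothetical helpers.

```python
from typing import Any, Dict

# Placeholders documented in the Output Data table.
PLACEHOLDERS = {"author": "[deleted]", "body": "[removed]"}


def normalize_comment(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with missing author and body replaced by placeholders."""
    record = dict(raw)
    for field, placeholder in PLACEHOLDERS.items():
        value = record.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            record[field] = placeholder
    return record


def has_usable_text(record: Dict[str, Any]) -> bool:
    """True when the body is real text rather than a placeholder."""
    return record.get("body") not in (None, "", "[removed]", "[deleted]")
```

Filtering with `has_usable_text` before sentiment or NLP work keeps placeholder strings out of the corpus.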
Does this work on private communities?
No. The actor is intended for public Reddit threads only.
Legal Notice
Use this actor only for legitimate data collection and analysis. You are responsible for complying with Reddit terms, rate limits, and applicable laws in your jurisdiction.