Reddit Comment Scraper
Pricing
$19.99/month + usage
Developer
Scraply
A powerful Apify Actor for scraping comments from Reddit posts. Extract comments, replies, author information, upvotes, and more from any Reddit post URL.
Why Choose Us?
- Comprehensive Data Extraction: Get all comment details including author, content, upvotes, permalinks, and nested replies
- Smart Proxy Management: Automatic fallback from a direct connection to datacenter proxies, then to residential proxies
- Bulk Processing: Process multiple Reddit post URLs simultaneously
- Flexible Configuration: Customize sort order, comment limits, and the number of replies per comment
- Reliable & Fast: Built with async/await for optimal performance
Key Features
- ✅ Extract comments from Reddit posts
- ✅ Support for nested replies (configurable limit per comment)
- ✅ Multiple sort orders (hot, new, top, controversial, old)
- ✅ Automatic proxy fallback (no proxy → datacenter → residential)
- ✅ Bulk URL processing
- ✅ Detailed logging and progress tracking
- ✅ Structured JSON output
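The proxy fallback order listed above (no proxy → datacenter → residential) can be pictured as a loop over tiers. The following is only a minimal sketch of that idea, not the actor's actual implementation: `PROXY_TIERS`, `BlockedError`, `fetch_with_fallback`, and `fake_fetch` are hypothetical names invented for illustration, and the fetcher is a stand-in that performs no network requests.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical tier names mirroring the documented order; None means "no proxy".
PROXY_TIERS: Sequence[Optional[str]] = (None, "datacenter", "residential")


class BlockedError(Exception):
    """Raised by the fetch callable when a request is blocked (assumed signal)."""


def fetch_with_fallback(
    url: str,
    fetch: Callable[[str, Optional[str]], str],
    tiers: Sequence[Optional[str]] = PROXY_TIERS,
) -> Tuple[str, Optional[str]]:
    """Try each proxy tier in order and return (body, tier_used)."""
    last_error: Optional[Exception] = None
    for tier in tiers:
        try:
            return fetch(url, tier), tier
        except BlockedError as exc:
            last_error = exc  # blocked on this tier, escalate to the next one
    raise RuntimeError(f"all proxy tiers failed for {url}") from last_error


def fake_fetch(url: str, tier: Optional[str]) -> str:
    """Stand-in fetcher that pretends direct requests are always blocked."""
    if tier is None:
        raise BlockedError("blocked without proxy")
    return f"fetched {url} via {tier}"


body, used = fetch_with_fallback("https://www.reddit.com/r/ChatGPT/", fake_fetch)
print(used)
```

Keeping the tiers in an ordered sequence means the cheapest option is always tried first, and a more expensive proxy is only used once the previous tier has failed.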
Input
JSON Example
{
  "startUrls": [
    {"url": "https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/"}
  ],
  "maxComments": 50,
  "sortOrder": "hot",
  "replyLimit": 1,
  "proxyConfiguration": {"useApifyProxy": false}
}
Input Fields
- startUrls (required): Array of Reddit post URLs to scrape
- maxComments (optional, default: 50): Maximum number of comments to fetch per URL (1-10000)
- sortOrder (optional, default: "hot"): How to sort comments - "hot", "new", "top", "controversial", or "old"
- replyLimit (optional, default: 1): Maximum number of replies to extract per comment (1-100)
- proxyConfiguration (optional): Proxy settings. By default, no proxy is used. If Reddit blocks requests, the actor automatically falls back to datacenter then residential proxies.
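To catch out-of-range values before starting a run, the input can be assembled and checked locally. This is a minimal sketch based only on the field names, defaults, and ranges listed above; `build_input` and `ALLOWED_SORT_ORDERS` are hypothetical helper names, not part of the actor.

```python
from typing import Any, Dict, List

ALLOWED_SORT_ORDERS = {"hot", "new", "top", "controversial", "old"}


def build_input(
    urls: List[str],
    max_comments: int = 50,
    sort_order: str = "hot",
    reply_limit: int = 1,
) -> Dict[str, Any]:
    """Build an input object using the documented defaults and ranges."""
    if not urls:
        raise ValueError("startUrls is required and must not be empty")
    if not 1 <= max_comments <= 10000:
        raise ValueError("maxComments must be between 1 and 10000")
    if sort_order not in ALLOWED_SORT_ORDERS:
        raise ValueError(f"sortOrder must be one of {sorted(ALLOWED_SORT_ORDERS)}")
    if not 1 <= reply_limit <= 100:
        raise ValueError("replyLimit must be between 1 and 100")
    return {
        "startUrls": [{"url": url} for url in urls],
        "maxComments": max_comments,
        "sortOrder": sort_order,
        "replyLimit": reply_limit,
        # Default from the JSON example: start without a proxy.
        "proxyConfiguration": {"useApifyProxy": False},
    }


actor_input = build_input(
    ["https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/"]
)
print(actor_input["maxComments"], actor_input["sortOrder"])
```

Serializing the returned dictionary with `json.dumps` should give an object of the same shape as the JSON example.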
Output
The actor outputs data in two formats:
- Dataset: Individual comment records with URL reference
- Key-Value Store: Grouped comments by URL (matching the original output.json format)
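The relationship between the two formats can be illustrated by regrouping flat dataset records by post URL. This is a sketch under the assumption that every dataset record carries a `url` field; `group_by_url` is a hypothetical helper and the sample records are made up.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_url(records: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Group flat comment records into {post_url: [comment, ...]}."""
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for record in records:
        # Drop the url key from each comment, since it becomes the group key.
        comment = {key: value for key, value in record.items() if key != "url"}
        grouped[record["url"]].append(comment)
    return dict(grouped)


# Made-up records for illustration only.
sample_records = [
    {"url": "https://www.reddit.com/r/a/comments/1/", "comment_id": "c1", "upvotes": 3},
    {"url": "https://www.reddit.com/r/b/comments/2/", "comment_id": "c2", "upvotes": 1},
    {"url": "https://www.reddit.com/r/a/comments/1/", "comment_id": "c3", "upvotes": 7},
]
grouped = group_by_url(sample_records)
print(len(grouped))
```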
Output Format
Each comment includes:
- url: The Reddit post URL
- comment_id: Unique comment identifier
- post_id: Post identifier
- author: Comment author username
- permalink: Direct link to the comment
- upvotes: Number of upvotes
- content_type: Type of content (usually "text")
- parent_id: Parent comment ID (if it's a reply)
- author_avatar: Author avatar URL (if available)
- userUrl: Link to user's Reddit profile
- contentText: The comment text content
- created_time: Timestamp (if available)
- replies: Array of nested replies (if any)
Example Output
{
  "https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/": [
    {
      "comment_id": "lhk1f7n",
      "post_id": "t3_1epeshq",
      "author": "AutoModerator",
      "permalink": "https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/lhk1f7n/",
      "upvotes": 1,
      "content_type": "text",
      "parent_id": "t3_1epeshq",
      "author_avatar": "",
      "userUrl": "https://www.reddit.com/user/AutoModerator/",
      "contentText": "Comment text here...",
      "created_time": "",
      "replies": []
    }
  ]
}
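Because replies are nested inside each comment, downstream analysis often needs them flattened. The sketch below walks the `replies` arrays recursively, assuming each reply has the same shape as a top-level comment; `iter_comments` is a hypothetical helper and the comment IDs are invented.

```python
from typing import Any, Dict, Iterator, List, Tuple


def iter_comments(
    comments: List[Dict[str, Any]], depth: int = 0
) -> Iterator[Tuple[int, Dict[str, Any]]]:
    """Yield (depth, comment) pairs in depth-first order."""
    for comment in comments:
        yield depth, comment
        # A missing or empty "replies" field simply ends the recursion.
        yield from iter_comments(comment.get("replies") or [], depth + 1)


# Invented sample data shaped like the grouped output for one URL.
sample_comments = [
    {
        "comment_id": "c1",
        "contentText": "top-level comment",
        "replies": [{"comment_id": "c2", "contentText": "a reply", "replies": []}],
    },
    {"comment_id": "c3", "contentText": "another comment", "replies": []},
]

flat = [(depth, c["comment_id"]) for depth, c in iter_comments(sample_comments)]
print(flat)  # [(0, 'c1'), (1, 'c2'), (0, 'c3')]
```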
🚀 How to Use the Actor (via Apify Console)
- Log in at https://console.apify.com and go to Actors
- Find reddit-comment-scraper and click it
- Configure inputs:
  - Add Reddit post URLs in the startUrls field
  - Set maxComments (default: 50)
  - Choose sortOrder (default: "hot")
  - Set replyLimit (default: 1)
  - Configure proxy settings if needed
- Click Run to start the actor
- Monitor logs in real time
- Access results in the OUTPUT tab
- Export results to JSON or CSV
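As an alternative to the Console steps above, a run can also be started through the Apify REST API. The sketch below only builds the HTTP request and does not send it. The actor ID `scraply~reddit-comment-scraper` is an assumption (the listing does not state the actor's exact ID), `build_run_request` is a hypothetical helper, and the token is a placeholder.

```python
import json
import urllib.request
from typing import Any, Dict

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "scraply~reddit-comment-scraper"  # assumed ID, check the actor's page


def build_run_request(
    token: str, run_input: Dict[str, Any], actor_id: str = ACTOR_ID
) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an actor run."""
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_run_request(
    "<YOUR_APIFY_TOKEN>",  # placeholder, replace with a real API token
    {
        "startUrls": [
            {"url": "https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/"}
        ],
        "maxComments": 50,
        "sortOrder": "hot",
        "replyLimit": 1,
    },
)
print(request.get_method(), request.full_url)
# To actually start the run: urllib.request.urlopen(request)
```

Separating request construction from sending makes it possible to inspect the URL and payload before spending any usage credits.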
Best Use Cases
- Market Research: Analyze user opinions and discussions on specific topics
- Sentiment Analysis: Collect comments for sentiment analysis
- Content Aggregation: Gather comments for content creation or research
- Community Monitoring: Track discussions in specific subreddits
- Data Analysis: Extract structured data for statistical analysis
Frequently Asked Questions
Q: Can I scrape comments from private subreddits?
A: No, this actor only scrapes publicly available content from Reddit.
Q: What happens if Reddit blocks my requests?
A: The actor automatically falls back through its proxy options: it first tries a direct connection, then a datacenter proxy, then a residential proxy with retries.
Q: How many comments can I scrape per URL?
A: You can set maxComments from 1 to 10,000 per URL.
Q: Does the actor respect Reddit's rate limits?
A: Yes, the actor adds delays between requests to avoid putting excessive load on Reddit.
Q: Can I scrape nested replies?
A: Yes, use the replyLimit parameter to control how many nested replies to extract per comment.
Support and Feedback
For issues, questions, or feedback, please contact support or create an issue in the actor repository.
Cautions
- Data is collected only from publicly available sources
- No data is taken from private accounts or password-protected content
- The end user is responsible for ensuring legal compliance (spam laws, privacy, data protection, etc.)
- Please respect Reddit's Terms of Service and rate limits