
Reddit Comment Scraper Pro — Updated & Reliable
Pricing
$29.99/month + usage

Reddit Comment Scraper Pro — updated 2025 & Shreddit-DOM compatible. Scrape full threads (comments + nested replies) with depth, timestamps, authors & paths. Export CSV/JSON/XLSX. Apify-ready, headless, proxy support, auto-pagination, smart rate-limit handling. No API key.
Last modified
2 days ago
Reddit Comment Scraper
Extract comments and replies from Reddit posts with full conversation threads preserved.
🎯 What does this actor do?
This actor scrapes comments from Reddit posts, capturing:
- All comments and nested replies in a tree structure
- Comment text, author, timestamp, and depth level
- Complete conversation threads with parent-child relationships
- Support for either top-level comments only or full reply chains
Perfect for:
- Market Research - Analyze customer opinions and feedback
- Sentiment Analysis - Understand community reactions
- Content Analysis - Study discussion patterns and trends
- Community Monitoring - Track conversations about your brand
📥 Input
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| postUrls | Array | Yes | - | List of Reddit post URLs to scrape |
| maxComments | Number | No | 500 | Maximum number of comments to collect |
| includeReplies | Boolean | No | true | Whether to include nested replies |
| sortBy | String | No | "top" | Comment sort order (top, new, best, controversial) |
Example Input
{
  "postUrls": [
    "https://www.reddit.com/r/technology/comments/example/discussion_thread/"
  ],
  "maxComments": 100,
  "includeReplies": true,
  "sortBy": "top"
}
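Before starting a run, it can help to check an input object against the table above. The following is a minimal sketch: `buildInput` and `ALLOWED_SORTS` are hypothetical names, not part of the actor, and the actor itself may validate input differently. The defaults simply mirror the documented ones.

```javascript
// Hypothetical helper, not part of the actor: applies the documented defaults
// and rejects values that the input table does not allow.
const ALLOWED_SORTS = ["top", "new", "best", "controversial"];

function buildInput(options) {
  const {
    postUrls,
    maxComments = 500,
    includeReplies = true,
    sortBy = "top",
  } = options;

  if (!Array.isArray(postUrls) || postUrls.length === 0) {
    throw new Error("postUrls must be a non-empty array of Reddit post URLs");
  }
  if (!Number.isInteger(maxComments) || maxComments <= 0) {
    throw new Error("maxComments must be a positive integer");
  }
  if (typeof includeReplies !== "boolean") {
    throw new Error("includeReplies must be true or false");
  }
  if (!ALLOWED_SORTS.includes(sortBy)) {
    throw new Error("sortBy must be one of: " + ALLOWED_SORTS.join(", "));
  }
  return { postUrls, maxComments, includeReplies, sortBy };
}

const checkedInput = buildInput({
  postUrls: ["https://www.reddit.com/r/technology/comments/example/discussion_thread/"],
  maxComments: 100,
});
console.log(checkedInput);
```

Omitted optional fields fall back to the defaults from the table, so the object passed to the actor is always fully specified.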
📤 Output
The actor outputs one row per comment for clean Excel/CSV exports. Each comment (including replies) becomes a separate row with full context.
Output Format (Flattened for Easy Analysis)
Each row contains:
{
  "postUrl": "https://www.reddit.com/r/...",
  "postTitle": "Post title here",
  "postAuthor": "original_poster",
  "postScore": "1234",
  "subreddit": "technology",
  "commentDepth": 0,
  "commentAuthor": "user1",
  "commentText": "This is the comment text",
  "commentTimestamp": "2024-01-15T10:30:00Z",
  "commentId": "abc123",
  "commentPath": "0",
  "parentPath": null,
  "isTopLevel": true,
  "replyCount": 3,
  "scrapedAt": "2024-01-15T12:00:00Z"
}
Output Fields Explained
Post Context (same for all comments from a post):
- postUrl - URL of the Reddit post
- postTitle - Title of the post
- postAuthor - Original post author
- subreddit - Subreddit name
Comment Details:
- commentDepth - Nesting level (0 = top-level, 1 = first reply, etc.)
- commentAuthor - Username of comment author
- commentText - The actual comment content
- commentTimestamp - When the comment was posted
- commentId - Reddit's comment ID
Threading Information:
- commentPath - Path showing position in thread (e.g., "0/2/1" = first comment → third reply → second sub-reply)
- parentPath - Path to parent comment (null for top-level)
- isTopLevel - Boolean indicating if it's a main comment
- replyCount - Number of direct replies to this comment
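The path format can also be read programmatically. The sketch below assumes that a path is a list of zero-based indexes joined by "/", as in the "0/2/1" example, and that the depth equals the number of segments minus one. `parentPathOf` and `depthOf` are illustrative names, not functions provided by the actor.

```javascript
// Hypothetical helpers: derive threading information from a commentPath string.
// Assumes paths are zero-based indexes joined by "/", e.g. "0/2/1".

// Returns the parent's path, or null for a top-level comment.
function parentPathOf(commentPath) {
  const lastSlash = commentPath.lastIndexOf("/");
  return lastSlash === -1 ? null : commentPath.slice(0, lastSlash);
}

// Returns the nesting level: 0 for top-level, 1 for a first reply, and so on.
function depthOf(commentPath) {
  return commentPath.split("/").length - 1;
}

console.log(parentPathOf("0/2/1")); // "0/2"
console.log(parentPathOf("0"));     // null
console.log(depthOf("0/2/1"));      // 2
```

These helpers are useful as a consistency check on exported rows, or when only commentPath survives a spreadsheet round trip.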
Example: How Threading Works
If a post has this structure:
Comment A (path: "0")
└── Reply B (path: "0/0")
    └── Reply C (path: "0/0/0")
Comment D (path: "1")
└── Reply E (path: "1/0")
You'll get 5 rows in your export:
- Row for Comment A with commentPath: "0", parentPath: null
- Row for Reply B with commentPath: "0/0", parentPath: "0"
- Row for Reply C with commentPath: "0/0/0", parentPath: "0/0"
- Row for Comment D with commentPath: "1", parentPath: null
- Row for Reply E with commentPath: "1/0", parentPath: "1"
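To turn the flat rows back into a conversation tree, each row can be attached to the row whose commentPath matches its parentPath. This is a minimal sketch rather than the actor's own code: `buildTree` and the `replies` property are hypothetical, and the rows are assumed to come from a single post (paths restart at "0" for every post).

```javascript
// Hypothetical helper: rebuild a nested tree from flat rows of one post.
function buildTree(rows) {
  // Index every row by its path and give it an empty list of replies.
  const nodes = new Map();
  for (const row of rows) {
    nodes.set(row.commentPath, { ...row, replies: [] });
  }

  const roots = [];
  for (const node of nodes.values()) {
    const parent = node.parentPath == null ? undefined : nodes.get(node.parentPath);
    if (parent) {
      parent.replies.push(node);
    } else {
      // Top-level comment, or a reply whose parent is missing from the export.
      roots.push(node);
    }
  }
  return roots;
}

// The five rows from the example above.
const exampleRows = [
  { commentPath: "0", parentPath: null },
  { commentPath: "0/0", parentPath: "0" },
  { commentPath: "0/0/0", parentPath: "0/0" },
  { commentPath: "1", parentPath: null },
  { commentPath: "1/0", parentPath: "1" },
];

const tree = buildTree(exampleRows);
console.log(JSON.stringify(tree, null, 2));
```

Replies whose parent row is absent (for example, if collection stopped at maxComments) are kept as roots instead of being dropped, which is a design choice of this sketch.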
Benefits of This Format
✅ Excel/CSV Ready - Each comment is a row, no nested JSON columns
✅ Easy Filtering - Filter by depth, author, or isTopLevel
✅ Preserved Threading - Use commentPath/parentPath to reconstruct conversations
✅ Analysis Friendly - Count comments per author, average reply depth, etc.
🚀 How to Use
Via Apify Console
- Add Reddit post URLs to the postUrls array
- Set maxComments to limit data collection (useful for large threads)
- Toggle includeReplies based on your needs:
  - true - Get full conversation threads
  - false - Get only top-level comments
- Choose sortBy to control comment ordering
- Click "Run" to start scraping
Via API
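import { ApifyClient } from "apify-client";

// Create a client with your Apify API token (placeholder value)
const client = new ApifyClient({ token: "YOUR_APIFY_TOKEN" });

const input = {
  postUrls: ["https://www.reddit.com/r/AskReddit/comments/xyz/what_is_your_opinion/"],
  maxComments: 200,
  includeReplies: true,
  sortBy: "best"
};

// Run via Apify API
const run = await client.actor("your-username/reddit-comment-scraper").call(input);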
Processing Multiple Posts
You can scrape multiple posts in one run:
{
  "postUrls": [
    "https://www.reddit.com/r/technology/comments/post1/",
    "https://www.reddit.com/r/science/comments/post2/",
    "https://www.reddit.com/r/gaming/comments/post3/"
  ],
  "maxComments": 100
}
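When a run covers several posts, every row carries the postUrl of the post it came from, so the rows can be split per post afterwards. A minimal sketch, with `groupByPost` as a hypothetical helper name and made-up sample rows:

```javascript
// Hypothetical helper: group flat comment rows by the post they belong to.
function groupByPost(rows) {
  const groups = new Map();
  for (const row of rows) {
    if (!groups.has(row.postUrl)) {
      groups.set(row.postUrl, []);
    }
    groups.get(row.postUrl).push(row);
  }
  return groups;
}

// Made-up sample rows for illustration only.
const mixedRows = [
  { postUrl: "https://www.reddit.com/r/technology/comments/post1/", commentPath: "0" },
  { postUrl: "https://www.reddit.com/r/science/comments/post2/", commentPath: "0" },
  { postUrl: "https://www.reddit.com/r/technology/comments/post1/", commentPath: "1" },
];

const byPost = groupByPost(mixedRows);
for (const [postUrl, rows] of byPost) {
  console.log(postUrl, rows.length);
}
```

Grouping first matters for anything that relies on commentPath, since the same path (such as "0") appears once per post.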
💡 Use Cases
Market Research
Analyze product discussions and customer feedback:
- Track mentions of your brand
- Understand customer pain points
- Identify feature requests
Sentiment Analysis
Study community reactions:
- Measure response to announcements
- Track opinion trends over time
- Identify influential commenters
Content Strategy
Understand what resonates:
- Find popular discussion topics
- Identify content gaps
- Study engagement patterns
⚠️ Limitations
- Works with new Reddit interface (reddit.com)
- Respects Reddit's rate limits
- Some very deeply nested threads may require clicking "Continue thread" links
- Deleted/removed comments are skipped
- Maximum runtime: 60 minutes per run
🔧 Advanced Features
Comment Filtering
- Set includeReplies: false to get only top-level comments
- Use maxComments to limit data collection
- Comments are sorted according to the sortBy parameter
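The same kind of filtering can be applied after a run, on the exported rows, without scraping again. The sketch below is an illustration only: `filterComments` and its options are hypothetical, and it relies on the commentDepth and commentAuthor fields described in the output section.

```javascript
// Hypothetical helper: filter exported rows by maximum depth and/or author.
function filterComments(rows, { maxDepth = Infinity, author = null } = {}) {
  return rows.filter(
    (row) =>
      row.commentDepth <= maxDepth &&
      (author === null || row.commentAuthor === author)
  );
}

// Made-up sample rows for illustration only.
const sampleRows = [
  { commentAuthor: "user1", commentDepth: 0 },
  { commentAuthor: "user2", commentDepth: 1 },
  { commentAuthor: "user1", commentDepth: 2 },
];

const topLevelOnly = filterComments(sampleRows, { maxDepth: 0 });
const byUser1 = filterComments(sampleRows, { author: "user1" });
console.log(topLevelOnly.length, byUser1.length);
```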
Data Structure
- Nested reply structure preserves conversation context
- Each comment includes depth level for easy filtering
- Timestamps allow temporal analysis
📊 Example Analysis
With the scraped data, you can:
- Count replies per comment to find most engaging topics
- Analyze comment depth to understand conversation complexity
- Track author participation across threads
- Build word clouds from comment text
- Perform sentiment analysis on discussions
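As an illustration of the first three items, the sketch below computes comments per author, the average comment depth, and the row with the most direct replies. `summarize` is a hypothetical helper, the sample rows are made up, and the field names are the ones from the output format.

```javascript
// Hypothetical helper: basic statistics over exported comment rows.
function summarize(rows) {
  const commentsPerAuthor = new Map();
  let depthSum = 0;
  let mostReplied = null;

  for (const row of rows) {
    const count = commentsPerAuthor.get(row.commentAuthor) || 0;
    commentsPerAuthor.set(row.commentAuthor, count + 1);
    depthSum += row.commentDepth;
    if (mostReplied === null || row.replyCount > mostReplied.replyCount) {
      mostReplied = row;
    }
  }

  return {
    commentsPerAuthor,
    averageDepth: rows.length > 0 ? depthSum / rows.length : 0,
    mostReplied,
  };
}

// Made-up sample rows for illustration only.
const analysisRows = [
  { commentAuthor: "user1", commentDepth: 0, replyCount: 2, commentId: "a" },
  { commentAuthor: "user2", commentDepth: 1, replyCount: 0, commentId: "b" },
  { commentAuthor: "user1", commentDepth: 2, replyCount: 1, commentId: "c" },
];

const summary = summarize(analysisRows);
console.log(summary.commentsPerAuthor.get("user1"), summary.averageDepth, summary.mostReplied.commentId);
```

A Map is used for the per-author counts so that unusual usernames cannot collide with built-in object properties.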
🆘 Support
Having issues? Check these common solutions:
- No comments found - Verify the Reddit post URL is correct and public
- Timeout errors - Reduce maxComments for very large threads
- Missing replies - Ensure includeReplies is set to true
For additional support, please open an issue or contact support.
📝 Changelog
Version 1.0.0
- Initial release with Shreddit (new Reddit) support
- Full comment tree structure with nested replies
- Configurable comment limits and sorting
- Support for multiple posts in single run