Reddit Scraper — Posts, Comments & Subreddit Data
Pricing
Pay per usage
Extract Reddit posts, comments, and subreddit data via Reddit's public JSON API. Scrape titles, scores, authors, comment threads, dates, and flairs. Sort by hot, new, top, or rising. Suited to market research, sentiment analysis, and content monitoring. No login required.
Developer

Ricardo Akiyoshi
2 hours ago
Last modified
Reddit Thread & Comment Scraper
Scrape Reddit posts, comments, and subreddit data at scale using Reddit's public JSON API. No API keys or authentication required.
What does it do?
This actor extracts structured data from any public subreddit, including:
- Post data: title, score, author, comment count, flair, URL, self-text, creation date
- Comment threads: author, body, score, depth, creation date
- Sorting: hot, new, top (with time filters), rising
- Search: filter posts by keyword within any subreddit
- Pagination: automatically follows Reddit's pagination to collect large datasets
Use Cases
Market Research
Analyze what people are discussing in your industry. Track trending topics, pain points, and product feedback across relevant subreddits.
Sentiment Analysis
Collect posts and comments about your brand, product, or competitors. Feed the structured data into NLP pipelines for sentiment scoring.
Brand Monitoring
Monitor mentions of your brand or product keywords. Track sentiment shifts over time by scraping regularly.
Competitive Intelligence
See what users say about competing products. Identify feature gaps, complaints, and switching triggers.
Content Research
Find trending topics and popular content formats in your niche. Analyze what gets high engagement.
Academic Research
Collect large-scale social media datasets for research purposes with structured, clean output.
Input
| Field | Type | Default | Description |
|---|---|---|---|
| subreddit | string | programming | Subreddit name (without r/ prefix) |
| searchQuery | string | - | Optional keyword search within the subreddit |
| sort | enum | hot | Sort order: hot, new, top, rising |
| timeFilter | enum | week | Time range for top/search: hour, day, week, month, year, all |
| maxPosts | integer | 50 | Maximum posts to scrape (1-1000) |
| includeComments | boolean | false | Whether to fetch comments for each post |
| maxCommentsPerPost | integer | 10 | Max comments per post (1-500) |
| proxy | object | - | Optional proxy configuration |
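To make the defaults and ranges in the table concrete, here is a rough sketch of how an input object could be checked before a run. The function `normalize_input` is hypothetical and is not part of the actor; the defaults and limits are copied from the table, while stripping a leading `r/` is an added convenience and an assumption.

```python
DEFAULTS = {
    "subreddit": "programming",
    "sort": "hot",
    "timeFilter": "week",
    "maxPosts": 50,
    "includeComments": False,
    "maxCommentsPerPost": 10,
}
SORTS = {"hot", "new", "top", "rising"}
TIME_FILTERS = {"hour", "day", "week", "month", "year", "all"}


def normalize_input(raw: dict) -> dict:
    """Fill in the documented defaults and reject out-of-range values."""
    cfg = {**DEFAULTS, **raw}
    # Assumption: tolerate a leading "r/" instead of rejecting it.
    cfg["subreddit"] = cfg["subreddit"].removeprefix("r/")
    if cfg["sort"] not in SORTS:
        raise ValueError(f"sort must be one of {sorted(SORTS)}")
    if cfg["timeFilter"] not in TIME_FILTERS:
        raise ValueError(f"timeFilter must be one of {sorted(TIME_FILTERS)}")
    if not 1 <= cfg["maxPosts"] <= 1000:
        raise ValueError("maxPosts must be between 1 and 1000")
    if not 1 <= cfg["maxCommentsPerPost"] <= 500:
        raise ValueError("maxCommentsPerPost must be between 1 and 500")
    return cfg
```

Fields left out of the input fall back to the table's defaults, so `{"subreddit": "webdev"}` alone is a complete input under this sketch.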
Example Input
{
  "subreddit": "machinelearning",
  "sort": "top",
  "timeFilter": "month",
  "maxPosts": 100,
  "includeComments": true,
  "maxCommentsPerPost": 20
}
Search Example
{
  "subreddit": "webdev",
  "searchQuery": "React vs Vue",
  "sort": "top",
  "timeFilter": "year",
  "maxPosts": 50,
  "includeComments": true,
  "maxCommentsPerPost": 15
}
Output
Each post is saved as a structured JSON object:
{
  "title": "GPT-5 just dropped and it's incredible",
  "score": 4523,
  "upvoteRatio": 0.94,
  "numComments": 892,
  "author": "ai_researcher_42",
  "subreddit": "machinelearning",
  "url": "https://openai.com/blog/gpt-5",
  "selfText": "",
  "permalink": "/r/machinelearning/comments/abc123/gpt5_just_dropped/",
  "fullUrl": "https://www.reddit.com/r/machinelearning/comments/abc123/gpt5_just_dropped/",
  "createdAt": "2026-02-15T14:32:00.000Z",
  "createdUtc": 1771165920,
  "flair": "Discussion",
  "isNSFW": false,
  "isSelf": false,
  "domain": "openai.com",
  "postId": "abc123",
  "comments": [
    {
      "author": "deep_learning_fan",
      "body": "The reasoning capabilities are genuinely impressive...",
      "score": 1205,
      "createdAt": "2026-02-15T14:45:00.000Z",
      "createdUtc": 1771166700,
      "depth": 0
    },
    {
      "author": "skeptical_dev",
      "body": "Has anyone actually benchmarked this vs Claude?",
      "score": 834,
      "createdAt": "2026-02-15T15:02:00.000Z",
      "createdUtc": 1771167720,
      "depth": 1
    }
  ],
  "scrapedAt": "2026-02-28T10:00:00.000Z"
}
Rate Limiting
This actor respects Reddit's rate limits:
- ~60 requests per minute without authentication
- Automatic delays between requests (1-2 seconds)
- Exponential backoff on 429 (Too Many Requests) responses
- Proxy support for higher throughput on large scrapes
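The backoff behaviour listed above follows a common pattern, which can be sketched as below. This is an illustration of exponential backoff on HTTP 429 in general, not the actor's own code: the base delay, cap, and retry count are assumed values, and `send` is a hypothetical callable returning a `(status, body)` pair.

```python
import time
from typing import Callable, Tuple


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2^attempt, capped."""
    return min(cap, base * 2 ** attempt)


def fetch_with_backoff(send: Callable[[str], Tuple[int, str]], url: str,
                       max_retries: int = 5,
                       sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call send(url), retrying with growing delays while it returns 429."""
    for attempt in range(max_retries + 1):
        status, body = send(url)
        if status != 429:
            return status, body
        if attempt < max_retries:
            sleep(backoff_delay(attempt))  # 1s, 2s, 4s, ... up to the cap
    raise RuntimeError(f"still rate limited after {max_retries} retries: {url}")
```

Production implementations often add random jitter to each delay so that parallel workers do not retry in lockstep; it is left out here to keep the delays easy to follow.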
Pay Per Event Pricing
This actor uses Apify's Pay Per Event model. You are charged per post scraped:
| Event | Price |
|---|---|
| Post scraped (without comments) | $0.001 |
| Post scraped (with comments) | $0.003 |
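As a rough budgeting aid, the per-post prices in the table translate into run costs as sketched below. The helper `estimate_cost` is hypothetical, and it assumes every post in a run is billed at a single rate depending on whether comments are included. Any platform charges beyond the two events listed are not covered.

```python
from decimal import Decimal

# Prices per post in USD, taken from the table above.
PRICE_WITHOUT_COMMENTS = Decimal("0.001")
PRICE_WITH_COMMENTS = Decimal("0.003")


def estimate_cost(posts: int, include_comments: bool) -> Decimal:
    """Estimated charge in USD for scraping `posts` posts."""
    price = PRICE_WITH_COMMENTS if include_comments else PRICE_WITHOUT_COMMENTS
    return price * posts


# 100 posts with comments: 100 * 0.003 = 0.30 USD
# 1000 posts without comments: 1000 * 0.001 = 1.00 USD
```

`Decimal` is used instead of floats so that small per-event prices add up without rounding drift.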
Limitations
- Only public subreddits can be scraped
- Reddit's JSON API returns a maximum of ~1000 posts per listing
- Comments are limited to what Reddit returns in the default comment sort (best)
- Very large comment threads may be truncated by Reddit's API
- Rate limiting may slow down large scrapes (use a proxy for best results)
Changelog
1.0.0 (2026-02-28)
- Initial release
- Subreddit post scraping with sort and time filters
- Comment extraction with depth tracking
- Search within subreddits
- Pagination support
- Rate limiting with exponential backoff
- Pay Per Event billing