Reddit Scraper
Pricing
Pay per usage
Pricing
Pay per usage
Rating
0.0
(0)
Developer

Jan Bruinier
Actor stats
0
Bookmarked
2
Total users
1
Monthly active users
3 hours ago
Last modified
Categories
Share
Scrape posts and comments from any subreddit. Uses Reddit's public JSON API -- no API key or login required.
What it does
This actor pulls posts from any public subreddit with full metadata: scores, authors, timestamps, flairs, and more. Optionally fetches the entire comment tree for each post with configurable depth. Supports sorting by hot/new/top/rising and searching within subreddits.
Use cases
- Market research: Track what people are saying about your product or industry on Reddit.
- Content research: Find trending topics and discussions in any niche.
- Sentiment analysis: Collect posts and comments for NLP analysis.
- Competitor monitoring: Set up scheduled runs to track mentions of competitor brands.
- Academic research: Gather discussion data for social science or linguistics studies.
- Lead generation: Find people asking for recommendations in your product category.
Input options
| Parameter | Description | Default |
|---|---|---|
| Subreddit | Subreddit name without "r/" prefix | Required |
| Sort Order | hot, new, top, or rising | hot |
| Timeframe | Time filter for "top" sort (hour/day/week/month/year/all) | week |
| Search Query | Search within the subreddit | Empty (no search) |
| Include Comments | Also scrape comments for each post | No |
| Max Comment Depth | How deep to go into reply chains (1-10) | 3 |
| Max Posts | Number of posts to scrape (up to 1000) | 25 |
Output format
Posts:
{"_type": "post","post_id": "abc123","title": "What's the best Python web framework in 2024?","author": "pythondev42","subreddit": "python","score": 342,"upvote_ratio": 0.95,"num_comments": 89,"created_utc": "2024-01-15T10:30:00+00:00","selftext": "I've been using Flask for years but...","url": "https://reddit.com/r/python/comments/abc123/...","permalink": "https://www.reddit.com/r/python/comments/abc123/...","flair": "Discussion","is_self": true,"is_video": false,"domain": "self.python","awards_count": 2}
Comments (when enabled):
{"_type": "comment","comment_id": "xyz789","post_id": "abc123","author": "django_fan","body": "Django is still the best for larger projects...","score": 156,"created_utc": "2024-01-15T11:45:00+00:00","depth": 0,"parent_id": "t3_abc123","is_submitter": false,"awards_count": 1}
How it works
The actor appends .json to Reddit URLs to get structured data without needing the official API. It handles pagination automatically and respects Reddit's rate limits with built-in delays and retry logic.
Rate limits
Reddit rate-limits unauthenticated JSON requests. The actor handles this automatically:
- 1 second delay between post page fetches
- 1.5 second delay between comment fetches
- Automatic retry with backoff on 429 responses
For large scrapes (100+ posts with comments), runs may take several minutes.
Tips
- Start without comments to quickly scan posts, then re-run with comments for the ones you care about.
- Use the search feature to find niche discussions within large subreddits.
- Sort by "top" with timeframe "all" to get the most popular posts of all time.
- Schedule hourly runs with "new" sort to catch every post in a subreddit.