Reddit Scraper
Pricing
from $1.95 / 1,000 scraped results
Scrape Reddit posts, comments, profiles, and subreddits without an API key. Full comment threading, media extraction (video, gallery, embed), flair, and awards. NSFW filtering. Hot/new/top/rising sorting and search.
Developer: junipr
Scrape Reddit posts, comments, user profiles, and subreddit metadata without a Reddit API key. Features full comment tree threading, media extraction, NSFW filtering, and configurable content filters. Works out of the box with zero configuration — just provide a subreddit name or Reddit URL and start collecting data.
Why Use This Over the Reddit API
Reddit deprecated its free API tier in 2023. The official API now costs $0.24 per 1,000 API calls with a $100/month minimum commitment, and it requires OAuth application setup and rate-limit quota management.
This actor bypasses the API entirely using web scraping. No API key, no OAuth, no monthly subscription. It also provides features the Reddit API does not: media URL resolution for external hosts (Imgur, Gfycat), NSFW content filtering with safe defaults, and structured flair, award, and crosspost data that competing scrapers ignore.
| Feature | This Actor | epctex Reddit Scraper | Reddit API (Official) |
|---|---|---|---|
| API key required | No | No | Yes (OAuth) |
| Monthly subscription | No | No | $100/mo minimum |
| Comment threading | Full tree structure | Flat list | Full tree |
| Media extraction | Video, gallery, embed, external | Images only | URLs only |
| NSFW filtering | Configurable (NSFW content excluded by default) | No | Basic |
| Rate limit handling | Auto backoff + rotation | Manual retry | Built-in quotas |
| Cost per 1K posts | $1.90 | ~$4.00 | $0.24 (API calls only) |
How to Use
Zero-config example — scrape hot posts from r/technology:
{"scrapeType": "posts","subreddits": ["technology"],"maxResults": 50}
Search Reddit for brand mentions:
{"scrapeType": "search","searchQuery": "your brand name","sort": "new","maxResults": 200}
Get all comments on a specific post:
{"scrapeType": "comments","urls": ["https://www.reddit.com/r/AskReddit/comments/abc123/your_post_title/"]}
Scrape top posts from this week with comment trees:
{"scrapeType": "posts","subreddits": ["technology", "programming"],"sort": "top","timeFilter": "week","maxResults": 100,"includeComments": true,"maxCommentsPerPost": 50}
Scrape subreddit metadata:
{"scrapeType": "subreddit","subreddits": ["technology", "programming", "webdev"]}
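Any of the inputs above can also be submitted from code. The following is a minimal sketch, assuming the `apify-client` Python package and its `actor(...).call(run_input=...)` and `dataset(...).iterate_items()` methods. The actor ID in the usage comment is a hypothetical placeholder, not taken from this page, and `collect_results` is an illustrative helper name.

```python
from typing import Any, Dict, List


def collect_results(client: Any, actor_id: str, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Start the actor, wait for the run to finish, and return its dataset items."""
    # call() blocks until the run ends and returns the run record.
    run = client.actor(actor_id).call(run_input=run_input)
    # Results are stored in the run's default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (requires `pip install apify-client` and an Apify API token):
#
#   from apify_client import ApifyClient
#
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   posts = collect_results(
#       client,
#       "junipr/reddit-scraper",  # hypothetical ID: copy the real one from the actor page
#       {"scrapeType": "posts", "subreddits": ["technology"], "maxResults": 50},
#   )
#   for post in posts:
#       print(post["score"], post["title"])
```

Passing the client in as an argument keeps the helper independent of how the token is loaded and makes it easy to exercise with a stub.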
Input Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| scrapeType | string | "posts" | What to scrape: posts, comments, subreddit, userProfile, search |
| urls | string[] | [] | Direct Reddit URLs (posts, subreddits, users) |
| searchQuery | string | (none) | Search query (required for search type) |
| subreddits | string[] | [] | Subreddit names without r/ |
| sort | string | "hot" | Sort: hot, new, top, rising, controversial |
| timeFilter | string | "all" | Time filter for top/controversial: hour, day, week, month, year, all |
| maxResults | integer | 100 | Maximum results (1–100,000) |
| includeComments | boolean | true | Include comments when scraping posts |
| maxCommentsPerPost | integer | 100 | Max comments per post (0–10,000) |
| commentDepth | integer | 10 | Max reply nesting depth (1–20) |
| includeNsfw | boolean | false | Include NSFW content |
| minScore | integer | (none) | Minimum post/comment score |
| excludeAuthors | string[] | ["AutoModerator", "[deleted]"] | Authors to exclude |
| flattenComments | boolean | false | Flatten comment trees to a flat list |
| proxyConfiguration | object | Apify residential | Proxy settings |
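Checking an input locally before starting a run can catch typos early. The sketch below encodes only the allowed values and ranges listed in the table. `validate_input` is a hypothetical helper, not part of the actor, and the actor's own validation may differ.

```python
from typing import Any, Dict

# Allowed values and inclusive ranges, copied from the parameter table.
SCRAPE_TYPES = {"posts", "comments", "subreddit", "userProfile", "search"}
SORTS = {"hot", "new", "top", "rising", "controversial"}
TIME_FILTERS = {"hour", "day", "week", "month", "year", "all"}
RANGES = {
    "maxResults": (1, 100_000),
    "maxCommentsPerPost": (0, 10_000),
    "commentDepth": (1, 20),
}


def validate_input(run_input: Dict[str, Any]) -> Dict[str, Any]:
    """Return run_input unchanged, or raise ValueError on a documented constraint."""
    scrape_type = run_input.get("scrapeType", "posts")
    if scrape_type not in SCRAPE_TYPES:
        raise ValueError(f"unknown scrapeType: {scrape_type!r}")
    if scrape_type == "search" and not run_input.get("searchQuery"):
        raise ValueError("searchQuery is required when scrapeType is 'search'")
    if run_input.get("sort", "hot") not in SORTS:
        raise ValueError(f"unknown sort: {run_input['sort']!r}")
    if run_input.get("timeFilter", "all") not in TIME_FILTERS:
        raise ValueError(f"unknown timeFilter: {run_input['timeFilter']!r}")
    for key, (low, high) in RANGES.items():
        if key in run_input and not low <= run_input[key] <= high:
            raise ValueError(f"{key} must be between {low} and {high}")
    return run_input
```

For example, `validate_input({"scrapeType": "search"})` raises because `searchQuery` is missing, while the zero-config posts example passes through unchanged.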
Output Format
Each post includes full metadata, media, and optionally a threaded comment tree:
{"id": "t3_abc123","title": "Example Post Title","body": "Post body text in markdown...","author": "reddit_user","subreddit": "technology","score": 1542,"upvoteRatio": 0.94,"numComments": 231,"createdUtc": "2024-01-15T10:30:00.000Z","isNsfw": false,"media": {"type": "image","images": [{ "url": "https://...", "width": 1920, "height": 1080, "caption": null }],"video": null,"embed": null},"comments": [{"id": "t1_xyz789","author": "commenter","body": "Great post!","score": 45,"depth": 0,"isOp": false,"replies": [{"id": "t1_abc456","body": "Thanks!","depth": 1,"replies": []}]}]}
Comments maintain full tree structure with replies arrays. Set flattenComments: true for a flat list with depth and parentId fields for CSV/spreadsheet use.
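If you already have tree-shaped output and want rows after the fact, the same flattening can be done client-side. This is a minimal sketch of a depth-first walk over the `replies` arrays. It is not the actor's implementation, and using `None` as the `parentId` of top-level comments is an assumption (the actor may use the post ID instead).

```python
from typing import Any, Dict, Iterable, Iterator, Optional


def flatten_comments(
    comments: Iterable[Dict[str, Any]], parent_id: Optional[str] = None
) -> Iterator[Dict[str, Any]]:
    """Yield each comment without its replies, parents before children."""
    for comment in comments:
        # Copy every field except the nested replies.
        row = {key: value for key, value in comment.items() if key != "replies"}
        row["parentId"] = parent_id  # assumption: None marks a top-level comment
        yield row
        # Recurse into the children, passing this comment's id as their parent.
        yield from flatten_comments(comment.get("replies", []), comment["id"])
```

Calling `list(flatten_comments(post["comments"]))` on the sample post above yields two rows: `t1_xyz789` with no parent, then `t1_abc456` with `parentId` set to `t1_xyz789`.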
Tips and Best Practices
Rate limiting: Reddit aggressively rate-limits scrapers. The default requestDelay of 1000 ms is a safe baseline; lower values increase speed but risk 429 (Too Many Requests) errors. The actor handles rate limits automatically with exponential backoff.
Proxy selection: Reddit blocks datacenter proxies. Use residential proxies (the default) for reliable scraping. The actor rotates IPs on rate limit responses.
Large subreddits: For subreddits with millions of posts, use sort: "top" with a timeFilter to get manageable result sets. Scraping all of r/AskReddit would take days, so filter first.
NSFW filtering: NSFW content is excluded by default. Set includeNsfw: true to include it. When scraping NSFW subreddits with the default setting, zero results are returned.
Comment threading: Comments are returned as a tree by default. Each comment has a replies array containing child comments. Use flattenComments: true if you need a flat structure for data analysis.
Pricing
Pay-Per-Event: $1.90 per 1,000 results.
Pricing includes all platform compute costs — no hidden fees.
A "result" is one successfully scraped item pushed to the dataset. A post counts as 1 result. Each comment counts as 1 result. A subreddit info object counts as 1 result.
| Scenario | Items | Cost |
|---|---|---|
| 100 posts, no comments | 100 | $0.19 |
| 100 posts + avg 20 comments each | 2,100 | $3.99 |
| Brand monitoring (500 mentions/week) | 500 | $0.95/week |
| Subreddit metadata (10 subreddits) | 10 | $0.02 |
| Full thread (1 post + 5K comments) | 5,001 | $9.50 |
Items filtered out by score, NSFW, author, or flair filters are NOT billed. Failed requests are NOT billed.
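The scenarios in the table follow from a single formula: billed results times the per-1,000 price. The sketch below reproduces that arithmetic with the $1.90 rate from this section. `estimate_cost` is a hypothetical helper, and real runs may bill fewer items because filtered-out results are not charged.

```python
from decimal import Decimal, ROUND_HALF_UP

PRICE_PER_1000 = Decimal("1.90")  # USD per 1,000 results, as stated in this section


def estimate_cost(
    posts: int,
    avg_comments_per_post: int = 0,
    price_per_1000: Decimal = PRICE_PER_1000,
) -> Decimal:
    """Estimate the cost in USD, counting each post and each comment as one result."""
    results = posts + posts * avg_comments_per_post
    cost = price_per_1000 * results / 1000
    # Round to whole cents, as the scenario table does.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

`estimate_cost(100, 20)` covers 2,100 results and returns `Decimal("3.99")`, matching the second row of the table.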
FAQ
Does this require a Reddit API key?
No. This actor scrapes Reddit's web interface directly. No API key, no OAuth setup, no Reddit developer account needed. It works out of the box.
How does it handle Reddit's rate limiting?
The actor uses exponential backoff when Reddit returns 429 (Too Many Requests) responses. It starts with a 5-second delay and increases to 60 seconds, with random jitter to avoid thundering herd patterns. Proxy IPs are rotated on rate limit hits.
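A retry schedule like the one described can be sketched as follows. Only the 5-second start and the 60-second ceiling come from the answer above. The doubling factor, the size of the jitter, and the name `backoff_delay` are assumptions made for illustration, not the actor's actual code.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    base: float = 5.0,
    cap: float = 60.0,
    max_jitter: float = 1.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if rng is None:
        rng = random.Random()
    # Assumed growth: double on every attempt (5, 10, 20, 40, then capped at 60).
    # The exponent is clamped so very large attempt numbers cannot overflow.
    nominal = min(cap, base * 2 ** min(attempt, 16))
    # Add up to max_jitter seconds so parallel workers do not retry in lockstep.
    return nominal + rng.uniform(0.0, max_jitter)
```

With the defaults, successive retries wait roughly 5, 10, 20, 40, and then 60 seconds, each plus up to one second of jitter.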
Can I scrape private subreddits?
No. Private subreddits require Reddit account authentication, which this actor does not support. The actor detects private subreddits and reports them as errors without wasting retries.
How do I filter NSFW content?
NSFW content is excluded by default (includeNsfw: false). Set includeNsfw: true to include it. When scraping a known NSFW subreddit with the default setting, you will get zero results.
Does it work for non-English subreddits?
Yes. The actor supports full UTF-8 content including emoji, CJK characters, and other scripts. Content is preserved exactly as posted on Reddit.
Can I download images and videos?
The actor extracts media URLs (images, videos, galleries) by default. Direct download to the Apify key-value store is not currently supported — use the extracted URLs with a separate download tool.
What happens if Reddit blocks the scraper?
The actor detects anti-bot pages and retries with different proxy IPs. With residential proxies (the default), blocks are rare. If all retries fail, the URL is logged in the run summary's failedUrls array.
How are comments structured?
Comments are returned as a tree by default. Each comment has a replies array containing nested child comments, preserving Reddit's thread structure. The depth field indicates nesting level (0 = top-level). Set flattenComments: true for a flat list.