Reddit Scraper
Pricing
$30.00/month + usage
Batch-process hundreds of Reddit searches in one run. Each result includes your profileUrl for easy routing. Fast Mode saves 45% on costs. Perfect for social listening, brand monitoring, and research teams tracking multiple topics simultaneously.
Developer

Yevhenii Molodtsov
Reddit Combined Scraper
Batch-process hundreds of Reddit searches in a single run. Built for teams that need to monitor multiple topics, brands, or people across Reddit without running separate scrapers.
Why This Scraper?
Most Reddit scrapers handle one search at a time. This one processes multiple queries in parallel and includes your routing identifier (profileUrl) in every result—so you can easily match posts back to the correct profile, topic, or campaign in your downstream systems.
Perfect for:
- Social listening platforms tracking multiple brands/celebrities
- Research teams monitoring various topics simultaneously
- Marketing agencies handling multiple client campaigns
- Anyone who needs to batch Reddit searches efficiently
Key Features
| Feature | Benefit |
|---|---|
| Bulk Query Processing | Run 100+ searches in one actor run instead of 100 separate runs |
| profileUrl Mirroring | Your identifier echoed in every result for easy routing |
| Fast Mode (45% Cheaper) | DOM extraction instead of per-post HTTP fetching: near-identical data (very long post bodies may be truncated) at lower cost |
| Crash Recovery | Posts saved incrementally, state checkpointed between queries |
| Migration Resilient | Resumes from checkpoint if Apify migrates the actor |
How profileUrl Mirroring Works
When you search for "Kim Kardashian" with profileUrl: "https://myapp.com/profiles/kim", every returned post includes that URL:
```json
{
  "profileUrl": "https://myapp.com/profiles/kim",
  "title": "Kim Kardashian's new business venture...",
  "url": "https://reddit.com/r/entertainment/...",
  ...
}
```
This removes the need for a post-processing join: your pipeline knows immediately which profile each post belongs to.
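As a rough illustration of that routing step, the sketch below groups scraped items by their mirrored profileUrl. The sample items and the `group_by_profile` helper are hypothetical; only the `profileUrl` and `title` field names come from the example result.

```python
from collections import defaultdict


def group_by_profile(items):
    """Group dataset items by their mirrored profileUrl (hypothetical helper)."""
    grouped = defaultdict(list)
    for item in items:
        # Items without a profileUrl are collected under None rather than dropped.
        grouped[item.get("profileUrl")].append(item)
    return dict(grouped)


# Made-up sample items shaped like the result shown earlier.
sample_items = [
    {"profileUrl": "https://myapp.com/profiles/kim", "title": "Post A"},
    {"profileUrl": "https://myapp.com/profiles/taylor", "title": "Post B"},
    {"profileUrl": "https://myapp.com/profiles/kim", "title": "Post C"},
]

by_profile = group_by_profile(sample_items)
for profile_url, posts in by_profile.items():
    print(profile_url, len(posts))
```

Because the identifier travels with each item, no lookup against the original query list is needed.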
Quick Start
```json
{
  "queries": [
    { "profileUrl": "https://myapp.com/kim", "searchQuery": "Kim Kardashian" },
    { "profileUrl": "https://myapp.com/taylor", "searchQuery": "Taylor Swift" }
  ],
  "searchTime": "week",
  "maxItemsPerQuery": 50,
  "fastMode": true
}
```
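If the query list lives in your own system, the input can be generated rather than written by hand. The following is a minimal sketch, assuming each profileUrl is unique so that a plain mapping can hold the targets; `build_input` is a hypothetical helper, and the 1-500 bound is taken from the Results Control table.

```python
import json


def build_input(targets, search_time="week", max_items_per_query=50, fast_mode=True):
    """Build an actor input dict from a {profileUrl: searchQuery} mapping."""
    if not targets:
        raise ValueError("at least one query is required")
    if not 1 <= max_items_per_query <= 500:
        raise ValueError("maxItemsPerQuery must be between 1 and 500")
    return {
        "queries": [
            {"profileUrl": url, "searchQuery": query}
            for url, query in targets.items()
        ],
        "searchTime": search_time,
        "maxItemsPerQuery": max_items_per_query,
        "fastMode": fast_mode,
    }


run_input = build_input({
    "https://myapp.com/kim": "Kim Kardashian",
    "https://myapp.com/taylor": "Taylor Swift",
})
print(json.dumps(run_input, indent=2))
```

The printed JSON can be pasted into the actor's input editor or passed to whichever client you use to start the run.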
Input Parameters
Required
| Parameter | Type | Description |
|---|---|---|
| queries | Array | List of {profileUrl, searchQuery} objects. Batch as many as you need. |
Search Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| searchTime | String | "all" | Time filter: hour, day, week, month, year, all |
| sort | String | "new" | Sort: new, hot, relevance, top, comments |
| includeNSFW | Boolean | true | Include adult content |
| searchMode | String | "raw" | Query mode: raw, exact, and, or (see below) |
| searchType | String | "posts" | Content type: posts (recommended) or all |
Search Mode Explained
| Mode | Input | Transformed Query | Use Case |
|---|---|---|---|
| raw | Emily Miller | Emily Miller | Reddit default (implicit AND) |
| exact | Emily Miller | "Emily Miller" | Exact phrase, prevents typo fixes |
| and | Emily Miller | Emily AND Miller | Explicit AND (same as raw) |
| or | Taylor Swift | Taylor OR Swift | Match any word |
Recommendation: Use exact mode for person names and usernames. It prevents Reddit's spell correction (e.g., nitpicknate is sent as "nitpicknate" rather than being auto-corrected to "nitpick rate") and enables "no results" detection.
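The Search Mode table can be read as a simple string transformation. The sketch below mirrors it in Python; `transform_query` is a hypothetical helper rather than the actor's actual code, and it assumes words are split on whitespace.

```python
def transform_query(query, mode="raw"):
    """Return the query string as the Search Mode table describes it."""
    words = query.split()
    if mode == "raw":
        return query  # passed through unchanged
    if mode == "exact":
        return f'"{query}"'  # wrap the whole phrase in double quotes
    if mode == "and":
        return " AND ".join(words)
    if mode == "or":
        return " OR ".join(words)
    raise ValueError(f"unknown searchMode: {mode}")


for mode in ("raw", "exact", "and", "or"):
    print(mode, "->", transform_query("Emily Miller", mode))
```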
Results Control
| Parameter | Type | Default | Description |
|---|---|---|---|
| maxItemsPerQuery | Number | 25 | Posts per query (1-500) |
| maxItems | Number | (none) | Global cap across all queries |
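To budget a run, it helps to know the largest number of posts the two limits allow. This is a small sketch under the assumption that maxItems acts as a ceiling on the combined total; `max_expected_posts` is a hypothetical helper, and actual results can be lower when a query has fewer matches.

```python
def max_expected_posts(query_count, max_items_per_query=25, max_items=None):
    """Upper bound on posts returned, given the per-query and global limits."""
    per_query_total = query_count * max_items_per_query
    if max_items is None:
        return per_query_total  # no global cap set
    return min(per_query_total, max_items)


print(max_expected_posts(10, max_items_per_query=50))
print(max_expected_posts(10, max_items_per_query=50, max_items=300))
```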
Performance
| Parameter | Type | Default | Description |
|---|---|---|---|
| fastMode | Boolean | true | 45% cheaper DOM extraction (recommended) |
| maxConcurrency | Number | 3 | Parallel queries (2 for 1GB RAM, 3 for 2GB) |
Output Format
Each post is pushed individually to the dataset:
```json
{
  "profileUrl": "https://myapp.com/kim",
  "id": "t3_abc123",
  "dataType": "post",
  "title": "Kim Kardashian announces new skincare line",
  "body": "Full post content here...",
  "url": "https://www.reddit.com/r/entertainment/comments/abc123/...",
  "communityName": "r/entertainment",
  "username": "redditor123",
  "createdAt": "2025-01-27T14:30:00.000Z",
  "upVotes": 1542,
  "numberOfreplies": 234,
  "isNSFW": false
}
```
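A downstream consumer will usually want typed values rather than raw strings. The sketch below parses the `createdAt` timestamp and derives a simple summary from one item; the field names are copied from the sample output, while `parse_created_at` and the `engagement` figure are hypothetical additions for illustration.

```python
from datetime import datetime


def parse_created_at(value):
    """Parse an ISO 8601 UTC timestamp such as 2025-01-27T14:30:00.000Z."""
    # Replace the trailing Z so this also works on Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


post = {
    "profileUrl": "https://myapp.com/kim",
    "id": "t3_abc123",
    "communityName": "r/entertainment",
    "createdAt": "2025-01-27T14:30:00.000Z",
    "upVotes": 1542,
    "numberOfreplies": 234,  # field name spelled as in the sample output
    "isNSFW": False,
}

created = parse_created_at(post["createdAt"])
summary = {
    "profile": post["profileUrl"],
    "subreddit": post["communityName"],
    "created": created,
    # Hypothetical metric: upvotes plus replies.
    "engagement": post["upVotes"] + post["numberOfreplies"],
}
print(summary)
```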
Fast Mode vs Standard Mode
| Aspect | Fast Mode | Standard Mode |
|---|---|---|
| Cost | 45% cheaper | Baseline |
| Speed | ~60 posts/min | ~30 posts/min |
| Post Body | May be truncated for very long posts | Always complete |
| Data Completeness | 95%+ identical | 100% |
| Recommended For | Most use cases | When you need full text of long posts |
How Fast Mode Saves Money
Standard Mode: Browser scrolls search → Collects URLs → HTTP fetches each post individually → Full data extraction
Fast Mode: Browser scrolls search → Extracts data directly from DOM → Done
By skipping the HTTP fetch phase, Fast Mode reduces both compute time and network overhead by approximately 45%.
Real-World Cost Comparison
Based on actual test runs (January 2025):
| Scenario | Mode | Time | Cost |
|---|---|---|---|
| 3 queries, 25 posts each | Fast | 75s | $0.0021 |
| 3 queries, 25 posts each | Standard | 137s | $0.0038 |
| 10 queries, 50 posts each | Fast | ~4 min | ~$0.035 |
| 50 queries, 25 posts each | Fast | ~15 min | ~$0.11 |
Bottom line: Fast Mode delivers nearly identical results (only very long post bodies may be truncated) at roughly 45% lower cost.
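The 45% figure can be checked against the first two rows of the cost table. The arithmetic below uses only those published numbers; `savings_percent` is a hypothetical helper name.

```python
def savings_percent(fast_value, standard_value):
    """Relative reduction of fast_value compared with standard_value, in percent."""
    return (standard_value - fast_value) / standard_value * 100


# 3 queries, 25 posts each: Fast vs Standard, from the table.
cost_saving = savings_percent(0.0021, 0.0038)
time_saving = savings_percent(75, 137)

print(f"cost saving: {cost_saving:.1f}%")
print(f"time saving: {time_saving:.1f}%")
```

Both values come out close to 45%, which matches the claim for that test run. The larger runs in the table have no Standard Mode counterpart, so the same check cannot be repeated for them.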
Memory Recommendations
| Workload | Memory | Concurrency | Notes |
|---|---|---|---|
| 1-5 queries | 1024 MB | 2 | Minimum viable |
| 5-15 queries | 2048 MB | 3 | Recommended |
| 15-50 queries | 2048 MB | 2-3 | Lower concurrency for stability |
| 50+ queries | 4096 MB | 3 | Large batch jobs |
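The table can be turned into a lookup when runs are scheduled programmatically. This sketch makes two assumptions the table leaves open: boundary values (5 and 15) fall into the smaller tier while 50 goes to the largest, and the "2-3" concurrency range is resolved to 2 for stability. `recommend_resources` is a hypothetical helper.

```python
def recommend_resources(query_count):
    """Map a query count to the memory and concurrency suggested in the table."""
    if query_count < 1:
        raise ValueError("query_count must be at least 1")
    if query_count <= 5:
        return {"memory_mb": 1024, "max_concurrency": 2}
    if query_count <= 15:
        return {"memory_mb": 2048, "max_concurrency": 3}
    if query_count < 50:
        # The table lists 2-3 here; 2 is the conservative choice (assumption).
        return {"memory_mb": 2048, "max_concurrency": 2}
    return {"memory_mb": 4096, "max_concurrency": 3}


for n in (3, 10, 30, 80):
    print(n, recommend_resources(n))
```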
Proxy Requirements
Residential proxies are required. Reddit aggressively blocks datacenter IPs.
The default configuration uses Apify's residential proxy group, which works out of the box. If you use custom proxies, ensure they're residential.
Error Handling & Recovery
This scraper is built for reliability:
- Incremental Saving: Each post is pushed to the dataset immediately after extraction. If the actor crashes mid-query, you keep everything collected so far.
- State Checkpoints: After each query completes, state is saved. If Apify migrates the actor, it resumes from the last checkpoint.
- Rate Limit Handling: Built-in delays and exponential backoff prevent Reddit blocks.
- 80% Failure Threshold: If 80%+ of requests fail, the actor stops early to save costs.
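Two of these mechanisms, exponential backoff and the failure threshold, can be sketched generically. The code below is not the actor's implementation: the base delay, the cap, and the minimum request count before the threshold applies are all assumptions chosen for illustration.

```python
def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff: base * 2**attempt seconds, limited to cap."""
    return min(cap, base * (2 ** attempt))


class FailureMonitor:
    """Tracks request outcomes and signals when too many have failed."""

    def __init__(self, threshold=0.8, min_requests=10):
        self.threshold = threshold
        # Assumed guard: do not stop on the first few failures alone.
        self.min_requests = min_requests
        self.total = 0
        self.failed = 0

    def record(self, ok):
        self.total += 1
        if not ok:
            self.failed += 1

    def should_stop(self):
        if self.total < self.min_requests:
            return False
        return self.failed / self.total >= self.threshold


monitor = FailureMonitor()
for ok in [False] * 8 + [True] * 2:
    monitor.record(ok)

print([backoff_delay(attempt) for attempt in range(5)])
print(monitor.should_stop())
```

Stopping once the failure ratio crosses the threshold avoids paying for retries that are unlikely to succeed, for example when the proxy pool is blocked.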
Limitations
- HTTP Fetch Limit: Standard Mode fetches up to ~100 posts per query in the HTTP phase. Any posts beyond that come from browser extraction.
- Fast Mode Text: Very long post bodies (1000+ characters) may be truncated in Fast Mode. Switch to Standard Mode if you need complete text for long-form content.
- Rate Limits: Reddit may throttle aggressive scraping. The scraper handles this automatically, but very large jobs may take longer.
Example Use Cases
Social Listening Platform
Monitor mentions of 50 celebrities across Reddit, routing each mention to the correct profile:
```json
{
  "queries": [
    { "profileUrl": "https://platform.com/celeb/1", "searchQuery": "Taylor Swift" },
    { "profileUrl": "https://platform.com/celeb/2", "searchQuery": "Bad Bunny" }
    // ... 48 more
  ],
  "searchTime": "day",
  "maxItemsPerQuery": 25,
  "fastMode": true
}
```
Brand Monitoring
Track your brand and competitors:
```json
{
  "queries": [
    { "profileUrl": "brand:ours", "searchQuery": "\"Acme Corp\" OR \"Acme Inc\"" },
    { "profileUrl": "brand:competitor1", "searchQuery": "\"Big Corp\"" },
    { "profileUrl": "brand:competitor2", "searchQuery": "\"Other Inc\"" }
  ],
  "searchTime": "week",
  "sort": "relevance",
  "maxItemsPerQuery": 100
}
```
Research Data Collection
Gather posts about specific topics for analysis:
```json
{
  "queries": [
    { "profileUrl": "topic:ai", "searchQuery": "artificial intelligence" },
    { "profileUrl": "topic:ml", "searchQuery": "machine learning" },
    { "profileUrl": "topic:llm", "searchQuery": "large language models" }
  ],
  "searchTime": "month",
  "sort": "top",
  "maxItemsPerQuery": 200,
  "fastMode": false
}
```
Development
```bash
npm install       # Install dependencies
apify run         # Run locally with Apify proxies
npm test          # Run tests
npm run format    # Format code
```
Changelog
- v1.0 - Initial release with Fast Mode, bulk queries, profileUrl mirroring, and migration persistence
License
MIT
Questions? Open an issue on the GitHub repository.