# Reddit Scraper Pro - Posts, Comments & Sentiment Analysis

High-performance Reddit scraper (99%+ success rate) for automation workflows. Monitor subreddits, track keywords with sentiment analysis, scrape comments, and integrate with n8n/Zapier for powerful automation.
## Why Choose Reddit Scraper Pro?

| Feature | Reddit Scraper Pro | trudax/reddit-scraper | Other Reddit Scrapers |
|---|---|---|---|
| Success Rate | 99%+ | 69-92% | 70-90% |
| User Rating | 4.5+ (target) | 2.50-2.66 | 2.5-3.5 |
| State Management | ✅ No duplicates across runs | ❌ No | ❌ No |
| Webhook Alerts | ✅ Built-in n8n/Zapier | ❌ No | ⚠️ Limited |
| Sentiment Analysis | ✅ AFINN-165 NLP | ❌ No | ❌ No |
| Comment Scraping | ✅ With depth control | ⚠️ Limited | ⚠️ Limited |
| Pricing Model | ✅ $0.001/post (predictable) | ⚠️ Varied | ⚠️ Compute-time |
| Scheduled Runs | ✅ Optimized for automation | ⚠️ Limited | ⚠️ Limited |
## Key Features

- **99%+ Success Rate** - Smart rate limiting, exponential backoff, residential proxy support
- **State Management** - No duplicate posts across scheduled runs (tracks 10,000 recent post IDs)
- **Sentiment Analysis** - Built-in NLP using AFINN-165 (positive/negative/neutral classification)
- **Webhook Integration** - Direct n8n/Zapier support for real-time alerts
- **Comment Scraping** - Configurable depth (0-5 levels) for deep analysis
- **Predictable Pricing** - $0.001 per post ($1 per 1,000 posts) - no surprise costs
- **Incremental Scraping** - Only fetch new posts since the last run for efficient scheduled automation
## Use Cases

### Brand Monitoring with Sentiment Analysis

```json
{
  "searchMode": "keyword",
  "keywords": ["YourBrand", "YourProduct"],
  "subreddits": ["webdev", "SaaS"],
  "analyzeSentiment": true,
  "webhookUrl": "https://n8n.io/webhook/reddit-alerts",
  "maxAgeHours": 24
}
```

Result: Daily alerts when your brand is mentioned, with sentiment context (positive/negative/neutral).

### Subreddit Monitoring for New Posts

```json
{
  "searchMode": "subreddit",
  "subreddits": ["webdev", "programming", "javascript"],
  "maxItemsPerSubreddit": 100,
  "minUpvotes": 10,
  "maxAgeHours": 24
}
```

Result: Daily scrape of high-quality posts (10+ upvotes) from the last 24 hours. Schedule every 24 hours.

### Competitor Tracking Across Reddit

```json
{
  "searchMode": "keyword",
  "keywords": ["Competitor1", "Competitor2"],
  "subreddits": ["webdev", "SaaS", "startups"],
  "minUpvotes": 5,
  "analyzeSentiment": true,
  "includeComments": true,
  "commentDepth": 1
}
```

Result: Track competitor mentions with engagement metrics, sentiment, and top-level comments.

### Trend Analysis with Comments

```json
{
  "searchMode": "keyword",
  "keywords": ["AI", "ChatGPT", "GPT-4", "Claude"],
  "maxItemsPerSubreddit": 500,
  "analyzeSentiment": true,
  "includeComments": true,
  "commentDepth": 2
}
```

Result: Deep analysis of AI trends with comment discussions (2 levels deep).
## Input Configuration

### Quick Start (Default Settings)

Just click "Start" to test with the default configuration:

```json
{
  "searchMode": "keyword",
  "keywords": ["AI", "ChatGPT", "web scraping"],
  "subreddits": ["webdev", "programming", "technology"],
  "maxItemsPerSubreddit": 50
}
```

### Search Modes

#### 1. Keyword Search (recommended for automation)

Search all of Reddit (or specific subreddits) for keywords:

```json
{
  "searchMode": "keyword",
  "keywords": ["Apify", "web scraping"],
  "subreddits": ["webdev", "programming"]
}
```

Leave `subreddits` empty to search all of Reddit.

#### 2. Subreddit Monitoring

Monitor specific subreddits (filtered by keywords if provided):

```json
{
  "searchMode": "subreddit",
  "subreddits": ["webdev", "javascript"],
  "keywords": ["Apify"]
}
```
### Essential Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `searchMode` | enum | `"keyword"` | `keyword` or `subreddit` |
| `keywords` | array | `["AI", "ChatGPT", ...]` | Keywords to search/filter |
| `subreddits` | array | `["webdev", ...]` | Subreddits to monitor (without "r/") |
| `maxItemsPerSubreddit` | integer | `50` | Max posts per subreddit/query (1-10,000) |
| `minUpvotes` | integer | `5` | Minimum upvotes filter (quality control) |
| `maxAgeHours` | integer | `168` | Max post age in hours (168 = 7 days) |
| `includeComments` | boolean | `false` | Fetch comments (increases runtime) |
| `commentDepth` | integer | `1` | Comment reply depth (0-5 levels) |
| `analyzeSentiment` | boolean | `true` | Enable sentiment analysis |
| `webhookUrl` | string | - | POST results to this URL |
| `useApifyProxy` | boolean | `true` | REQUIRED - use residential proxies |
| `proxyGroups` | array | `["RESIDENTIAL"]` | Proxy type (keep as RESIDENTIAL) |
### Advanced Filtering

Control data quality with filters:

```json
{
  "minUpvotes": 10,
  "maxAgeHours": 24,
  "includeComments": true,
  "commentDepth": 2
}
```
## Output Schema

Each post includes 43 fields with comprehensive metadata:

```json
{
  "id": "01HQZX9K3P2VQWE8RTGBNM4567",
  "platform": "reddit",
  "type": "post",
  "reddit_id": "abc123",
  "subreddit": "webdev",
  "subreddit_prefixed": "r/webdev",
  "title": "Amazing new web scraping tool",
  "body": "Just discovered Apify and it's incredible...",
  "author": "username",
  "url": "https://apify.com",
  "reddit_url": "https://www.reddit.com/r/webdev/comments/...",
  "score": 156,
  "upvotes": 180,
  "upvote_ratio": 0.88,
  "num_comments": 42,
  "gilded": 2,
  "sentiment_score": 5.2,
  "sentiment_comparative": 0.35,
  "sentiment_label": "positive",
  "keywords_matched": ["Apify"],
  "created_utc": "2025-10-16T10:00:00Z",
  "flair": "Discussion",
  "comments": [...],
  "ingest_meta": {
    "first_seen_at": "2025-10-16T11:00:00Z",
    "scrape_run_id": "...",
    "actor_run_id": "..."
  }
}
```
### Sentiment Analysis Fields

- `sentiment_score`: Summed AFINN word scores (individual words range from -5 to +5, 0 = neutral; post totals can exceed that range)
- `sentiment_comparative`: Score normalized per word
- `sentiment_label`: `positive`, `negative`, or `neutral`

Example sentiment interpretation:

- `sentiment_score: 5.2, label: "positive"` → Very positive post
- `sentiment_score: -3.8, label: "negative"` → Negative sentiment
- `sentiment_score: 0.5, label: "neutral"` → Neutral/mixed
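For reference, the same kind of scoring can be reproduced with the open-source `sentiment` npm package, which ships the AFINN-165 lexicon. A minimal sketch; the actor's exact labeling thresholds aren't documented here, so the cutoffs below are illustrative assumptions:

```javascript
// AFINN-165 scoring sketch using the open-source `sentiment` package (npm install sentiment).
import Sentiment from 'sentiment';

const analyzer = new Sentiment();

// Hypothetical labeler mirroring the actor's three-way classification.
// The ±0.05 comparative cutoffs are illustrative, not the actor's exact values.
function classify(text) {
  const { score, comparative } = analyzer.analyze(text);
  const label =
    comparative > 0.05 ? 'positive' :
    comparative < -0.05 ? 'negative' :
    'neutral';
  return { sentiment_score: score, sentiment_comparative: comparative, sentiment_label: label };
}

console.log(classify("Just discovered Apify and it's incredible"));
// → { sentiment_score: ..., sentiment_comparative: ..., sentiment_label: 'positive' }
```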
## Dataset Views in Apify Console

The actor provides 3 optimized views:

- **Overview** - Main view with 8 key fields (title, subreddit, author, score, comments, sentiment, date, link)
- **Engagement & Sentiment** - Combined metrics for deeper insights (score, upvote ratio, awards, sentiment)
- **Full Details** - Complete dataset with all 43 fields for advanced analysis
## Webhook Integration

### n8n Workflow Example

1. Create a webhook trigger in n8n
2. Configure Reddit Scraper Pro:

   ```json
   {
     "searchMode": "keyword",
     "keywords": ["YourBrand"],
     "webhookUrl": "https://your-n8n.com/webhook/reddit",
     "analyzeSentiment": true
   }
   ```

3. Process the results in n8n:
   - Filter by sentiment: `sentiment_label === "negative"`
   - Send negative mentions to Slack/email
   - Store all results in Airtable/Google Sheets
   - Create follow-up tasks in ClickUp/Asana
### Zapier Workflow Example

Use Zapier's "Webhooks by Zapier" trigger with the `webhookUrl` configuration.

### Webhook Payload (Batch Mode)

Results are sent at the end of each run:

```json
{
  "type": "batch",
  "timestamp": "2025-10-16T11:00:00Z",
  "runId": "abc123",
  "posts": [
    { /* post 1 */ },
    { /* post 2 */ }
  ],
  "stats": {
    "totalPosts": 42,
    "totalComments": 180,
    "keywordMatches": 38,
    "sentimentDistribution": { "positive": 15, "negative": 8, "neutral": 19 }
  }
}
```
## Incremental Scraping with State Management

Reddit Scraper Pro tracks seen posts across runs:

- ✅ **No Duplicates** - Already-seen posts are automatically skipped
- ✅ **Efficient** - Only fetches new content since the last run
- ✅ **Scalable** - Keeps the last 10,000 post IDs in state
- ✅ **Perfect for Scheduled Runs** - Hourly, daily, or weekly automation

How it works:

- First run: scrapes 100 posts → saves their IDs to state
- Second run: fetches posts → skips 80 already seen → processes 20 new posts
- State persists across all runs indefinitely
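A minimal sketch of this dedup pattern using the Apify SDK's key-value store; the `SEEN_POST_IDS` key name and the `fetchPosts` stub are illustrative assumptions, not the actor's actual internals:

```javascript
// Dedup-across-runs sketch using the Apify SDK (v3) key-value store.
import { Actor } from 'apify';

// Placeholder for the actual Reddit fetch; returns objects with a reddit_id field.
async function fetchPosts() {
  return [];
}

await Actor.init();

const seen = new Set((await Actor.getValue('SEEN_POST_IDS')) ?? []);
const fetched = await fetchPosts();
const fresh = fetched.filter((post) => !seen.has(post.reddit_id));

await Actor.pushData(fresh); // only new posts are stored (and charged)

// Keep state bounded to the most recent 10,000 IDs
const updated = [...seen, ...fresh.map((p) => p.reddit_id)].slice(-10000);
await Actor.setValue('SEEN_POST_IDS', updated);

await Actor.exit();
```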
 
## Pricing

**Pay-Per-Event Model: $0.001 per post ingested**

Cost examples:

- 100 posts = $0.10
- 1,000 posts = $1.00
- 10,000 posts = $10.00

Plus Apify platform costs:

- Compute time (minimal - roughly $0.01-0.05 per run)
- Residential proxy bandwidth (roughly $2-5 per GB)

Cost optimization tips:

- Use `maxItemsPerSubreddit` to limit scraping
- Set `maxAgeHours` to 24 for daily monitoring
- Use the `minUpvotes` filter to focus on quality content
- Disable `includeComments` unless needed (significantly reduces runtime)
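As a quick sanity check, the per-event cost is a straight multiplication (compute time and proxy bandwidth, noted above, come on top and vary per run):

```javascript
// Per-event cost estimate: $0.001 per ingested post.
// Apify compute time and residential proxy bandwidth are billed separately.
const estimateEventCost = (posts) => posts * 0.001;

console.log(estimateEventCost(100));   // 0.1  → $0.10
console.log(estimateEventCost(1000));  // 1    → $1.00
console.log(estimateEventCost(10000)); // 10   → $10.00
```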
## Critical: Residential Proxies Required

Reddit blocks datacenter IPs. You MUST use residential proxies:

```json
{
  "useApifyProxy": true,
  "proxyGroups": ["RESIDENTIAL"]
}
```

Why?

- Datacenter proxies get instant 403/429 errors
- Residential proxies mimic real users → 99%+ success rate
- This is non-negotiable for reliability

Cost: Residential proxies cost $2-5 per GB (separate from per-event pricing, billed by Apify).
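Inside an Actor, this input maps to the Apify SDK's proxy configuration. A minimal sketch using standard SDK v3 calls; the wiring around them is illustrative:

```javascript
// Residential proxy wiring sketch with the Apify SDK (v3).
import { Actor } from 'apify';

await Actor.init();

const proxyConfiguration = await Actor.createProxyConfiguration({
  groups: ['RESIDENTIAL'], // Reddit blocks datacenter IPs
});

// Each newUrl() call can rotate to a fresh residential IP
const proxyUrl = await proxyConfiguration.newUrl();
console.log('Routing requests through:', proxyUrl);

await Actor.exit();
```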
## Rate Limiting & Reliability

Reddit's unofficial JSON API allows roughly 60 requests per minute. The scraper handles this automatically:

- Smart rate limiting (1.2-second delays between requests)
- Exponential backoff on 429 errors
- `X-Ratelimit` headers tracked automatically
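A simplified sketch of the pacing-plus-backoff pattern described above; the 1.2-second base delay comes from this section, while the retry count and User-Agent are illustrative:

```javascript
// Fixed pacing plus exponential backoff on HTTP 429 (Node 18+, global fetch).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fetchWithBackoff(url, maxRetries = 5) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    await sleep(1200); // ~50 requests/minute, safely under Reddit's ~60/min limit
    const res = await fetch(url, { headers: { 'User-Agent': 'reddit-scraper-pro' } });
    if (res.status === 429) {
      await sleep(1200 * 2 ** (attempt + 1)); // back off: 2.4 s, 4.8 s, 9.6 s, ...
      continue;
    }
    if (!res.ok) throw new Error(`Request failed: ${res.status}`);
    return res.json();
  }
  throw new Error('Rate limited: retries exhausted');
}
```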
 
Best practices:

- Don't run multiple instances simultaneously
- Schedule runs at least 10 minutes apart
- Use state management to avoid reprocessing
## Recommended Scheduling

### Brand Monitoring

- Frequency: Every 6-12 hours
- Config: `maxAgeHours: 12`, `analyzeSentiment: true`, `webhookUrl: "..."`

### Subreddit Monitoring

- Frequency: Daily (every 24 hours)
- Config: `maxAgeHours: 24`, `minUpvotes: 5`

### Trend Analysis

- Frequency: Weekly
- Config: `maxAgeHours: 168`, `maxItemsPerSubreddit: 500`, `includeComments: true`

### Competitor Tracking

- Frequency: Daily
- Config: `maxAgeHours: 24`, `analyzeSentiment: true`, `webhookUrl: "..."`
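Putting one of these together, a complete input for the daily competitor-tracking schedule might look like this (keyword and webhook values are placeholders):

```json
{
  "searchMode": "keyword",
  "keywords": ["Competitor1", "Competitor2"],
  "maxAgeHours": 24,
  "minUpvotes": 5,
  "analyzeSentiment": true,
  "webhookUrl": "https://your-n8n.com/webhook/reddit",
  "useApifyProxy": true,
  "proxyGroups": ["RESIDENTIAL"]
}
```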
## FAQ

### How do I scrape Reddit without an API key?

This scraper uses Reddit's unofficial JSON API (e.g., /r/subreddit.json), which doesn't require authentication or API keys. No Reddit account needed!

### Can I use this with n8n or Zapier?

Yes! Set the `webhookUrl` parameter to send results directly to n8n, Zapier, or Make. The scraper posts a JSON payload with all results and stats at the end of each run.

### How accurate is the sentiment analysis?

The sentiment analysis uses AFINN-165, a research-validated lexicon with ~2,500 words. Accuracy is typically 70-80% for social media text. It is best for detecting overall positive/negative/neutral trends rather than nuanced emotion.

### What's the difference between keyword and subreddit mode?

- Keyword mode searches all of Reddit (or specific subreddits) for posts matching keywords
- Subreddit mode monitors specific subreddits and optionally filters by keywords

For brand monitoring, use keyword mode. For community monitoring, use subreddit mode.

### Can I schedule this to run automatically?

Yes! Use Apify's built-in scheduler or integrate with n8n/Zapier for custom schedules. We recommend running every 6-12 hours for brand monitoring, or daily for subreddit monitoring.

### Will I get duplicate posts across runs?

No! The state management system tracks the last 10,000 post IDs seen. On subsequent runs, already-seen posts are automatically skipped. This is critical for scheduled automation.

### Why do I need residential proxies?

Reddit aggressively blocks datacenter IPs (including Apify's infrastructure). Without residential proxies, you'll get instant 403/429 errors. Residential proxies cost $2-5/GB but are essential for reliability.

### How many posts can I scrape per run?

Technically unlimited, but Reddit's API typically returns ~1,000 posts per subreddit. Use `maxItemsPerSubreddit` to control volume and cost.

### Can I scrape private subreddits?

No, the unofficial JSON API only accesses public subreddits. Private/quarantined subreddits require authentication via Reddit's official API.

### What format is the output?

JSON by default, but you can export to CSV, Excel, HTML, or XML. The output is flat (not deeply nested) for easy import into Google Sheets, databases, or automation tools.
## API Integration

### Using the Apify API (cURL)

```bash
curl -X POST https://api.apify.com/v2/acts/YOUR_USERNAME~reddit-scraper-pro/runs \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"searchMode": "keyword", "keywords": ["Apify"], "maxItemsPerSubreddit": 50}'
```

### Using the Apify JavaScript Client

```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('YOUR_USERNAME/reddit-scraper-pro').call({
  searchMode: 'keyword',
  keywords: ['Apify'],
  maxItemsPerSubreddit: 50,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);
```

### Using the Apify Python Client

```python
from apify_client import ApifyClient

client = ApifyClient('YOUR_API_TOKEN')

run = client.actor('YOUR_USERNAME/reddit-scraper-pro').call(run_input={
    'searchMode': 'keyword',
    'keywords': ['Apify'],
    'maxItemsPerSubreddit': 50,
})

items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)
```
## Testing & Development

### Quick Test with Default Input

Just click "Start" or use `apify call` with no input:

```bash
$ apify call YOUR_USERNAME/reddit-scraper-pro
```

### Local Development

```bash
cd reddit-scraper-pro
npm install
npm run build

# Test with pay-per-event billing
ACTOR_TEST_PAY_PER_EVENT=true ACTOR_USE_CHARGING_LOG_DATASET=true npm run dev
```
## Limitations

- **Rate Limits**: Reddit enforces ~60 requests/minute (handled automatically)
- **Historical Data**: Limited by Reddit's API (typically ~1,000 posts per subreddit)
- **Private Subreddits**: Cannot access private/quarantined subreddits
- **Deleted Content**: Cannot retrieve deleted posts/comments
- **Residential Proxies Required**: Datacenter IPs are blocked by Reddit
- **Unofficial API**: Uses unofficial JSON endpoints (may break if Reddit changes them)
## Support & Contact

- Email: kontakt@barrierefix.de
- Issues: Report bugs or request features via GitHub Issues
- Documentation: This README + inline code comments
## Success Story

Use case: "Alert me when r/webdev mentions 'Apify'"

Setup:

```json
{
  "searchMode": "keyword",
  "keywords": ["Apify"],
  "subreddits": ["webdev"],
  "webhookUrl": "https://n8n.io/webhook/reddit-alerts",
  "analyzeSentiment": true,
  "maxAgeHours": 12
}
```

Schedule: Every 6 hours

Result: Instant Slack notifications when Apify is mentioned in r/webdev, with sentiment context (positive/negative/neutral). Perfect for brand monitoring and community engagement.
## Explore More of Our Actors

### Social Media & Community

| Actor | Description |
|---|---|
| Discord Scraper Pro | Extract Discord messages and chat history for community insights |
| YouTube Comments Harvester | Comprehensive YouTube comments scraper with channel-wide enumeration |
| Alt Network Bundle | Scrape Bluesky and Mastodon posts for alternative social media research |

### E-commerce & Brand Monitoring

| Actor | Description |
|---|---|
| Shopify Scraper Pro | Extract Shopify product data for competitor and brand monitoring |
| Etsy Scraper Pro | Fast Etsy product scraper with ratings and reviews |
| Amazon Reviews Scraper | Extract Amazon customer reviews for sentiment analysis |

### Business Intelligence

| Actor | Description |
|---|---|
| Indeed Salary Analyzer | Get salary data for compensation benchmarking and market research |
| Crunchbase Scraper | Extract company data and funding information for business intelligence |
## SEO Keywords
Reddit scraper, Reddit data extraction, scrape Reddit posts, Reddit API alternative, Reddit comment scraper, Reddit sentiment analysis, Reddit brand monitoring, Reddit automation, extract Reddit data, Reddit web scraping, Reddit post scraper, Reddit data mining, Reddit competitor analysis, Reddit keyword tracking, Reddit subreddit monitor, how to scrape Reddit, Reddit market research, Reddit trend analysis, Reddit automation tool, n8n Reddit integration, Zapier Reddit scraper, Make Reddit automation
Built with ❤️ by Barrierefix | Powered by Apify | LICENSE