Reddit MCP Scraper
Pricing
$5.00 / 1,000 results
Unified Reddit scraper supporting 3 modes: (1) Subreddit posts with content extraction, (2) Post comments with threading, (3) User profiles with metadata. Extract comprehensive data including scores, timestamps, flairs, NSFW flags, and more.
5.0 (3)
Last modified
4 days ago
Reddit MCP Server
A unified Apify MCP (Model Context Protocol) server for comprehensive Reddit scraping. This actor provides a single interface to scrape subreddits, comments, and user profiles using browser automation with Playwright.
Features
Multi-Mode Scraping
This MCP server supports three scraping modes:
- Subreddit Mode - Scrape posts from Reddit subreddits
- Comments Mode - Scrape comments from Reddit posts
- Profile Mode - Scrape user profiles and their posts
Key Capabilities
- Unified Interface - Single actor for all Reddit scraping needs
- Browser Automation - Bypasses API restrictions using Playwright
- No Authentication Required - Scrape public content without login
- Comprehensive Data - Extract all relevant fields and metadata
- Automatic Pagination - Load multiple pages automatically
- NSFW Support - Automatically handles NSFW confirmation dialogs
- Structured Output - Clean JSON data ready for AI consumption
Input Parameters
Common Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `mode` | string | Yes | Scraping mode: `subreddit`, `comments`, or `profile` |
Subreddit Mode Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `subreddits` | array | - | List of subreddit names (without the 'r/' prefix) |
| `maxPosts` | integer | 25 | Maximum posts per subreddit (1-1000) |
| `sort` | string | "hot" | Sort method: hot, new, top, controversial |
| `timeFilter` | string | "day" | Time filter for top/controversial sorts: hour, day, week, month, year, all |
Comments Mode Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `postUrls` | array | - | List of Reddit post URLs to scrape |
| `maxComments` | integer | 100 | Maximum comments per post (1-10000) |
| `expandThreads` | boolean | true | Automatically expand collapsed threads |
Profile Mode Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `usernames` | array | - | List of Reddit usernames (without the 'u/' prefix) |
| `maxPosts` | integer | 100 | Maximum posts per user (1-1000) |
| `section` | string | "submitted" | Profile section: submitted, overview, gilded |
| `sort` | string | "new" | Sort method: hot, new, top, controversial |
Input Examples
Example 1: Scrape Subreddits
    {
      "mode": "subreddit",
      "subreddits": ["python", "programming", "webdev"],
      "maxPosts": 50,
      "sort": "hot",
      "timeFilter": "day"
    }
Example 2: Scrape Comments
    {
      "mode": "comments",
      "postUrls": [
        "https://www.reddit.com/r/programming/comments/1abc123/interesting_discussion/",
        "https://old.reddit.com/r/python/comments/1def456/another_post/"
      ],
      "maxComments": 200,
      "expandThreads": true
    }
Example 3: Scrape User Profiles
    {
      "mode": "profile",
      "usernames": ["spez", "example_user"],
      "maxPosts": 100,
      "section": "submitted",
      "sort": "top"
    }
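The three input shapes above can be assembled and sanity-checked before a run. The following is a minimal sketch, not part of the actor: `build_input`, `MODE_SPECS`, and `PREFIXES` are hypothetical names, the limits and defaults are copied from the parameter tables, and it assumes the target list must be non-empty.

```python
import json

# Limits and defaults copied from the parameter tables above.
# build_input is a hypothetical helper, not something the actor ships.
MODE_SPECS = {
    "subreddit": {"targets": "subreddits", "limit": ("maxPosts", 25, 1000)},
    "comments": {"targets": "postUrls", "limit": ("maxComments", 100, 10000)},
    "profile": {"targets": "usernames", "limit": ("maxPosts", 100, 1000)},
}
PREFIXES = {"subreddits": "r/", "usernames": "u/"}


def build_input(mode, targets, limit=None, **extra):
    """Return an input dict for the given mode, or raise ValueError."""
    if mode not in MODE_SPECS:
        raise ValueError(f"unknown mode: {mode!r}")
    spec = MODE_SPECS[mode]
    limit_key, default, maximum = spec["limit"]
    limit = default if limit is None else limit
    if not 1 <= limit <= maximum:
        raise ValueError(f"{limit_key} must be between 1 and {maximum}")
    if not targets:
        # Assumption: an empty target list is never useful.
        raise ValueError(f"{spec['targets']} must not be empty")
    prefix = PREFIXES.get(spec["targets"])
    if prefix:
        # Names are documented as being passed without the r/ or u/ prefix.
        targets = [t[len(prefix):] if t.startswith(prefix) else t for t in targets]
    return {"mode": mode, spec["targets"]: list(targets), limit_key: limit, **extra}


if __name__ == "__main__":
    print(json.dumps(build_input("subreddit", ["r/python", "webdev"], sort="hot"), indent=2))
```

Extra keyword arguments such as `sort`, `timeFilter`, or `section` are passed through unchanged, so validating their allowed values is left to the actor.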
Output Format
Subreddit Mode Output
Each post includes:
    {
      "subreddit": "python",
      "subreddit_prefixed": "r/python",
      "post_id": "1abc123",
      "post_name": "t3_1abc123",
      "title": "Interesting Python discussion",
      "author": "example_user",
      "selftext": "Post content preview...",
      "score": 456,
      "num_comments": 89,
      "url": "https://old.reddit.com/r/python/comments/...",
      "permalink": "https://old.reddit.com/r/python/comments/...",
      "domain": "self.python",
      "is_self_post": true,
      "link_flair": "Discussion",
      "thumbnail_url": null,
      "created_utc": 1747683628,
      "created_at": "2025-10-31T12:30:00",
      "is_stickied": false,
      "is_locked": false,
      "is_nsfw": false
    }
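As an illustration of consuming these records, the sketch below ranks posts by `score` while skipping stickied and, optionally, NSFW entries. `top_posts` is a hypothetical helper, and the sample records are made up and carry only the fields the function reads.

```python
def top_posts(posts, limit=10, include_nsfw=False):
    """Return up to `limit` posts, highest score first, skipping stickied ones."""
    kept = [
        post for post in posts
        if not post.get("is_stickied")
        and (include_nsfw or not post.get("is_nsfw"))
    ]
    # Treat a missing or null score as 0 so sorting never fails.
    return sorted(kept, key=lambda post: post.get("score") or 0, reverse=True)[:limit]


# Made-up sample records using field names from the output above.
sample = [
    {"post_id": "a", "title": "Pinned rules", "score": 900, "is_stickied": True, "is_nsfw": False},
    {"post_id": "b", "title": "Release notes", "score": 456, "is_stickied": False, "is_nsfw": False},
    {"post_id": "c", "title": "Quick question", "score": None, "is_stickied": False, "is_nsfw": False},
    {"post_id": "d", "title": "Flagged post", "score": 700, "is_stickied": False, "is_nsfw": True},
]

if __name__ == "__main__":
    for post in top_posts(sample):
        print(post["post_id"], post["title"])
```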
Comments Mode Output
Each comment includes:
    {
      "comment_id": "abc123xyz",
      "comment_name": "t1_abc123xyz",
      "author": "example_user",
      "text": "This is a great discussion!",
      "score": 42,
      "awards_count": 2,
      "permalink": "https://old.reddit.com/r/...",
      "post_url": "https://old.reddit.com/r/...",
      "depth": 0,
      "parent_comment_id": null,
      "is_op": false,
      "is_edited": true,
      "is_stickied": false,
      "created_utc": 1728912645,
      "created_at": "2025-10-31T12:30:45"
    }
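Because comments arrive as flat records, the threading has to be rebuilt by the consumer. A minimal sketch follows, assuming `parent_comment_id` holds the parent's `comment_id` (not its `t1_`-prefixed name) and is `null` for top-level comments; `build_thread` and `render` are hypothetical helpers.

```python
def build_thread(comments):
    """Nest flat comment records under their parents via parent_comment_id."""
    nodes = {c["comment_id"]: {**c, "replies": []} for c in comments}
    roots = []
    for node in nodes.values():
        parent = nodes.get(node.get("parent_comment_id"))
        if parent is None:
            # Top-level comment, or the parent was not scraped.
            roots.append(node)
        else:
            parent["replies"].append(node)
    return roots


def render(nodes, indent=0):
    """Return the thread as indented text lines, one per comment."""
    lines = []
    for node in nodes:
        lines.append("  " * indent + node["text"])
        lines.extend(render(node["replies"], indent + 1))
    return lines


# Made-up sample records using field names from the output above.
comments = [
    {"comment_id": "c1", "parent_comment_id": None, "depth": 0, "text": "Top-level"},
    {"comment_id": "c2", "parent_comment_id": "c1", "depth": 1, "text": "Reply"},
    {"comment_id": "c3", "parent_comment_id": "c2", "depth": 2, "text": "Nested reply"},
    {"comment_id": "c4", "parent_comment_id": "missing", "depth": 1, "text": "Orphan"},
]

if __name__ == "__main__":
    print("\n".join(render(build_thread(comments))))
```

Comments whose parent is missing (for example, when `maxComments` cut the thread short) are kept as roots rather than dropped.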
Profile Mode Output
Profile data with posts:
    {
      "username": "spez",
      "post_karma": 0,
      "comment_karma": 0,
      "total_karma": 1047690,
      "account_created": "2005-06-06T04:00:00+00:00",
      "posts": [
        {
          "post_id": "abc123",
          "title": "Announcing new features",
          "author": "spez",
          "subreddit": "announcements",
          "score": 15234,
          "num_comments": 1250,
          "url": "https://old.reddit.com/...",
          "created_at": "2025-10-31T12:30:45",
          "is_stickied": true,
          "is_nsfw": false
        }
      ]
    }
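Profile records nest their posts, which is awkward for tabular tools. The sketch below flattens one profile into one row per post and counts posts per subreddit. `flatten_profile` and `posts_per_subreddit` are hypothetical helpers, and the sample profile is a trimmed, made-up record.

```python
from collections import Counter


def flatten_profile(profile):
    """Return one row per post, each tagged with the profile's username."""
    rows = []
    for post in profile.get("posts") or []:
        row = {"username": profile.get("username")}
        row.update(post)
        rows.append(row)
    return rows


def posts_per_subreddit(profile):
    """Count how many of the scraped posts landed in each subreddit."""
    return Counter(post.get("subreddit") for post in profile.get("posts") or [])


# Made-up sample record using field names from the output above.
profile = {
    "username": "example_user",
    "total_karma": 1234,
    "posts": [
        {"post_id": "p1", "subreddit": "python", "score": 10},
        {"post_id": "p2", "subreddit": "python", "score": 3},
        {"post_id": "p3", "subreddit": "webdev", "score": 7},
    ],
}

if __name__ == "__main__":
    for row in flatten_profile(profile):
        print(row)
    print(posts_per_subreddit(profile).most_common())
```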
Use Cases
Research & Analysis
- Sentiment Analysis - Analyze community opinions across subreddits
- Trend Detection - Track emerging topics and discussions
- User Behavior - Study posting patterns and engagement
- Content Analysis - Build datasets for machine learning
Business Intelligence
- Market Research - Gather user feedback and discussions
- Brand Monitoring - Track mentions and sentiment
- Competitive Analysis - Monitor competitor discussions
- Customer Insights - Understand customer needs and pain points
AI & ML Applications
- Training Data - Build high-quality datasets for AI models
- RAG Systems - Feed Reddit content to retrieval systems
- Chatbot Training - Use conversations for dialogue models
- Content Generation - Analyze successful content patterns
Local Development
Prerequisites
    pip install -r requirements.txt
    playwright install chromium
Create Input File
Create storage/key_value_stores/default/INPUT.json:
{"mode": "subreddit","subreddits": ["python"],"maxPosts": 10}
Run Locally
    cd Reddit/mcp
    apify run
Check Results
Results are saved in storage/datasets/default/
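To inspect a local run, the dataset directory can be read back into Python. This sketch assumes the local storage writes one JSON file per item and may add bookkeeping files whose names start with `__`; that layout is an assumption about the local emulation, and `load_local_dataset` is a hypothetical helper.

```python
import json
from pathlib import Path


def load_local_dataset(directory="storage/datasets/default"):
    """Load every item file in the local dataset directory, in filename order."""
    items = []
    for path in sorted(Path(directory).glob("*.json")):
        if path.name.startswith("__"):
            # Assumed bookkeeping file (e.g. metadata), not a scraped item.
            continue
        with path.open(encoding="utf-8") as handle:
            items.append(json.load(handle))
    return items


if __name__ == "__main__":
    items = load_local_dataset()
    print(f"Loaded {len(items)} items")
```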
Deployment
Using Apify CLI
    # Log in to Apify
    apify login

    # Push to the Apify platform
    apify push
Manual Upload
- Create a new actor in Apify Console
- Upload all files, including `Dockerfile`, `requirements.txt`, and the `.actor/` directory
- Configure input parameters
- Run the actor
API Integration
JavaScript/Node.js
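    const { ApifyClient } = require("apify-client");

    const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

    const input = {
      mode: "subreddit",
      subreddits: ["python", "programming"],
      maxPosts: 50,
      sort: "hot",
    };

    // CommonJS has no top-level await, so run the calls inside an async function.
    async function main() {
      // Start the actor and wait for the run to finish
      const run = await client.actor("YOUR_ACTOR_ID").call(input);
      // Fetch the scraped items from the run's default dataset
      const { items } = await client.dataset(run.defaultDatasetId).listItems();
      console.log(`Scraped ${items.length} posts`);
    }

    main().catch(console.error);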
Python
    from apify_client import ApifyClient

    client = ApifyClient('YOUR_API_TOKEN')

    input_data = {
        'mode': 'subreddit',
        'subreddits': ['python', 'programming'],
        'maxPosts': 50,
        'sort': 'hot'
    }

    # Start the actor and wait for the run to finish
    run = client.actor('YOUR_ACTOR_ID').call(run_input=input_data)

    # Iterate over the scraped items in the run's default dataset
    for item in client.dataset(run['defaultDatasetId']).iterate_items():
        print(f"Post: {item['title']}")
        print(f"Score: {item['score']}")
Performance Tips
Optimize Speed
- Start with lower `maxPosts` values for testing
- Use specific subreddits instead of scraping all posts
- Disable `expandThreads` in comments mode if not needed
- Process fewer URLs/usernames per run
Avoid Rate Limiting
- Rely on the built-in delays between requests
- Don't scrape the same content repeatedly
- Respect Reddit's servers - use reasonable limits
- Consider batching requests across multiple runs
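Batching across runs can be as simple as splitting the target list into fixed-size chunks and building one input per chunk. The sketch below is one possible approach; `batches` and `batched_inputs` are hypothetical helpers, and the chunk size of 5 is an arbitrary illustration rather than a recommended value.

```python
def batches(items, size):
    """Yield consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def batched_inputs(subreddits, per_run=5, max_posts=25):
    """Build one subreddit-mode input per chunk of subreddits."""
    return [
        {"mode": "subreddit", "subreddits": chunk, "maxPosts": max_posts}
        for chunk in batches(list(subreddits), per_run)
    ]


if __name__ == "__main__":
    names = ["python", "programming", "webdev", "golang", "rust", "java", "kotlin"]
    for run_input in batched_inputs(names, per_run=3):
        # Each dict would be passed as the input of a separate actor run.
        print(run_input)
```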
Limitations
- Public Content Only - Cannot scrape private subreddits or profiles
- No Authentication - Requires public access to content
- Rate Limits - Reddit may throttle excessive requests
- Browser-Based - Slower than direct API but more reliable
- Dynamic Content - Some features may break if Reddit updates its layout
Troubleshooting
No Results Returned
- Verify subreddit/username/URL is correct
- Check if content is public (not private/restricted)
- Try with smaller `maxPosts` values first
- Review logs for specific error messages
Timeout Errors
- Content may be loading slowly
- Try with fewer items or smaller limits
- Check if Reddit is accessible from your location
Missing Data Fields
- Some fields may be null if not available
- Deleted content shows "[deleted]" for authors
- Hidden scores may show as 0
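Downstream code can make these cases explicit instead of treating them as real values. The following sketch assumes the `author` and `score` field names from the output examples; `clean_record` and the two flag fields it adds (`author_deleted`, `score_uncertain`) are hypothetical.

```python
def clean_record(record):
    """Return a copy with deleted authors nulled and ambiguous scores flagged."""
    cleaned = dict(record)
    deleted = record.get("author") == "[deleted]"
    if deleted:
        cleaned["author"] = None
    cleaned["author_deleted"] = deleted
    # A score of 0 may be a hidden score rather than a true zero, so flag it
    # for review instead of trusting it. Missing scores are flagged as well.
    cleaned["score_uncertain"] = record.get("score") in (0, None)
    return cleaned


if __name__ == "__main__":
    print(clean_record({"author": "[deleted]", "score": 0, "text": "..."}))
    print(clean_record({"author": "example_user", "score": 42, "text": "..."}))
```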
License
This actor is provided as-is for scraping public Reddit data in accordance with Reddit's terms of service.
Related Actors
- ../reddit/ - Dedicated subreddit scraper
- ../reddit-comment/ - Dedicated comment scraper
- ../reddit-profile/ - Dedicated profile scraper
Notes
- This MCP server uses browser automation to access Reddit's public interface
- Always respect Reddit's robots.txt and terms of service
- Use responsibly and avoid overwhelming Reddit's servers
- Consider implementing additional rate limiting for large-scale scraping
- The actor works best with the Apify platform's infrastructure
Support
For issues, questions, or feature requests, please open an issue in the repository or contact support.
Made with ❤️ for the AI community | Powered by Apify
