# Reddit Scraper for Apify
A comprehensive Reddit scraper that extracts posts, comments, user profiles, and trending content from Reddit. Perfect for social media analysis, content research, and data mining.
**Quick Start:** See `./QUICK-START.md` for immediate setup with examples.

**OAuth Ready:** Authentication tested and working! See `./AUTHENTICATION.md`.
## Features
- **Subreddit Post Scraper** - Extract posts from any subreddit with sorting options
- **Comment Thread Extractor** - Get all comments from specific posts (nested structure preserved)
- **User Profile Scraper** - Extract karma, post history, and user information
- **Trending Posts Tracker** - Scrape viral content from r/popular or r/all
- **OAuth Authentication** - Optional Reddit API authentication for higher rate limits
- **Smart Rate Limiting** - Built-in delays to respect Reddit's API limits
- **Flexible Export** - Save data to an Apify dataset (CSV, JSON, Excel)
## Quick Start

### Authentication Options
**No API key required!** The scraper works out of the box using Reddit's public JSON API.

Optional: use Reddit OAuth for higher rate limits and better reliability.

- See `./AUTHENTICATION.md` for details
- Already have credentials? Just set `useAuthentication: true` in your input!
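The public JSON API mentioned above is simply the regular Reddit listing URL with `.json` appended. As a rough illustration (the helper name and capping behavior are assumptions, not the actor's internals), such a URL could be built like this:

```javascript
// Build a public JSON listing URL for a subreddit (no auth required).
// Reddit's listing endpoints cap `limit` at 100 items per request.
function buildListingUrl({ subreddit, sortBy = "hot", timeFilter, limit = 100 }) {
  const url = new URL(`https://www.reddit.com/r/${subreddit}/${sortBy}.json`);
  url.searchParams.set("limit", String(Math.min(limit, 100)));
  // The "t" query parameter only applies to the "top" sort.
  if (sortBy === "top" && timeFilter) url.searchParams.set("t", timeFilter);
  return url.toString();
}

// e.g. buildListingUrl({ subreddit: "programming", sortBy: "top", timeFilter: "week", limit: 10 })
// → "https://www.reddit.com/r/programming/top.json?limit=10&t=week"
```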
### Mode 1: Subreddit Post Scraper
Scrape posts from any subreddit:
```json
{
  "scrapeMode": "subreddit",
  "subreddit": "programming",
  "sortBy": "hot",
  "maxPosts": 100,
  "includeComments": false
}
```
Options:

- `sortBy`: `hot`, `new`, `top`, `rising`
- `timeFilter`: `hour`, `day`, `week`, `month`, `year`, `all` (for "top" sort)
- `includeComments`: set to `true` to fetch comments for each post
Example Output:
```json
{
  "type": "post",
  "id": "abc123",
  "title": "Learn Python in 2024",
  "author": "johndoe",
  "subreddit": "programming",
  "score": 1234,
  "num_comments": 45,
  "url": "https://www.reddit.com/r/programming/...",
  "created_utc": "2024-01-15T10:30:00.000Z",
  "selftext": "Here's my guide...",
  "upvote_ratio": 0.95
}
```
### Mode 2: Comment Thread Extractor
Extract all comments from a specific post:
```json
{
  "scrapeMode": "comments",
  "postUrl": "https://www.reddit.com/r/programming/comments/abc123/post_title/",
  "maxComments": 200
}
```
Example Output:
```json
{
  "type": "comment",
  "id": "xyz789",
  "author": "commenter1",
  "body": "Great post! Here's my thoughts...",
  "score": 56,
  "depth": 0,
  "created_utc": "2024-01-15T11:00:00.000Z",
  "permalink": "https://www.reddit.com/...",
  "is_submitter": false
}
```
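The `depth` field records where each comment sat in the nested thread. As a sketch of how a nested tree can be walked into flat records like these (the listing shape mimics Reddit's API with `data.children` and `t1` nodes; the actor's actual traversal may differ):

```javascript
// Flatten a Reddit-style comment listing into flat records with a depth field.
function flattenComments(listing, depth = 0, out = []) {
  const children = (listing && listing.data && listing.data.children) || [];
  for (const child of children) {
    if (child.kind !== "t1") continue; // skip "more comments" stubs
    const c = child.data;
    out.push({ id: c.id, author: c.author, body: c.body, score: c.score, depth });
    // `replies` is either another listing object or "" when there are none.
    if (c.replies) flattenComments(c.replies, depth + 1, out);
  }
  return out;
}
```

Top-level comments come out with `depth: 0`, their direct replies with `depth: 1`, and so on.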
### Mode 3: User Profile Scraper
Get user karma, posts, and comment history:
```json
{
  "scrapeMode": "user",
  "username": "spez",
  "maxPosts": 50
}
```
Example Output:
```json
{
  "type": "user_profile",
  "username": "spez",
  "link_karma": 12345,
  "comment_karma": 67890,
  "total_karma": 80235,
  "created_utc": "2005-06-06T00:00:00.000Z",
  "is_gold": true,
  "is_employee": true
}
```
### Mode 4: Trending Posts Tracker
Scrape viral content from r/popular:
```json
{
  "scrapeMode": "trending",
  "sortBy": "hot",
  "maxPosts": 100
}
```
## Input Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `scrapeMode` | String | Yes | Mode: `subreddit`, `comments`, `user`, or `trending` |
| `subreddit` | String | For subreddit mode | Subreddit name (without r/) |
| `username` | String | For user mode | Reddit username (without u/) |
| `postUrl` | String | For comments mode | Full URL of the Reddit post |
| `sortBy` | String | No | Sort posts by: `hot`, `new`, `top`, `rising` |
| `timeFilter` | String | No | Time filter for top posts |
| `maxPosts` | Integer | No | Maximum posts to scrape (default: 100) |
| `maxComments` | Integer | No | Maximum comments per post (default: 50) |
| `includeComments` | Boolean | No | Include comments in subreddit scrape |
| `useAuthentication` | Boolean | No | Use Reddit OAuth (higher rate limits) |
| `redditClientId` | String | For auth | Reddit app client ID |
| `redditClientSecret` | String | For auth | Reddit app client secret |
| `redditUserAgent` | String | No | Custom user agent |
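Since the required fields depend on `scrapeMode`, it can help to validate input before starting a run. A minimal sketch (the `REQUIRED` map and helper are illustrative, not the actor's code):

```javascript
// Fields that must be present for each scrape mode.
const REQUIRED = {
  subreddit: ["subreddit"],
  comments: ["postUrl"],
  user: ["username"],
  trending: [],
};

// Return the list of missing required fields for the given input object.
function missingFields(input) {
  if (!(input.scrapeMode in REQUIRED)) {
    throw new Error(`Unknown scrapeMode: ${input.scrapeMode}`);
  }
  return REQUIRED[input.scrapeMode].filter((key) => !input[key]);
}
```

For example, `missingFields({ scrapeMode: "comments" })` reports that `postUrl` is missing.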
## Use Cases
### 1. Market Research

```json
{
  "scrapeMode": "subreddit",
  "subreddit": "startups",
  "sortBy": "top",
  "timeFilter": "week",
  "maxPosts": 200
}
```

### 2. Sentiment Analysis

```json
{
  "scrapeMode": "comments",
  "postUrl": "https://www.reddit.com/r/technology/comments/...",
  "maxComments": 500
}
```

### 3. Influencer Analysis

```json
{
  "scrapeMode": "user",
  "username": "popular_redditor",
  "maxPosts": 100
}
```

### 4. Content Discovery

```json
{
  "scrapeMode": "trending",
  "sortBy": "hot",
  "maxPosts": 50
}
```
## Local Development

### Prerequisites

- Node.js 18+
- npm or yarn

### Installation
```bash
# Install dependencies
npm install

# Quick test (no Apify needed)
npm test

# Run locally with the Apify CLI
apify run

# Or with Node.js
npm start
```
### Testing with Input

Create `input.json` in the `apify_storage/key_value_stores/default/` folder:
```json
{
  "scrapeMode": "subreddit",
  "subreddit": "webdev",
  "sortBy": "hot",
  "maxPosts": 10
}
```
Then run:

```bash
npm start
```
## Deployment

### Quick Deploy to Apify
```bash
# Install the Apify CLI
npm install -g apify-cli

# Log in to Apify
apify login

# Deploy your actor
apify push
```
**Full deployment guide:** See `./DEPLOYMENT.md` for detailed instructions on:
- Deploying via CLI, GitHub, or Web Interface
- Running scheduled scrapes
- Setting up webhooks and integrations
- Monitoring and troubleshooting
- Cost optimization tips
## Data Structure

### Post Object

- `type`: "post"
- `id`: Reddit post ID
- `title`: Post title
- `author`: Username of the author
- `subreddit`: Subreddit name
- `score`: Upvotes minus downvotes
- `num_comments`: Number of comments
- `url`: Reddit permalink
- `link_url`: External link (if any)
- `selftext`: Post body text
- `created_utc`: ISO timestamp
- `upvote_ratio`: Percentage of upvotes
- `is_video`: Boolean
- `over_18`: NSFW flag
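Note that Reddit's raw API reports `created_utc` as Unix seconds; the ISO timestamps shown here imply a conversion along these lines (illustrative, not necessarily the actor's exact code):

```javascript
// Convert Reddit's Unix-seconds timestamp to an ISO 8601 string.
const toIso = (createdUtc) => new Date(createdUtc * 1000).toISOString();

// e.g. toIso(0) → "1970-01-01T00:00:00.000Z"
```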
### Comment Object

- `type`: "comment"
- `id`: Comment ID
- `author`: Username
- `body`: Comment text
- `score`: Comment score
- `depth`: Nesting level (0 = top-level)
- `created_utc`: ISO timestamp
- `permalink`: Direct link to the comment
- `parent_id`: Parent comment/post ID
### User Profile Object

- `type`: "user_profile"
- `username`: Reddit username
- `link_karma`: Post karma
- `comment_karma`: Comment karma
- `total_karma`: Combined karma
- `created_utc`: Account creation date
- `is_gold`: Reddit Gold status
- `is_mod`: Moderator status
## Configuration

### Rate Limiting
The scraper includes built-in rate limiting (2 seconds between requests) to respect Reddit's guidelines. If rate limited (429 error), it automatically waits 10 seconds and retries.
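The policy described above can be sketched as follows. The helper is illustrative rather than the actor's actual implementation; `fetchFn` is injected so the retry logic can be tested without hitting Reddit:

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Wait `delayMs` before each request; on HTTP 429, wait `retryMs` and retry once.
async function politeFetch(url, fetchFn, { delayMs = 2000, retryMs = 10000 } = {}) {
  await sleep(delayMs);
  let res = await fetchFn(url);
  if (res.status === 429) {
    await sleep(retryMs); // back off before the single retry
    res = await fetchFn(url);
  }
  return res;
}
```

In production you would pass the global `fetch` (available in Node.js 18+) as `fetchFn`.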
### Proxy Support
Use Apify's proxy configuration in the input:
```json
{
  "scrapeMode": "subreddit",
  "subreddit": "example",
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```
## Important Notes

- **Reddit Terms of Service**: Use responsibly and respect Reddit's ToS
- **Rate Limits**: Reddit enforces rate limits; the scraper handles them automatically
- **Authentication**: Works without credentials via Reddit's public JSON API; OAuth is optional
- **Data Volume**: Large scrapes may take time due to rate limiting
- **NSFW Content**: Filter using the `over_18` field if needed
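For example, dropping NSFW items from a batch of scraped posts is a one-liner over the records (field name from the Post Object above; the helper name is illustrative):

```javascript
// Keep only safe-for-work posts using the `over_18` flag on each record.
const sfwOnly = (posts) => posts.filter((post) => !post.over_18);
```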
## Performance
- Speed: ~2-3 seconds per request (due to rate limiting)
- Scale: Can scrape 1000+ posts in a single run
- Memory: Efficient - streams data to dataset
## Contributing
Feel free to submit issues and enhancement requests!
## License
ISC License - feel free to use for personal or commercial projects.
## Tips & Tricks

### Get Multiple Subreddits
Run the actor multiple times with different inputs, or modify the code to accept an array of subreddits.
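If you go the code-modification route, one simple approach is to loop a single-subreddit scrape over an array (a sketch; `scrapeOne` stands in for whatever function performs one subreddit run):

```javascript
// Run a per-subreddit scrape function for each name and merge the results.
// Runs sequentially rather than in parallel, to respect rate limits.
async function scrapeMany(subreddits, scrapeOne) {
  const results = [];
  for (const name of subreddits) {
    results.push(...(await scrapeOne(name)));
  }
  return results;
}
```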
### Export to Google Sheets
Use Apify's integration to automatically export scraped data to Google Sheets.
### Schedule Regular Scrapes
Set up a schedule in Apify to monitor subreddits daily/weekly for trending content.
### Combine with Other Tools
- Export to CSV for Excel analysis
- Use with Python/Pandas for data science
- Feed into AI/ML models for NLP tasks
- Create dashboards with Tableau/Power BI
**Happy Scraping!**
For questions or support, please open an issue in the repository.