
Reddit Advanced Scraper - Any Subreddit (Anti-Rate-Limit)

Powerful Reddit scraper for any subreddit with advanced anti-rate-limit protection. Scrape posts and comments with full configurability; each post is returned as a single output item with its comments attached.

✨ Features

  • 📊 Scrape Any Subreddit - Choose any subreddit (AskReddit, technology, gaming, science, etc.)
  • 💬 Full Comment Threads - Scrape nested comments with configurable depth
  • 🛡️ Anti-Rate-Limit Protection - Automatic session rotation, adaptive delays, user agent switching
  • 🎯 Fully Configurable - Control post count, comment depth, delays, and more
  • 📤 Webhook Support - Send results to your webhook (n8n, Zapier, etc.)
  • 📈 Real-Time Logging - Track progress with detailed logs
  • 🔄 Adaptive Intelligence - Automatically adjusts delays based on rate limiting

🚀 Quick Start

Input Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| subreddit | string | AskReddit | Name of the subreddit (without r/) |
| sort | enum | hot | Sort posts by: hot, new, top, rising |
| maxPosts | integer | 1000 | Maximum posts to scrape (1-10000) |
| maxComments | integer | 50 | Max comments per post (0-500) |
| maxDepth | integer | 5 | Comment nesting depth (0-10) |
| baseDelay | integer | 6 | Base delay between requests in seconds |
| sessionRotationRequests | integer | 50 | Rotate session every N requests |
| webhookUrl | string | null | Optional webhook URL for results |
| adaptiveDelays | boolean | true | Auto-adjust delays based on rate limiting |

Example Configuration

{
  "subreddit": "technology",
  "sort": "hot",
  "maxPosts": 500,
  "maxComments": 100,
  "maxDepth": 5,
  "baseDelay": 6,
  "adaptiveDelays": true
}
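
You can also start runs programmatically with the Apify Python client; a minimal sketch, where `<YOUR_APIFY_TOKEN>` and `<ACTOR_ID>` are placeholders for your own values:

```python
from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")

run_input = {
    "subreddit": "technology",
    "sort": "hot",
    "maxPosts": 500,
    "maxComments": 100,
    "maxDepth": 5,
    "baseDelay": 6,
    "adaptiveDelays": True,
}

# <ACTOR_ID> is a placeholder: copy the real ID from the Actor's Console page.
run = client.actor("<ACTOR_ID>").call(run_input=run_input)

# Results land in the run's default dataset (see Output Format below).
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["title"], item["score"])
```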

📦 Output Format

Each scraped post includes:

{
  "title": "Post title",
  "author": "username",
  "subreddit": "technology",
  "score": 12345,
  "num_comments": 567,
  "created_utc": "2026-01-01T12:00:00Z",
  "url": "https://reddit.com/...",
  "permalink": "https://old.reddit.com/r/technology/...",
  "selftext": "Post body text",
  "is_self": true,
  "comments": [
    {
      "author": "commenter",
      "body": "Comment text",
      "score": 123,
      "depth": 0,
      "replies": [...]
    }
  ],
  "total_comments_scraped": 45
}
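
Since comments arrive as a nested replies tree, downstream analysis often starts by flattening them; a small helper sketch using the fields above:

```python
def flatten_comments(comments, out=None):
    """Walk the nested comments/replies tree into a flat list."""
    if out is None:
        out = []
    for c in comments:
        out.append({k: c[k] for k in ("author", "body", "score", "depth")})
        flatten_comments(c.get("replies", []), out)
    return out

# Tiny sample mirroring the output format above.
post = {"comments": [
    {"author": "commenter", "body": "Comment text", "score": 123, "depth": 0,
     "replies": [{"author": "replier", "body": "A reply", "score": 4,
                  "depth": 1, "replies": []}]},
]}
print(len(flatten_comments(post["comments"])))  # 2
```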

🛡️ Anti-Rate-Limit Features

1. Session Rotation

Automatically creates fresh sessions every 50 requests (configurable) with new identities.
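
The Actor's source isn't shown here, but the mechanism is simple to sketch with requests; the two-entry user-agent pool below is a stand-in for the Actor's larger list (feature 3 below):

```python
import random
import requests

# Hypothetical two-entry pool; the Actor rotates through 10+ agents.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]

def fresh_session():
    """A new session means a new connection pool, cookies, and identity."""
    s = requests.Session()
    s.headers["User-Agent"] = random.choice(USER_AGENTS)
    return s

work_queue = ["https://old.reddit.com/r/AskReddit/"]  # placeholder URL list
session = fresh_session()
for i, url in enumerate(work_queue):
    if i and i % 50 == 0:          # sessionRotationRequests
        session.close()
        session = fresh_session()
    response = session.get(url)
    print(response.status_code)
```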

2. Adaptive Delays

  • Starts with base delay (default 6s)
  • Increases exponentially when rate limited
  • Gradually decreases with successful requests
  • Adds random jitter to appear more human
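
As a sketch of that policy (not the Actor's actual code), the delay controller might look like:

```python
import random
import time

class AdaptiveDelay:
    def __init__(self, base=6.0, ceiling=120.0):
        self.base = base          # baseDelay; never drops below this
        self.delay = base
        self.ceiling = ceiling    # assumed cap; the Actor's limit is unknown

    def wait(self):
        # Random jitter (±20%) so request timing looks less mechanical.
        time.sleep(self.delay * random.uniform(0.8, 1.2))

    def on_rate_limited(self):
        self.delay = min(self.delay * 2, self.ceiling)   # grow exponentially

    def on_success(self):
        self.delay = max(self.delay * 0.9, self.base)    # shrink gradually
```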

3. User Agent Rotation

Rotates through 10+ different user agents:

  • Chrome (Windows, Mac, Linux)
  • Firefox (Windows, Mac, Linux)
  • Safari (Mac)

4. Exponential Backoff

Automatically backs off with increasing delays on failures (2s, 4s, 8s, 16s, and so on).

5. Smart Request Timing

  • Random jitter on all delays (±20%)
  • Configurable base delays
  • Respects Reddit's Retry-After headers
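
Features 4 and 5 combine naturally into a single retry loop. A sketch, assuming Retry-After arrives as a plain number of seconds:

```python
import time
import requests

def get_with_backoff(session, url, max_retries=5):
    """Retry on HTTP 429 with 2s, 4s, 8s, 16s... delays."""
    for attempt in range(max_retries):
        response = session.get(url)
        if response.status_code != 429:
            return response
        # Prefer Reddit's own Retry-After hint over our schedule.
        retry_after = response.headers.get("Retry-After")
        time.sleep(float(retry_after) if retry_after else 2 ** (attempt + 1))
    raise RuntimeError(f"Still rate limited after {max_retries} retries: {url}")
```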

🎯 Use Cases

  • Market Research - Analyze sentiment and trends in specific communities
  • Academic Research - Collect data for social media studies
  • Community Analysis - Understand discussion patterns and popular topics
  • Content Discovery - Find trending posts and discussions
  • Data Collection - Build datasets for ML/AI training
  • Monitoring - Track specific subreddits for mentions or topics

💡 Tips for Best Results

Avoid Rate Limits

  • Use the default base delay of 6 seconds or higher
  • Enable adaptive delays
  • Keep session rotation at 50 requests
  • Don't scrape too many posts at once (stay under 1000)

Get More Data

  • Increase maxComments for deeper discussions
  • Increase maxDepth for nested reply chains
  • Use sort: "new" for latest content
  • Use sort: "top" for the highest-scored posts

Performance

  • Lower maxComments for faster scraping
  • Reduce maxDepth if you don't need deep threads
  • Increase baseDelay if you get rate limited

🔗 Webhook Integration

Send results to your webhook endpoint (n8n, Zapier, Make, etc.):

{
  "webhookUrl": "https://your-webhook.com/endpoint"
}

Webhook payload format:

{
  "timestamp": "2026-01-01T12:00:00Z",
  "scraper_type": "reddit",
  "action": "scrape_complete",
  "count": 500,
  "metadata": {
    "version": "2.0-apify",
    "source": "Apify Actor",
    "scraped_at": "2026-01-01T12:00:00Z",
    "count": 500
  },
  "data": [...]
}
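
To test the integration before wiring up n8n or Zapier, a minimal local receiver is enough; this Flask sketch uses a hypothetical /endpoint route and the field names from the payload above:

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/endpoint", methods=["POST"])
def receive_results():
    payload = request.get_json(force=True)
    # Field names follow the webhook payload format shown above.
    print(f"{payload.get('action')}: {payload.get('count')} posts scraped")
    for post in payload.get("data", []):
        print(post["title"], post["num_comments"])
    return {"ok": True}

if __name__ == "__main__":
    app.run(port=8000)
```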

Deploy to Apify

  1. Log in to Apify:

     $ apify login

  2. Deploy your Actor:

     $ apify push

  3. Find it under Actors -> My Actors.

⚠️ Important Notes

  • Reddit's old.reddit.com interface is used (more scraping-friendly)
  • No API authentication required (uses web scraping)
  • Respects Reddit's rate limits automatically
  • Data is saved to Apify dataset storage
  • Always check Reddit's Terms of Service for your use case

Works with any public subreddit, for example:

  • r/AskReddit - Popular Q&A discussions
  • r/technology - Tech news and discussions
  • r/science - Scientific articles and comments
  • r/gaming - Gaming community discussions
  • r/news - Breaking news and comments

🔧 Technical Details

  • Built with: Python, BeautifulSoup, Requests, Apify SDK
  • Parsing: lxml parser for fast HTML processing
  • Session Management: Connection pooling for efficiency
  • Error Handling: Automatic retries with exponential backoff
  • Logging: Detailed progress tracking via Apify SDK
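
For orientation only, a stripped-down fetch-and-parse step might look like the sketch below; the CSS selectors are assumptions based on old.reddit.com's public markup, not the Actor's actual code:

```python
import requests
from bs4 import BeautifulSoup

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/120.0.0.0 Safari/537.36",
}
response = requests.get("https://old.reddit.com/r/technology/", headers=headers)
soup = BeautifulSoup(response.text, "lxml")

# Assumption: old.reddit.com wraps each post in div.thing, title in a.title.
for thing in soup.select("div.thing"):
    title = thing.select_one("a.title")
    if title:
        print(title.get_text(strip=True))
```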

Version: 2.0

Built with advanced anti-rate-limit technology.