Pricing

from $3.00 / 1,000 reddit posts

Reddit Intelligence Scraper

Reddit is one of the largest real-time sources of consumer opinions, trends, and product feedback. Reddit Intelligence Scraper is an Apify Actor that scrapes subreddits, search results, and user profiles and turns them into structured data for business, research, and growth analysis.

Developer

charith wijesundara

Last modified

2 days ago

Features

Multi-source scraping: Subreddits, search results, and user profiles
Full comment trees: Extract nested comment threads
AI-powered analysis: Sentiment, topic extraction, entity recognition (via LlamaIndex with OpenAI, Gemini, or Anthropic models)
Anti-ban system: Proxy rotation, session pooling, CAPTCHA detection, human-like delays
Playwright support: JavaScript rendering for dynamic content
LangGraph orchestration: Agentic crawl decision-making

Input Schema

{
  "subreddits": ["entrepreneur", "startups"],
  "keywords": ["stripe", "shopify", "saas"],
  "users": ["spez"],
  "maxPosts": 100,
  "maxCommentsPerPost": 50,
  "sort": "hot",
  "time": "week",
  "minScore": 10,
  "includeNSFW": false,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  },
  "enablePlaywright": true,
  "enableAI": false,
  "openaiApiKey": ""
}
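
Before starting a run, it can help to check the input locally. The sketch below is a hypothetical pre-flight check, not part of the Actor: the function name `check_input` and the rules it enforces (at least one source, non-negative limits, an API key when `enableAI` is on) are assumptions inferred from the example above. The Actor's own input schema remains the authority.

```python
# Hypothetical pre-flight check for the run input shown above.
# The rules are assumptions inferred from the example, not the Actor's schema.

SOURCE_FIELDS = ("subreddits", "keywords", "users")
LIMIT_FIELDS = ("maxPosts", "maxCommentsPerPost")
API_KEY_FIELDS = ("openaiApiKey", "geminiApiKey", "anthropicApiKey")


def check_input(run_input):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []

    # Assumption: a run needs at least one subreddit, keyword, or user.
    if not any(run_input.get(field) for field in SOURCE_FIELDS):
        problems.append("provide at least one of: " + ", ".join(SOURCE_FIELDS))

    # Assumption: limits, when present, are non-negative integers.
    for field in LIMIT_FIELDS:
        value = run_input.get(field)
        if value is None:
            continue
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or value < 0:
            problems.append(f"{field} must be a non-negative integer")

    # Assumption: AI processing needs a non-empty key for some provider.
    wants_ai = run_input.get("enableAI")
    if wants_ai and not any(run_input.get(key) for key in API_KEY_FIELDS):
        problems.append("enableAI is true but no API key is set")

    return problems


example = {
    "subreddits": ["entrepreneur", "startups"],
    "maxPosts": 100,
    "maxCommentsPerPost": 50,
    "enableAI": False,
    "openaiApiKey": "",
}
print(check_input(example))
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a single pass.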

Output Format

Each scraped post produces a dataset item:

{
  "type": "reddit_post",
  "post": {
    "post_id": "abc123",
    "subreddit": "startups",
    "title": "Post title",
    "body": "Post content...",
    "author": "username",
    "score": 150,
    "upvote_ratio": 0.95,
    "num_comments": 42,
    "awards": ["Gold"],
    "flair": "Discussion",
    "created_utc": "2024-01-15T10:30:00Z",
    "post_age_hours": 24.5,
    "url": "https://...",
    "permalink": "/r/startups/comments/...",
    "is_nsfw": false,
    "is_locked": false,
    "is_archived": false
  },
  "comments": [
    {
      "comment_id": "xyz789",
      "parent_id": "abc123",
      "author": "commenter",
      "body": "Comment text...",
      "score": 25,
      "depth": 0,
      "created_utc": "2024-01-15T11:00:00Z",
      "is_op": false,
      "is_deleted": false
    }
  ],
  "ai": {
    "sentiment": 0.75,
    "topics": ["entrepreneurship", "funding", "product-market-fit"],
    "entities": ["Stripe", "Y Combinator", "Series A"]
  },
  "scraped_at": "2024-01-15T12:00:00Z",
  "source": "reddit"
}
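
The `comments` array is flat, with `parent_id` and `depth` describing the nesting. A minimal sketch of rebuilding the thread from one dataset item follows. It assumes that a top-level comment's `parent_id` equals the post's `post_id` (as in the sample above) and that a reply's `parent_id` equals its parent's `comment_id`; the second point is not confirmed by the sample. The function name `build_comment_tree` and the `replies` key are illustrative.

```python
from collections import defaultdict


def build_comment_tree(item):
    """Nest the flat `comments` list of one dataset item by `parent_id`."""
    post_id = item["post"]["post_id"]

    # Group comments under the ID of whatever they reply to.
    children = defaultdict(list)
    for comment in item.get("comments", []):
        children[comment["parent_id"]].append(comment)

    def attach(parent_id):
        # Copy each comment and add its (possibly empty) list of replies.
        return [
            {**comment, "replies": attach(comment["comment_id"])}
            for comment in children.get(parent_id, [])
        ]

    # Assumption: top-level comments point at the post itself.
    return attach(post_id)


sample_item = {
    "post": {"post_id": "abc123"},
    "comments": [
        {"comment_id": "c1", "parent_id": "abc123", "depth": 0},
        {"comment_id": "c2", "parent_id": "c1", "depth": 1},
        {"comment_id": "c3", "parent_id": "abc123", "depth": 0},
    ],
}
tree = build_comment_tree(sample_item)
print([c["comment_id"] for c in tree])
```

Comments whose parent is missing from the list (for example, cut off by `maxCommentsPerPost`) are simply left out of the tree in this sketch.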

Local Development

# Install dependencies
pip install -r requirements.txt

# Install Playwright browsers
playwright install chromium

# Run locally
apify run

Deployment

# Login to Apify
apify login

# Deploy to Apify platform
apify push
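
Once deployed, the Actor can also be started from Python. The sketch below assumes the `apify-client` package is installed (`pip install apify-client`) and that an API token is available in an `APIFY_TOKEN` environment variable. The Actor ID is a placeholder, since this page does not state it, and the helper names `run_scraper` and `top_posts` are illustrative.

```python
import os

# Placeholder: replace with the Actor ID shown in Apify Console.
ACTOR_ID = "<username>/<actor-name>"


def run_scraper(run_input, token=None, actor_id=ACTOR_ID):
    """Start the Actor, wait for it to finish, and return its dataset items."""
    # Imported lazily so the rest of this module works without the client.
    from apify_client import ApifyClient

    client = ApifyClient(token or os.environ["APIFY_TOKEN"])
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def top_posts(items, n=5):
    """Return the n highest-scoring posts from a list of dataset items."""
    posts = [item["post"] for item in items if item.get("type") == "reddit_post"]
    return sorted(posts, key=lambda post: post["score"], reverse=True)[:n]
```

A typical call would be `top_posts(run_scraper({"subreddits": ["startups"], "maxPosts": 100}))`, which needs network access and a valid token.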

Configuration

Proxy Settings

Residential proxies are strongly recommended for Reddit scraping:

{
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}

AI Processing

The actor supports three AI providers for sentiment analysis, topic extraction, and entity recognition:

Provider   Models                               Default
OpenAI     gpt-4o, gpt-4o-mini, gpt-3.5-turbo   gpt-4o
Gemini     gemini-2.0-flash, gemini-1.5-pro     gemini-2.0-flash
Anthropic  claude-3-5-sonnet, claude-3-5-haiku  claude-3-5-sonnet

Configuration Example:

{
  "enableAI": true,
  "aiProvider": "openai",
  "aiModel": "gpt-4o",
  "openaiApiKey": "sk-..."
}

For Gemini:

{
  "enableAI": true,
  "aiProvider": "gemini",
  "geminiApiKey": "AIza..."
}

For Anthropic:

{
  "enableAI": true,
  "aiProvider": "anthropic",
  "anthropicApiKey": "sk-ant-..."
}
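
The three provider examples differ only in the provider name, the model, and the key field. As a sketch, a small hypothetical helper (`build_ai_config`, not part of the Actor) can assemble the fragment from the provider table. It always sets `aiModel` explicitly, which assumes the field is accepted for every provider, and its model check is a local guard only; whether the Actor itself rejects unlisted models is not stated here.

```python
# Provider data copied from the table above; the first model is the default.
AI_PROVIDERS = {
    "openai": {
        "key_field": "openaiApiKey",
        "models": ("gpt-4o", "gpt-4o-mini", "gpt-3.5-turbo"),
    },
    "gemini": {
        "key_field": "geminiApiKey",
        "models": ("gemini-2.0-flash", "gemini-1.5-pro"),
    },
    "anthropic": {
        "key_field": "anthropicApiKey",
        "models": ("claude-3-5-sonnet", "claude-3-5-haiku"),
    },
}


def build_ai_config(provider, api_key, model=None):
    """Return the AI part of the Actor input for one provider."""
    spec = AI_PROVIDERS.get(provider)
    if spec is None:
        raise ValueError(f"unknown provider: {provider!r}")
    if not api_key:
        raise ValueError("api_key must not be empty")

    # Fall back to the documented default when no model is given.
    chosen = model or spec["models"][0]
    if chosen not in spec["models"]:
        raise ValueError(f"{chosen!r} is not listed for {provider}")

    return {
        "enableAI": True,
        "aiProvider": provider,
        "aiModel": chosen,
        spec["key_field"]: api_key,
    }


print(build_ai_config("gemini", "AIza..."))
```

The returned dictionary can be merged into the rest of the run input, for example with `{**base_input, **build_ai_config("openai", key)}`.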

Architecture

src/
├── __init__.py          # Package init
├── __main__.py          # Entry point
├── main.py              # Actor initialization and orchestration
├── items.py             # Scrapy item definitions
├── middlewares.py       # Anti-ban middlewares
├── pipelines.py         # Data processing pipelines
├── settings.py          # Scrapy configuration
├── agents/              # LangGraph orchestration
│   ├── __init__.py
│   └── graph.py
├── ai/                  # LlamaIndex AI processing
│   ├── __init__.py
│   └── processor.py
└── spiders/             # Scrapy spiders
    ├── __init__.py
    └── reddit_spider.py
