Reddit Intelligence Scraper (Pay per Event)

Scrape Reddit posts, full comment trees, user profiles, and search results. Features subreddit monitoring with webhook alerts, batch comparison across multiple subreddits, and AI-native markdown output ready for LLM pipelines and vector databases.

Pricing: from $3.00 / 1,000 results
Developer: Eimantas V (Maintained by Community)

Reddit Intelligence Scraper

Extract posts, full comment trees, user profiles, search results, and trending topics from Reddit — with AI-native structured output designed to drop directly into LLM pipelines, vector databases, and RAG systems without preprocessing.

🚀 What Can This Reddit Scraper Extract?

| Data Type | Fields Extracted |
| --- | --- |
| Posts | Title, body (markdown + plain text), score, upvote ratio, awards, flair, author, timestamps, crosspost data |
| Comments | Full nested tree (all depths), per-comment score, author, edited flag, reply count |
| Users | Karma breakdown, account age, post/comment history, profile bio |
| Search Results | Full-text Reddit search with subreddit filtering, sorting, and time windows |
| Subreddit Metadata | Subscriber count, active users, description, creation date, icons |
| Batch Comparison | Side-by-side stats for 10+ subreddits in a single run |

✨ Key Features

  • 🔄 Subreddit monitoring mode — Poll any subreddit for new posts matching keyword filters and deliver alerts via webhook in real time
  • 🌲 Full comment tree traversal — Not just top-level comments. Fetches deeply nested replies via Reddit's morechildren API, up to configurable depth
  • 🤖 AI-native output — Every result includes a _markdown_document field: a clean, structured markdown document ready for LLM context windows or vector embedding
  • 📊 Batch subreddit comparison — Pull top posts from up to 20 subreddits in one run with aggregated stats — ideal for market research and competitive analysis
  • ⚡ Reliable session rotation — Rotates User-Agents, respects X-Ratelimit-* headers, and uses exponential backoff — the #1 failure mode for Reddit scrapers, solved
  • 🔍 Advanced filtering — Filter by flair, keyword, score threshold, date range, NSFW flag, and sort order (hot/new/top/rising)
  • 📋 Schema-versioned output — Every item carries _schema: "reddit-intelligence/v1" so your pipeline always knows what it's getting
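The schema tag makes version handling explicit in downstream code: a pipeline can refuse items it does not recognize instead of silently mis-parsing them. A minimal sketch (the handler body is illustrative, not part of the actor):

```python
def handle_item(item: dict) -> dict:
    """Route a scraped item by its _schema tag so the pipeline fails
    loudly if the actor ever ships a new schema version."""
    schema = item.get("_schema", "")
    if schema == "reddit-intelligence/v1":
        # Known layout: pick out the fields this pipeline cares about.
        return {"id": item["id"], "title": item.get("title"), "score": item.get("score")}
    raise ValueError(f"Unknown schema: {schema!r}")

item = {"_schema": "reddit-intelligence/v1", "id": "abc123", "title": "Example", "score": 10}
print(handle_item(item)["id"])  # abc123
```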

📖 How to Use the Reddit Intelligence Scraper

Step 1 — Choose a mode

| Mode | What it does |
| --- | --- |
| `subreddit` | Scrape posts from one or more subreddits |
| `post` | Scrape a specific post URL with all comments |
| `user` | Scrape a user's profile, posts, and comment history |
| `search` | Full-text Reddit search with filters |
| `batch` | Compare top posts across multiple subreddits |
| `monitor` | Watch subreddits for new posts and deliver webhook alerts |

Step 2 — Configure your run

Scrape the top posts from r/MachineLearning this week:

```json
{
  "mode": "subreddit",
  "subreddits": ["MachineLearning"],
  "sortBy": "top",
  "timeFilter": "week",
  "maxPostsPerSubreddit": 50,
  "includeComments": true,
  "maxCommentsPerPost": 100,
  "outputFormat": "both"
}
```

Compare 5 subreddits for market research:

```json
{
  "mode": "batch",
  "subreddits": ["entrepreneur", "startups", "SaaS", "indiehackers", "smallbusiness"],
  "sortBy": "top",
  "timeFilter": "month",
  "maxPostsPerSubreddit": 10
}
```

Monitor r/ArtificialIntelligence for mentions of "GPT" and alert via webhook:

```json
{
  "mode": "monitor",
  "subreddits": ["ArtificialIntelligence"],
  "keywordFilter": ["GPT", "Claude", "Gemini", "LLM"],
  "monitoringInterval": 5,
  "webhookUrl": "https://your-server.com/webhooks/reddit"
}
```

Scrape a specific post with full comment tree:

```json
{
  "mode": "post",
  "postUrls": ["https://www.reddit.com/r/MachineLearning/comments/abc123/example_post/"],
  "maxCommentsPerPost": 500,
  "commentDepth": 10
}
```

Step 3 — Use the output

Every post result includes a ready-to-use markdown document in the _markdown_document field:

```markdown
# Why GPT-4 is changing enterprise software

**r/MachineLearning** | Score: **4,231** (96% upvoted) | Comments: **312**
Author: u/ml_researcher | Posted: 2024-03-15T14:22:00Z

## Post Content

The shift from rule-based to generative AI...

## Top Comments

### u/ai_engineer (Score: 847)
This is exactly what we're seeing in production...

> #### u/skeptic99 (Score: 234)
> Worth noting the cost implications here...
```

Paste this directly into your LLM prompt or chunk it for RAG.


💰 How Much Does It Cost to Scrape Reddit?

Reddit Intelligence Scraper is priced per result (pay-per-event):

| Task | Approximate Cost |
| --- | --- |
| 1,000 posts (metadata only) | ~$3.00 |
| 1,000 posts with 100 comments each | ~$3.00 |
| User profile (1 user, 25 posts) | ~$0.12 |
| Batch comparison (10 subs × 10 posts) | ~$0.30 |
| Monitor run (24h, low-traffic sub) | ~$1.50–6.00 |

Pricing: $3.00 per 1,000 results. Each post, comment thread, user profile, or search result page counts as one result.

Tip: Disable `includeComments` and set `outputFormat: "json"` for faster runs when you only need post metadata.
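Because pricing is a flat per-result rate, cost estimates are simple multiplication. A quick sketch, with the unit price taken from the pricing above:

```python
PRICE_PER_RESULT = 3.00 / 1000  # $3.00 per 1,000 results

def estimate_cost(num_results: int) -> float:
    """Estimated USD charge for a run producing num_results results."""
    return round(num_results * PRICE_PER_RESULT, 2)

print(estimate_cost(1000))  # 3.0
print(estimate_cost(100))   # 0.3 (e.g. 10 subreddits x 10 posts)
```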


📤 Output Format

Post object (JSON)

```json
{
  "_schema": "reddit-intelligence/v1",
  "_scraped_at": "2024-03-15T14:30:00.000Z",
  "type": "post",
  "id": "abc123",
  "url": "https://www.reddit.com/r/MachineLearning/comments/abc123/...",
  "subreddit": "MachineLearning",
  "title": "Why GPT-4 is changing enterprise software",
  "body_markdown": "The shift from rule-based to generative AI...",
  "body_text": "The shift from rule-based to generative AI...",
  "score": 4231,
  "upvote_ratio": 0.96,
  "num_comments": 312,
  "total_awards_received": 7,
  "flair_text": "Discussion",
  "author": "ml_researcher",
  "created_utc": "2024-03-15T14:22:00.000Z",
  "comments": [...],
  "subreddit_meta": {
    "subscribers": 2800000,
    "active_user_count": 4200,
    ...
  },
  "_markdown_document": "# Why GPT-4 is changing..."
}
```
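Because `comments` is a nested tree, loading it into a flat store (a SQL table, a CSV) usually means flattening it first. A sketch, assuming each comment dict nests its children under a `replies` key (that key name is an assumption, not confirmed by the schema above):

```python
def flatten_comments(comments: list, depth: int = 0) -> list:
    """Depth-first flatten of a nested comment tree into per-comment rows,
    recording nesting depth. Assumes children live under a `replies` key."""
    rows = []
    for c in comments:
        rows.append({"depth": depth, "author": c.get("author"), "score": c.get("score")})
        rows.extend(flatten_comments(c.get("replies", []), depth + 1))
    return rows

tree = [{"author": "ai_engineer", "score": 847,
         "replies": [{"author": "skeptic99", "score": 234, "replies": []}]}]
print(flatten_comments(tree))
```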

Webhook payload (monitor mode)

```json
{
  "event": "keyword_match",
  "timestamp": "2024-03-15T14:35:00.000Z",
  "subreddit": "ArtificialIntelligence",
  "matched_keywords": ["GPT", "LLM"],
  "post": {
    "id": "xyz789",
    "title": "New GPT-4 benchmark results are wild",
    "url": "https://www.reddit.com/r/...",
    "score": 142,
    "body_preview": "Just ran the full MMLU suite..."
  }
}
```
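On the receiving side, a webhook handler only has to parse this JSON and branch on `event`. A minimal sketch of that parsing step (server framework and endpoint are up to you):

```python
import json

def summarize_alert(raw: bytes) -> str:
    """Turn a monitor-mode webhook body into a one-line alert summary."""
    payload = json.loads(raw)
    if payload.get("event") != "keyword_match":
        return "ignored"
    post = payload["post"]
    keywords = ", ".join(payload.get("matched_keywords", []))
    return f"[r/{payload['subreddit']}] {post['title']} (score {post['score']}; matched: {keywords})"

raw = b'{"event": "keyword_match", "subreddit": "ArtificialIntelligence", "matched_keywords": ["GPT"], "post": {"title": "New GPT-4 benchmark results are wild", "score": 142}}'
print(summarize_alert(raw))
```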

🤔 Frequently Asked Questions

Is it legal to scrape Reddit?

Reddit's public data is accessible without authentication. This actor only scrapes publicly available content — the same data accessible in a browser without logging in. Always review Reddit's Terms of Service and ensure your use complies with applicable laws. This tool is intended for research, analytics, and AI training use cases.

Why is the actor not returning all 1000 posts I requested?

Reddit's listing endpoints cap out at roughly 1,000 items per sort order, and the API occasionally returns fewer results than requested; this is a Reddit limitation, not an actor bug. For better coverage, keep maxPostsPerSubreddit high and run additional passes with sortBy: "new" or different timeFilter values.

What's the difference between outputFormat: "json", "markdown", and "both"?

  • json — returns the full structured JSON object, ideal for data pipelines and databases
  • markdown — returns only the _markdown_document field (the AI-ready version), minimal storage
  • both — returns full JSON and the markdown document in every result

Can I use this to feed Reddit data into a vector database?

Yes — this is a primary use case. Use outputFormat: "markdown", split on ## Top Comments to get post and comment chunks, and embed each chunk separately.
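That split step can be sketched directly, assuming the `_markdown_document` layout shown earlier (post content first, then a `## Top Comments` section with one `### u/...` heading per top-level comment):

```python
import re

def chunk_markdown(doc: str) -> list:
    """Split a _markdown_document into one post chunk plus one chunk per
    top-level comment, each ready to embed separately."""
    post, _, comments = doc.partition("## Top Comments")
    chunks = [post.strip()]
    if comments:
        # Each top-level comment opens with a "### u/..." heading.
        chunks += [c.strip() for c in re.split(r"\n(?=### u/)", comments) if c.strip()]
    return chunks

doc = ("# Title\nPost body...\n## Top Comments\n"
       "### u/alice (Score: 10)\nGreat post.\n"
       "### u/bob (Score: 5)\nAgreed.")
print(len(chunk_markdown(doc)))  # 3
```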

Does it handle private or restricted subreddits?

No. This actor only accesses public Reddit content. Private subreddits require OAuth authentication with approved account credentials.

How does the monitoring mode work exactly?

On first run, the actor seeds its state with the current latest posts (no webhook fires). On subsequent polling cycles (default: every 5 minutes), any new post that matches your filters triggers a dataset push and optionally a webhook. State is persisted in Apify Key-Value Store so it survives between runs.
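The seed-then-diff cycle reduces to set arithmetic over post IDs. A sketch with the persisted state modeled as a plain Python set (the actor keeps it in the Key-Value Store):

```python
def diff_new_posts(posts: list, seen_ids: set, keywords: list):
    """Return (matching new posts, updated seen-ID set). On the seeding
    run, pass an empty set and discard the matches, as the actor does."""
    matches = [
        p for p in posts
        if p["id"] not in seen_ids
        and any(k.lower() in p["title"].lower() for k in keywords)
    ]
    return matches, seen_ids | {p["id"] for p in posts}

posts = [{"id": "a1", "title": "New GPT-4 benchmark"}, {"id": "b2", "title": "Weekly thread"}]
matches, seen = diff_new_posts(posts, set(), ["GPT"])
print([p["id"] for p in matches])  # ['a1']
```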

Why use a proxy?

Without a proxy, repeated scraping from a single IP can trigger Reddit's rate limiting (HTTP 429). Apify's residential proxy pool rotates IPs automatically, making your scraper much more reliable at scale.
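The backoff behavior described in the features (honor rate-limit headers when present, otherwise back off exponentially) boils down to a small delay calculation. A sketch of just that calculation:

```python
def retry_delay(attempt: int, retry_after: float = None, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based). A
    server-provided Retry-After value wins; otherwise exponential
    backoff, capped so delays never grow unbounded."""
    if retry_after is not None:
        return min(retry_after, cap)
    return min(2.0 ** attempt, cap)

print([retry_delay(a) for a in range(4)])  # [1.0, 2.0, 4.0, 8.0]
print(retry_delay(2, retry_after=30))      # 30
```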


🔗 Related Actors

  • Web Scraper — General-purpose web scraping
  • Twitter Scraper — Social media monitoring on X/Twitter
  • YouTube Scraper — Video and comment data from YouTube

📬 Support & Feedback

Found a bug or have a feature request? Open an issue or contact us through the Apify platform. We monitor this actor actively and publish updates regularly.