Reddit Scraper - Posts, Comments & Subreddit Analytics

Pricing

from $0.00005 / actor start

Scrape Reddit posts, comments & subreddit analytics via JSON API. No browser, no login, no API key. Structured JSON for AI, research & monitoring. $0.003/result.

Developer: Khadin Akbar (Maintained by Community)

Reddit Posts, Comments & Subreddit Analytics Scraper

Extract Reddit posts, comments, upvotes, author data, and subreddit statistics — all in one actor. Powered by Reddit's public JSON API (no browser, no login, no API key required). The fastest and most reliable Reddit scraper on the Apify Store.

Compatible with: Apify MCP Server (Claude, ChatGPT, Cursor), LangChain, Make.com, Zapier, n8n, and direct REST API.


What does this Reddit scraper do?

This actor connects directly to Reddit's public .json API endpoints — the same data Reddit serves to its own apps. No headless browser is needed, which means it's faster, cheaper, and more reliable than other Reddit scrapers that render full web pages.

In one run you can:

  • Scrape posts from any subreddit (sorted by hot, new, top, or rising)
  • Search all of Reddit by keyword or topic
  • Extract full comment threads including nested replies
  • Pull subreddit analytics: subscriber count, active users, description, and metadata
  • Scrape specific posts by URL with their complete discussion

All three data types (posts, comments, subreddit analytics) are delivered in the same dataset, discriminated by a type field for easy filtering.
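Since all three record types land in one dataset, a single pass over the items is enough to split them by the type discriminator. A minimal Python sketch, using hypothetical records shaped like the actor's documented output:

```python
# Split a mixed dataset by the "type" discriminator field.
# The sample records below are hypothetical, shaped like the actor's output.
records = [
    {"type": "post", "id": "t3_abc123", "title": "Example post"},
    {"type": "comment", "id": "t1_xyz789", "post_id": "t3_abc123"},
    {"type": "comment", "id": "t1_def456", "post_id": "t3_abc123"},
    {"type": "subreddit_analytics", "name": "MachineLearning"},
]

by_type = {}
for record in records:
    # Group each record under its declared type.
    by_type.setdefault(record["type"], []).append(record)

print(len(by_type.get("post", [])))     # post records
print(len(by_type.get("comment", [])))  # comment records
```

The same grouping works whether the items come from a JSON download, the dataset API, or the Python client.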


What data can you extract from Reddit?

Reddit Posts

| Field | Type | Example |
|---|---|---|
| type | string | "post" |
| id | string | "t3_abc123" |
| title | string | "Best Python ML libraries in 2026?" |
| body | string or null | "I've been using scikit-learn..." |
| author | string | "u/john_doe" |
| subreddit | string | "r/MachineLearning" |
| subreddit_subscribers | integer | 2840000 |
| score | integer | 1482 |
| upvote_ratio | number | 0.94 |
| num_comments | integer | 73 |
| url | string | "https://reddit.com/r/..." |
| external_url | string or null | "https://arxiv.org/abs/..." |
| flair | string or null | "Research Paper" |
| awards_count | integer | 5 |
| post_type | string | "text" / "link" / "image" / "video" |
| is_nsfw | boolean | false |
| is_stickied | boolean | false |
| created_at | ISO datetime | "2026-03-20T14:22:00.000Z" |

Reddit Comments

| Field | Type | Example |
|---|---|---|
| type | string | "comment" |
| id | string | "t1_xyz789" |
| post_id | string | "t3_abc123" |
| parent_id | string | "t3_abc123" |
| author | string | "u/jane_smith" |
| body | string or null | "I'd also recommend PyTorch..." |
| score | integer | 342 |
| depth | integer | 0 (top-level), 1, 2 (nested) |
| is_submitter | boolean | false |
| awards_count | integer | 2 |
| created_at | ISO datetime | "2026-03-20T15:10:00.000Z" |

Subreddit Analytics

| Field | Type | Example |
|---|---|---|
| type | string | "subreddit_analytics" |
| name | string | "MachineLearning" |
| title | string | "Machine Learning" |
| description | string or null | "A subreddit devoted to..." |
| subscribers | integer | 2840000 |
| active_users | integer or null | null (Reddit API no longer exposes this reliably) |
| subreddit_type | string | "public" |
| nsfw | boolean | false |
| created_at | ISO datetime | "2010-08-17T12:00:00.000Z" |
| icon_url | string or null | "https://styles.redditmedia.com/..." |

How to scrape Reddit posts and comments

Option 1: Apify Console (no code)

  1. Click "Try for free" to open the actor
  2. Choose your mode:
    • Subreddit URLs: paste one or more subreddit URLs (e.g., https://reddit.com/r/MachineLearning)
    • Search query: type a keyword to search all of Reddit
    • Post URLs: paste direct links to specific posts
  3. Set max posts and max comments per post
  4. Choose sort order (hot, new, top, rising)
  5. Click Start and download results as JSON, CSV, or Excel

Option 2: Using with AI Agents (Claude, ChatGPT)

Connect via the Apify MCP Server and ask naturally:

"Find 100 top posts from r/MachineLearning this week with all their comments"

"Search Reddit for posts about GPT-5 and get the top 50 results"

"Get subreddit analytics for r/programming and r/webdev"

"Scrape this Reddit thread: reddit.com/r/..."

The AI agent automatically maps your request to the correct input parameters and runs the actor.

Option 3: REST API

```bash
# Start a run
curl -X POST "https://api.apify.com/v2/acts/khadinakbar~reddit-posts-comments-scraper/runs" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "subredditUrls": ["https://reddit.com/r/MachineLearning"],
    "maxPosts": 50,
    "sortBy": "top",
    "timeFilter": "week",
    "includeComments": true,
    "maxCommentsPerPost": 50
  }'
```

Option 4: JavaScript / Node.js

```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('khadinakbar/reddit-posts-comments-scraper').call({
  subredditUrls: ['https://reddit.com/r/MachineLearning'],
  maxPosts: 50,
  sortBy: 'top',
  timeFilter: 'week',
  includeComments: true,
  maxCommentsPerPost: 50,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
const posts = items.filter(item => item.type === 'post');
const comments = items.filter(item => item.type === 'comment');
console.log(`Got ${posts.length} posts and ${comments.length} comments`);
```

Option 5: Python

```python
from apify_client import ApifyClient

client = ApifyClient('YOUR_API_TOKEN')

run = client.actor('khadinakbar/reddit-posts-comments-scraper').call(
    run_input={
        'searchQuery': 'artificial intelligence regulation',
        'maxPosts': 100,
        'sortBy': 'relevance',
        'timeFilter': 'month',
        'includeComments': True,
        'maxCommentsPerPost': 30,
    }
)

items = list(client.dataset(run['defaultDatasetId']).iterate_items())
posts = [i for i in items if i['type'] == 'post']
print(f"Found {len(posts)} posts")
```

How to use Reddit scraper for common tasks

Brand monitoring and social listening

Search Reddit for your brand name, product name, or competitor keywords. Extract the posts with highest scores and most comments to find what the community is saying.

```json
{
  "searchQuery": "your product name",
  "maxPosts": 200,
  "sortBy": "relevance",
  "timeFilter": "month",
  "includeComments": true,
  "maxCommentsPerPost": 50
}
```
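Once a run like this finishes, a quick way to surface the loudest threads is to rank the scraped posts by engagement. A minimal Python sketch with hypothetical records and an illustrative weighting (the weighting is ours, not part of the actor):

```python
# Rank scraped posts by a simple engagement metric.
# Sample records are hypothetical, shaped like the actor's post output.
posts = [
    {"title": "Great product!", "score": 120, "num_comments": 45},
    {"title": "Mixed feelings", "score": 310, "num_comments": 12},
    {"title": "Support issue", "score": 80, "num_comments": 200},
]

def engagement(post):
    # Illustrative weighting: comments often signal discussion depth,
    # so weight them double relative to raw upvote score.
    return post["score"] + 2 * post["num_comments"]

top = sorted(posts, key=engagement, reverse=True)
for p in top:
    print(p["title"], engagement(p))
```

Swapping in `upvote_ratio` or a recency decay is a one-line change to `engagement`.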

Market research and sentiment analysis

Scrape posts from relevant subreddits to understand user pain points, feature requests, and community opinions. The score and upvote_ratio fields let you surface the most resonant content.

Content research and trend discovery

Sort by top with timeFilter: "week" or "month" across niche subreddits to find what content performs best in your industry.

Academic and data science research

Use the dataset for NLP projects, sentiment classification, or social network analysis. The nested comment tree is flattened with depth and parent_id fields for easy graph reconstruction.
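The flattened comment records can be turned back into a nested thread by grouping on parent_id. A minimal Python sketch, with hypothetical records matching the documented field shape:

```python
# Rebuild the nested comment tree from flattened records using parent_id.
# Sample records are hypothetical but match the documented comment fields.
comments = [
    {"id": "t1_a", "parent_id": "t3_post", "depth": 0, "body": "top-level"},
    {"id": "t1_b", "parent_id": "t1_a", "depth": 1, "body": "reply"},
    {"id": "t1_c", "parent_id": "t1_a", "depth": 1, "body": "another reply"},
]

# Index children by their parent: post IDs (t3_...) parent top-level
# comments, comment IDs (t1_...) parent nested replies.
children = {}
for c in comments:
    children.setdefault(c["parent_id"], []).append(c)

def print_tree(parent_id, indent=0):
    # Walk the tree depth-first, indenting by nesting level.
    for c in children.get(parent_id, []):
        print(" " * indent + c["body"])
        print_tree(c["id"], indent + 2)

print_tree("t3_post")
```

The same `children` index feeds directly into graph libraries for social network analysis.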

Competitive analysis

Pull subreddit analytics for competitor-adjacent communities to benchmark audience size, growth trends, and engagement patterns.


How much does Reddit scraping cost?

Pricing is pay-per-result — you only pay for actual data extracted. No monthly fees, no minimum commitment.

| Data type | Price per result |
|---|---|
| Reddit post | $0.003 |
| Reddit comment | $0.003 |
| Subreddit analytics | $0.003 |

Example: Scraping 100 posts with 50 comments each = 100 + 5,000 = 5,100 results = $15.30 total.

Apify Free Tier includes $5 of monthly credits — enough to extract ~1,650 Reddit results for free every month.
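The arithmetic above generalizes to a one-line estimator. A sketch in Python (the $0.003 rate comes from the pricing table; the function name is ours):

```python
# Estimate run cost at $0.003 per result (posts, comments, and
# analytics records all bill at the same rate).
PRICE_PER_RESULT = 0.003

def estimate_cost(posts, comments_per_post, analytics_records=0):
    # Total results = posts + all their comments + any analytics records.
    results = posts + posts * comments_per_post + analytics_records
    return results * PRICE_PER_RESULT

# The worked example above: 100 posts x 50 comments each = 5,100 results.
print(round(estimate_cost(100, 50), 2))  # 15.3
```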


Why this Reddit scraper vs. alternatives

Uses Reddit's JSON API (not a headless browser)

Reddit exposes public .json endpoints for every page — the same data their own apps consume. This actor hits those endpoints directly, which means:

  • 99%+ success rate — no JavaScript rendering failures, no CSS selector breakage
  • 3-5x faster than browser-based scrapers
  • Residential proxy built-in — Reddit blocks raw datacenter IPs; this actor routes through Apify's residential proxy pool automatically so you don't have to configure anything
  • More data fields — the API returns structured data that's cleaner than HTML-parsed content

Single actor, all Reddit data

Most Reddit scrapers handle posts OR comments, not both, and few include subreddit analytics. This actor returns all three in a single run, discriminated by a type field.

Fully MCP-optimized

Field names, descriptions, and output schema are designed for AI agent consumption via the Apify MCP Server. Claude, ChatGPT, and other LLMs can call this actor from natural language instructions without any additional configuration.


Integrations

  • Apify MCP Server — call from Claude, ChatGPT, Cursor, and any MCP-compatible AI agent
  • Make.com / Zapier / n8n — trigger runs and send results to Google Sheets, Slack, CRMs, and 1,000+ apps
  • LangChain — use as a tool in LangChain pipelines for RAG and AI workflows
  • Direct API — REST API with JSON/CSV/Excel output for any programming language

FAQ

Q: Do I need a Reddit account or API key?
A: No. This actor uses Reddit's public JSON API, which doesn't require authentication for public subreddit data.

Q: How many posts can I scrape?
A: Up to 10,000 posts per run. Reddit's public API has a practical limit of ~1,000 posts per subreddit listing (the same limit as the old Reddit API), but for search queries and date-windowed scraping you can go much further.

Q: Can I scrape private or restricted subreddits?
A: Only public subreddits are accessible without authentication. Private and restricted subreddits will return no data.

Q: Is this legal?
A: This actor only accesses publicly available Reddit data that anyone can view in a browser. See Apify's guide to web scraping legality. Reddit's public JSON API is the same data source their own apps use.

Q: How fast is the scraper?
A: Approximately 300-500 posts per minute and 200-400 comments per minute, depending on Reddit's response times. The actor respects Reddit's rate limits (60 requests/minute) automatically.

Q: Can I schedule recurring runs?
A: Yes. Use Apify's built-in Scheduler to run daily, weekly, or on any cron interval. Results can be sent to webhooks, email, or connected apps automatically.

Q: What happens if the scrape hits Reddit's rate limit?
A: The actor automatically retries with exponential backoff. Failed requests are logged so you can see what wasn't fetched.
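The retry behavior described above can be sketched as a generic exponential-backoff loop. This is an illustration of the technique, not the actor's actual implementation; the function name and error type are stand-ins:

```python
import time

# Illustrative exponential-backoff retry loop: wait 1s, 2s, 4s, ... between
# attempts. Not the actor's actual code; RuntimeError stands in for a
# rate-limit error (e.g. HTTP 429).
def fetch_with_backoff(fetch, max_retries=5, base_delay=1.0):
    for attempt in range(max_retries):
        try:
            return fetch()
        except RuntimeError:
            delay = base_delay * (2 ** attempt)  # doubles each attempt
            time.sleep(delay)
    raise RuntimeError("request failed after retries")
```

Any callable that raises on a rate-limit response can be wrapped this way.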

Q: How do I filter posts vs. comments in the output?
A: Every record has a type field. Filter on type === 'post', type === 'comment', or type === 'subreddit_analytics' in your downstream processing, or use Apify's dataset filtering API.