Reddit Scraper — Posts & Comments | from $1.50/1K

Scrape Reddit posts, comments, and user activity from any public subreddit. Returns 25+ fields: score, upvote ratio, flair, author, timestamps, parse_confidence. No API key needed — backed by Arctic Shift archive with unlimited historical depth. MCP-callable.

Pricing: Pay per usage
Rating: 0.0 (0)
Developer: Vitalii Bondarev (maintained by Community)
Actor stats: 0 bookmarked · 5 total users · 4 monthly active users · last modified 3 days ago

Built for brand monitors, growth researchers, and AI agent pipelines that need Reddit data at scale without OAuth limits.

Pricing: $1.50 per 1,000 posts · $0.50 per 1,000 comments (when includeComments=true)

Reddit Scraper lets you scrape Reddit posts, comments, and user activity from any public subreddit, with no API key, no OAuth, and no proxy required. It returns 25+ fields per record, including score, upvote ratio, flair, author, and timestamps. It is backed by the Arctic Shift Reddit archive, whose historical data goes back years, without the 1000-post-per-subreddit cap that live Reddit imposes. It is MCP-callable for AI agents, and you pay only per result scraped.

Why This Reddit Scraper Beats the Alternatives

Feature | This scraper | trudax/reddit-scraper-lite | practicaltools/apify-reddit-api
Price | $1.50/1000 | $3.40/1000 | $2.00/1000
No proxy cost to buyer
Historical data (no 1000-post cap)
No OAuth API dependency
parse_confidence in every record
25+ fields (partial in one alternative)
Comments included (partial in one alternative)

Key advantage: Competitors that hit live Reddit directly need residential proxies to avoid 403 responses, and that cost passes to you. This actor uses Arctic Shift (a free Reddit archive API) as its backend, so you pay only for results, not proxy overhead.

Reddit Data Fields

Field | Posts | Comments
id
type (post/comment)
subreddit
title
body
author
score
upvote_ratio
num_comments
created_utc (ISO-8601)
permalink
url
is_self
over_18 (NSFW)
flair_text
domain
subreddit_subscribers
parent_id
depth
is_submitter (OP?)
parse_confidence
warnings
scraped_at

What parse_confidence Means

Every Reddit record includes a score from 0.0 to 1.0:

  • 1.0 — all fields parsed cleanly
  • 0.9–0.95 — minor field missing (e.g. deleted author)
  • < 0.5 — critical issue (missing ID, no data returned)

warnings lists machine-readable codes explaining any deductions — broken scrapes are visible in your dataset, not silently hidden.
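As a rough sketch of how the score might be used downstream, the snippet below splits a dataset into trusted and quarantined records. The 0.9 threshold, the function name, and the warning codes in the sample data are illustrative assumptions, not values defined by the actor.

```python
from typing import Iterable, List, Tuple


def split_by_confidence(
    records: Iterable[dict], threshold: float = 0.9
) -> Tuple[List[dict], List[dict]]:
    """Split records into (trusted, quarantined) by parse_confidence.

    The 0.9 default is an assumption; tune it for your own pipeline.
    """
    trusted, quarantined = [], []
    for record in records:
        # A missing or null score counts as 0.0, so such records are never trusted.
        score = float(record.get("parse_confidence") or 0.0)
        (trusted if score >= threshold else quarantined).append(record)
    return trusted, quarantined


# Sample data: the warning codes here are made-up placeholders.
sample = [
    {"id": "a1", "parse_confidence": 1.0, "warnings": []},
    {"id": "b2", "parse_confidence": 0.95, "warnings": ["author_deleted"]},
    {"id": "c3", "parse_confidence": 0.4, "warnings": ["missing_id"]},
]
trusted, quarantined = split_by_confidence(sample)
print([r["id"] for r in trusted])      # ['a1', 'b2']
print([r["id"] for r in quarantined])  # ['c3']
```

Keeping the quarantined records, rather than dropping them, leaves the warnings available for later inspection.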

Reddit Scraper Use Cases

  • Brand monitoring — track keyword mentions across niche subreddits
  • Lead generation — find users asking questions your product solves
  • Sentiment analysis — bulk-export posts and comments for NLP pipelines
  • Competitor research — monitor product-related subreddits
  • Content strategy — find top-performing posts by score or comment count
  • AI agent memory — feed recent subreddit discussion into agent context

How to Use Reddit Scraper

Scrape Reddit Subreddit Posts

{
  "subreddits": ["python", "learnpython"],
  "sort": "new",
  "maxItems": 200,
  "includeComments": false
}

Scrape Reddit Posts + Comments Together

{
  "subreddits": ["entrepreneur"],
  "sort": "new",
  "maxItems": 100,
  "includeComments": true,
  "maxCommentsPerPost": 25
}

Scrape Reddit User Activity

{
  "users": ["spez", "some_username"],
  "maxItems": 50
}

Scrape via Reddit URL

{
  "urls": ["https://www.reddit.com/r/investing/"],
  "maxItems": 200
}

Input Parameters

Parameter | Type | Default | Description
subreddits | string[] | | Subreddit names (e.g. python, r/flask)
urls | string[] | | Reddit subreddit or profile URLs
users | string[] | | Usernames to scrape (e.g. spez)
sort | new/old | new | Sort order
maxItems | integer | 100 | Max posts per subreddit or user
includeComments | boolean | false | Also scrape comments
maxCommentsPerPost | integer | 50 | Max comments per post
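The parameters above can be checked locally before a run is started. The following is a minimal sketch, assuming the apify-client Python package and the actor ID bovi/reddit-scraper (taken from the MCP URL on this page). The helper names, and the rule that at least one of subreddits, urls, or users must be set, are assumptions for illustration.

```python
# Parameter names and types copied from the Input Parameters table.
PARAM_TYPES = {
    "subreddits": list,
    "urls": list,
    "users": list,
    "sort": str,
    "maxItems": int,
    "includeComments": bool,
    "maxCommentsPerPost": int,
}


def build_run_input(**params) -> dict:
    """Validate parameters against the table and return a run-input dict."""
    for name, value in params.items():
        expected = PARAM_TYPES.get(name)
        if expected is None:
            raise ValueError(f"unknown parameter: {name}")
        # bool is a subclass of int, so reject True/False for integer fields.
        wrong_bool = expected is int and isinstance(value, bool)
        if wrong_bool or not isinstance(value, expected):
            raise TypeError(f"{name} must be {expected.__name__}")
    if params.get("sort", "new") not in ("new", "old"):
        raise ValueError("sort must be 'new' or 'old'")
    # Assumption: a run needs at least one source to scrape.
    if not any(params.get(key) for key in ("subreddits", "urls", "users")):
        raise ValueError("provide at least one of subreddits, urls, users")
    return dict(params)


def run_scraper(token: str, run_input: dict, actor_id: str = "bovi/reddit-scraper") -> list:
    """Start the actor, wait for it to finish, and return the dataset items."""
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


run_input = build_run_input(subreddits=["python", "learnpython"], maxItems=25)
print(run_input)
# items = run_scraper("<YOUR_APIFY_TOKEN>", run_input)  # needs a real token
```

Only the parameters you pass are sent, so the actor's own defaults apply to everything else.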

Sample Output

{
  "type": "post",
  "id": "1d2e3f4",
  "subreddit": "python",
  "title": "What's the best async HTTP library in 2026?",
  "body": "Looking for recommendations for an async HTTP client...",
  "author": "user123",
  "score": 847,
  "upvote_ratio": 0.97,
  "num_comments": 62,
  "created_utc": "2026-05-20T14:32:11+00:00",
  "permalink": "/r/python/comments/1d2e3f4/whats_the_best_async_http_library_in_2026/",
  "url": "https://www.reddit.com/r/python/comments/1d2e3f4/",
  "flair_text": "Discussion",
  "subreddit_subscribers": 1200000,
  "parse_confidence": 1.0,
  "warnings": [],
  "scraped_at": "2026-06-05T09:00:00+00:00"
}

Pricing — Pay Per Reddit Post or Comment

$1.50 per 1,000 posts · $0.50 per 1,000 comments (when includeComments=true), billed pay-per-event (PPE) with no per-run fee. No proxy cost: Reddit data is fetched via Arctic Shift at no additional infrastructure charge. The first $5 of Apify credit covers about 3,300 post records.
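To budget a run before starting it, the rates above can be turned into a small estimator. This is a sketch using only the two listed rates; the function names are illustrative.

```python
from decimal import Decimal

POST_RATE = Decimal("1.50") / 1000     # USD per post
COMMENT_RATE = Decimal("0.50") / 1000  # USD per comment


def estimate_cost(posts: int, comments: int = 0) -> Decimal:
    """Estimated charge in USD, rounded to cents."""
    total = posts * POST_RATE + comments * COMMENT_RATE
    return total.quantize(Decimal("0.01"))


def posts_for_budget(budget_usd: str) -> int:
    """How many post records a budget covers when no comments are scraped."""
    return int(Decimal(budget_usd) / POST_RATE)


# 100 posts with up to 25 comments each, as in the posts + comments example.
print(estimate_cost(100, 100 * 25))  # 1.40
print(posts_for_budget("5"))         # 3333
```

The comment count is an upper bound, since posts with fewer than maxCommentsPerPost comments are billed for fewer.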

Data Source & Freshness

This actor fetches from Arctic Shift (arctic-shift.photon-reddit.com), a community-maintained Reddit archive based on historical data dumps. Data is updated continuously with an approximate 36-hour lag on engagement metrics (score, num_comments) for very recent posts. Historical data goes back years with no per-subreddit post cap.

Arctic Shift is a free service with no uptime SLA. The parse_confidence and warnings fields in every record surface any API anomalies so you can filter them downstream.
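One way to account for the lag downstream is to flag records that were scraped less than 36 hours after they were created, since their score and num_comments may still change. The sketch below relies only on the created_utc and scraped_at fields shown in the sample output; the function name is illustrative.

```python
from datetime import datetime, timedelta

ENGAGEMENT_LAG = timedelta(hours=36)  # approximate lag stated above


def engagement_may_be_stale(record: dict, lag: timedelta = ENGAGEMENT_LAG) -> bool:
    """True if the record was scraped within `lag` of its creation time."""
    created = datetime.fromisoformat(record["created_utc"])
    scraped = datetime.fromisoformat(record["scraped_at"])
    return scraped - created < lag


older = {"created_utc": "2026-05-20T14:32:11+00:00", "scraped_at": "2026-06-05T09:00:00+00:00"}
recent = {"created_utc": "2026-06-04T12:00:00+00:00", "scraped_at": "2026-06-05T09:00:00+00:00"}
print(engagement_may_be_stale(older))   # False
print(engagement_may_be_stale(recent))  # True
```

Flagged records could be re-scraped in a later run once their engagement metrics have settled.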

Use with AI Agents (MCP)

This Reddit scraper is callable as a tool by AI agents (Claude Desktop, Cursor, VS Code, n8n, LangGraph, CrewAI, or any MCP-compatible client) via Apify's hosted Model Context Protocol server.

{
"mcpServers": {
"apify": {
"command": "npx",
"args": [
"mcp-remote",
"https://mcp.apify.com/?tools=bovi/reddit-scraper",
"--header",
"Authorization: Bearer <YOUR_APIFY_TOKEN>"
]
}
}
}

Keep maxItems low (e.g. 25) when calling from agents to limit token volume.

Frequently Asked Questions

Does this Reddit scraper need an API key? No. It uses Arctic Shift (a community Reddit archive), not the official Reddit API. No OAuth, no app registration.

Why is there a 36-hour lag? Arctic Shift syncs from Reddit data dumps continuously. Very recent posts (< 36h) may have slightly outdated score and num_comments — all other fields are accurate.

Can I get more than 1000 posts from a subreddit? Yes. Unlike live Reddit, Arctic Shift has no 1000-post cap. Use maxItems to control volume; the actor paginates via timestamps.

Is a residential proxy needed? No. This actor does not hit live Reddit endpoints, so there is no proxy cost to you.
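The FAQ above mentions pagination via timestamps. The actor's internals are not documented here, so the following is only a generic illustration of timestamp-cursor pagination. fetch_page is a hypothetical in-memory stand-in for an archive query, not the actor's or Arctic Shift's real API.

```python
def fetch_page(archive, before, limit):
    """Hypothetical archive query: newest-first posts older than `before`."""
    older = [p for p in archive if before is None or p["created_utc"] < before]
    return sorted(older, key=lambda p: p["created_utc"], reverse=True)[:limit]


def paginate(archive, max_items, page_size=100):
    """Collect up to max_items posts by walking backwards in time."""
    items, before = [], None
    while len(items) < max_items:
        page = fetch_page(archive, before, min(page_size, max_items - len(items)))
        if not page:
            break  # archive exhausted
        items.extend(page)
        # Cursor: the oldest timestamp seen so far. With a strict "<" filter,
        # posts sharing that exact timestamp would be skipped; a real client
        # would need a tie-breaker such as the post id.
        before = page[-1]["created_utc"]
    return items


# Toy archive with one post per integer timestamp.
archive = [{"id": str(i), "created_utc": i} for i in range(250)]
print(len(paginate(archive, 120, page_size=50)))  # 120
print(len(paginate(archive, 1000)))               # 250
```

Because the cursor is a timestamp rather than a page number, the total number of retrievable posts is limited only by what the archive holds.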


Brand Monitoring & Incremental Scraping

Use sinceDate and Apify schedules to run this actor daily and get only new posts for ongoing brand-monitoring workflows. Set includeComments=true and a low maxCommentsPerPost for lightweight recurring runs that track sentiment changes over time.
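A daily monitoring run along these lines might be prepared as follows. This is a sketch under stated assumptions: sinceDate is assumed to accept an ISO-8601 date (YYYY-MM-DD), which should be checked against the actor's input schema, and the helper names and the value of 5 for maxCommentsPerPost are illustrative.

```python
from datetime import datetime, timedelta, timezone


def daily_run_input(subreddits, now=None, lookback=timedelta(days=1)) -> dict:
    """Build the input for a recurring run covering the last `lookback` period."""
    now = now or datetime.now(timezone.utc)
    return {
        "subreddits": list(subreddits),
        "sort": "new",
        # Assumption: sinceDate takes an ISO-8601 date string.
        "sinceDate": (now - lookback).date().isoformat(),
        "includeComments": True,
        "maxCommentsPerPost": 5,
    }


def only_new(records, seen_keys: set) -> list:
    """Drop records seen in earlier runs and remember the new ones."""
    # Key on (type, id) in case post and comment ids overlap.
    fresh = [r for r in records if (r["type"], r["id"]) not in seen_keys]
    seen_keys.update((r["type"], r["id"]) for r in fresh)
    return fresh


seen = set()
first = only_new([{"type": "post", "id": "1d2e3f4"}], seen)
second = only_new([{"type": "post", "id": "1d2e3f4"}, {"type": "comment", "id": "1d2e3f4"}], seen)
print(len(first), len(second))  # 1 1
```

Filtering against previously seen keys protects against overlap when consecutive runs cover the same day.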


Not affiliated with Reddit. Data retrieved from Arctic Shift, a community-maintained public Reddit archive.

Integrations

Built for social-listening and research teams tracking communities, trends, and sentiment at scale — the JSON/dataset output drops into the tools you already run, no glue code:

  • n8n / Make / Zapier — trigger a run or pipe every new dataset item into 500+ apps (Google Sheets, Airtable, Slack, HubSpot, your database) with no code.
  • Webhooks — fire your own endpoint the moment a run finishes, to push results straight into your pipeline.
  • MCP server — expose this actor as a tool to Claude, Cursor, or any MCP client so an AI agent can pull this data mid-conversation.
  • API & SDKs — fetch the dataset as JSON, CSV, or Excel through the Apify REST API or the Python / JS SDKs.
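For the REST route in the last bullet, a run's dataset can be read from the Apify API's dataset items endpoint. The sketch below builds the URL and shows one way to fetch JSON with the standard library. The helper names are illustrative, and the dataset ID comes from a finished run.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

FORMATS = {"json", "csv", "xlsx"}  # xlsx is the Excel export


def dataset_items_url(dataset_id: str, fmt: str = "json", limit=None) -> str:
    """Build the dataset items URL for the given export format."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    query = {"format": fmt}
    if limit is not None:
        query["limit"] = limit
    return (
        "https://api.apify.com/v2/datasets/"
        f"{quote(dataset_id, safe='')}/items?{urlencode(query)}"
    )


def fetch_items(dataset_id: str, token: str, limit=None) -> list:
    """Download dataset items as parsed JSON (performs a network call)."""
    request = Request(
        dataset_items_url(dataset_id, "json", limit),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urlopen(request) as response:
        return json.load(response)


print(dataset_items_url("abc123", "csv", limit=10))
# https://api.apify.com/v2/datasets/abc123/items?format=csv&limit=10
```

Sending the token in the Authorization header keeps it out of URLs that may end up in logs.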

See all Apify integrations.