Reddit Scraper — Posts & Comments | from $1.50/1K
Pricing
Pay per usage
Scrape Reddit posts, comments, and user activity from any public subreddit. Returns 25+ fields: score, upvote ratio, flair, author, timestamps, parse_confidence. No API key needed — backed by Arctic Shift archive with unlimited historical depth. MCP-callable.
Developer
Vitalii Bondarev
Maintained by Community
Built for brand monitors, growth researchers, and AI agent pipelines that need Reddit data at scale without OAuth limits.
Pricing: $1.50 per 1,000 posts · $0.50 per 1,000 comments (when includeComments=true)
Reddit Scraper lets you scrape Reddit posts, comments, and user activity from any public subreddit with no API key, no OAuth, and no proxy. Each record returns 25+ fields, including score, upvote ratio, flair, author, and timestamps. The actor is backed by the Arctic Shift Reddit archive, so historical depth is not limited by the 1,000-post cap that live Reddit listings impose. It is MCP-callable for AI agents, and you pay only per result scraped.
Why This Reddit Scraper Beats the Alternatives
| Feature | This scraper | trudax/reddit-scraper-lite | practicaltools/apify-reddit-api |
|---|---|---|---|
| Price | $1.50/1000 | $3.40/1000 | $2.00/1000 |
| No proxy cost to buyer | ✓ | ✗ | ✗ |
| Historical data (no 1000-post cap) | ✓ | ✗ | ✗ |
| No OAuth API dependency | ✓ | ✓ | ✗ |
| parse_confidence in every record | ✓ | ✗ | ✗ |
| 25+ fields | ✓ | ✓ | partial |
| Comments included | ✓ | partial | ✗ |
Key advantage: Competitors that hit live Reddit directly need a residential proxy to avoid 403 responses, and that cost passes to you. This actor uses Arctic Shift (a free Reddit archive API) as its backend, so you pay only for results, not proxy overhead.
Reddit Data Fields
| Field | Posts | Comments |
|---|---|---|
| id | ✓ | ✓ |
| type (post/comment) | ✓ | ✓ |
| subreddit | ✓ | ✓ |
| title | ✓ | — |
| body | ✓ | ✓ |
| author | ✓ | ✓ |
| score | ✓ | ✓ |
| upvote_ratio | ✓ | — |
| num_comments | ✓ | — |
| created_utc (ISO-8601) | ✓ | ✓ |
| permalink | ✓ | ✓ |
| url | ✓ | ✓ |
| is_self | ✓ | — |
| over_18 (NSFW) | ✓ | — |
| flair_text | ✓ | ✓ |
| domain | ✓ | — |
| subreddit_subscribers | ✓ | ✓ |
| parent_id | — | ✓ |
| depth | — | ✓ |
| is_submitter (OP?) | — | ✓ |
| parse_confidence | ✓ | ✓ |
| warnings | ✓ | ✓ |
| scraped_at | ✓ | ✓ |
What parse_confidence Means
Every Reddit record includes a score from 0.0 to 1.0:
- 1.0 — all fields parsed cleanly
- 0.9–0.95 — minor field missing (e.g. deleted author)
- < 0.5 — critical issue (missing ID, no data returned)
warnings lists machine-readable codes explaining any deductions — broken scrapes are visible in your dataset, not silently hidden.
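One way to act on these scores downstream is to split a dataset into accepted and quarantined records. The sketch below is a minimal example, assuming records are plain dictionaries shaped like the fields table above. The 0.9 threshold, the function name `split_by_confidence`, and the warning codes in the sample data are illustrative assumptions, not part of the actor.

```python
from typing import Iterable, List, Tuple


def split_by_confidence(
    records: Iterable[dict], threshold: float = 0.9
) -> Tuple[List[dict], List[dict]]:
    """Separate records into (accepted, quarantined) by parse_confidence."""
    accepted, quarantined = [], []
    for record in records:
        # A missing score counts as 0.0 so incomplete records are never accepted silently.
        confidence = float(record.get("parse_confidence", 0.0))
        (accepted if confidence >= threshold else quarantined).append(record)
    return accepted, quarantined


# Hypothetical sample data; the warning codes are made up for illustration.
sample = [
    {"id": "a1", "parse_confidence": 1.0, "warnings": []},
    {"id": "b2", "parse_confidence": 0.95, "warnings": ["author_deleted"]},
    {"id": "c3", "parse_confidence": 0.4, "warnings": ["missing_id"]},
]
good, bad = split_by_confidence(sample)
print([r["id"] for r in good], [r["id"] for r in bad])
```

Quarantined records can then be inspected through their `warnings` list instead of being dropped.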
Reddit Scraper Use Cases
- Brand monitoring — track keyword mentions across niche subreddits
- Lead generation — find users asking questions your product solves
- Sentiment analysis — bulk-export posts and comments for NLP pipelines
- Competitor research — monitor product-related subreddits
- Content strategy — find top-performing posts by score or comment count
- AI agent memory — feed recent subreddit discussion into agent context
How to Use Reddit Scraper
Scrape Reddit Subreddit Posts
{"subreddits": ["python", "learnpython"],"sort": "new","maxItems": 200,"includeComments": false}
Scrape Reddit Posts + Comments Together
{"subreddits": ["entrepreneur"],"sort": "new","maxItems": 100,"includeComments": true,"maxCommentsPerPost": 25}
Scrape Reddit User Activity
{"users": ["spez", "some_username"],"maxItems": 50}
Scrape via Reddit URL
{"urls": ["https://www.reddit.com/r/investing/"],"maxItems": 200}
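The inputs above can also be submitted programmatically. The following is a sketch using the `apify-client` Python package, assuming the actor ID is `bovi/reddit-scraper` (inferred from the MCP URL in this README, not confirmed elsewhere) and that an API token is available in an `APIFY_TOKEN` environment variable. The helper name `run_reddit_scraper` is hypothetical.

```python
import os
from typing import List

ACTOR_ID = "bovi/reddit-scraper"  # assumption: inferred from the MCP URL in this README


def run_reddit_scraper(run_input: dict, token: str) -> List[dict]:
    """Start the actor, wait for it to finish, and return its dataset items."""
    # Imported lazily so this module still loads where apify-client is not installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


subreddit_input = {
    "subreddits": ["python", "learnpython"],
    "sort": "new",
    "maxItems": 200,
    "includeComments": False,
}

if __name__ == "__main__" and os.environ.get("APIFY_TOKEN"):
    items = run_reddit_scraper(subreddit_input, os.environ["APIFY_TOKEN"])
    print(f"fetched {len(items)} records")
```

The run only starts when a token is present, so the snippet is safe to import in environments without credentials.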
Input Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| subreddits | string[] | — | Subreddit names (e.g. python, r/flask) |
| urls | string[] | — | Reddit subreddit or profile URLs |
| users | string[] | — | Usernames to scrape (e.g. spez) |
| sort | new/old | new | Sort order |
| maxItems | integer | 100 | Max posts per subreddit or user |
| includeComments | boolean | false | Also scrape comments |
| maxCommentsPerPost | integer | 50 | Max comments per post |
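To catch typos before a run is billed, the table can be mirrored in a small typed builder. This is a sketch only: the class name `ScraperInput` is hypothetical, the defaults are copied from the table above, and the rule that at least one of `subreddits`, `urls`, or `users` must be set is an assumption rather than documented actor behavior.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScraperInput:
    subreddits: List[str] = field(default_factory=list)
    urls: List[str] = field(default_factory=list)
    users: List[str] = field(default_factory=list)
    sort: str = "new"
    max_items: int = 100
    include_comments: bool = False
    max_comments_per_post: int = 50

    def to_run_input(self) -> dict:
        """Validate the values and return the JSON-style input dictionary."""
        if not (self.subreddits or self.urls or self.users):
            # Assumption: the actor needs at least one source to scrape.
            raise ValueError("provide at least one of subreddits, urls, or users")
        if self.sort not in ("new", "old"):
            raise ValueError("sort must be 'new' or 'old'")
        if self.max_items < 1 or self.max_comments_per_post < 1:
            raise ValueError("maxItems and maxCommentsPerPost must be positive")
        payload = {
            "sort": self.sort,
            "maxItems": self.max_items,
            "includeComments": self.include_comments,
        }
        sources = (("subreddits", self.subreddits), ("urls", self.urls), ("users", self.users))
        for key, values in sources:
            if values:
                payload[key] = list(values)
        if self.include_comments:
            # Only meaningful when comments are requested.
            payload["maxCommentsPerPost"] = self.max_comments_per_post
        return payload


run_input = ScraperInput(
    subreddits=["entrepreneur"], include_comments=True, max_comments_per_post=25
).to_run_input()
print(run_input)
```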
Sample Output
{"type": "post","id": "1d2e3f4","subreddit": "python","title": "What's the best async HTTP library in 2026?","body": "Looking for recommendations for an async HTTP client...","author": "user123","score": 847,"upvote_ratio": 0.97,"num_comments": 62,"created_utc": "2026-05-20T14:32:11+00:00","permalink": "/r/python/comments/1d2e3f4/whats_the_best_async_http_library_in_2026/","url": "https://www.reddit.com/r/python/comments/1d2e3f4/","flair_text": "Discussion","subreddit_subscribers": 1200000,"parse_confidence": 1.0,"warnings": [],"scraped_at": "2026-06-05T09:00:00+00:00"}
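A record like the one above can be loaded into a typed structure for downstream code. The sketch below assumes only the field names shown in the sample. The `PostSummary` class and `parse_post` helper are hypothetical, and the input string is a trimmed, illustrative version of the sample record.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PostSummary:
    id: str
    subreddit: str
    title: str
    score: int
    created: datetime
    link: str


def parse_post(raw: str) -> PostSummary:
    """Parse one JSON post record into a PostSummary."""
    record = json.loads(raw)
    if record.get("type") != "post":
        raise ValueError("expected a record with type 'post'")
    return PostSummary(
        id=record["id"],
        subreddit=record["subreddit"],
        title=record["title"],
        score=int(record["score"]),
        # created_utc is an ISO-8601 string with an explicit UTC offset.
        created=datetime.fromisoformat(record["created_utc"]),
        # permalink is site-relative, so prefix the Reddit host.
        link="https://www.reddit.com" + record["permalink"],
    )


raw = (
    '{"type": "post", "id": "1d2e3f4", "subreddit": "python", '
    '"title": "Example title", "score": 847, '
    '"created_utc": "2026-05-20T14:32:11+00:00", '
    '"permalink": "/r/python/comments/1d2e3f4/example_title/"}'
)
post = parse_post(raw)
print(post.link, post.created.isoformat())
```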
Pricing — Pay Per Reddit Post or Comment
$1.50 per 1,000 posts · $0.50 per 1,000 comments (when includeComments=true), billed pay-per-event (PPE) with no per-run fee. No proxy cost: Reddit data is fetched via Arctic Shift at no additional infrastructure charge. The first $5 of Apify credit covers about 3,300 post records ($5 ÷ $1.50 per 1,000 ≈ 3,333).
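For budgeting, the two rates can be turned into a quick estimator. This is a minimal sketch using only the prices stated above; the function names are illustrative, and platform-level charges outside these two rates are not modeled.

```python
POST_RATE_PER_1K = 1.50     # USD per 1,000 posts
COMMENT_RATE_PER_1K = 0.50  # USD per 1,000 comments (includeComments=true)


def estimate_cost(posts: int, comments: int = 0) -> float:
    """Estimated charge in USD for a run returning the given record counts."""
    cost = posts / 1000 * POST_RATE_PER_1K + comments / 1000 * COMMENT_RATE_PER_1K
    return round(cost, 4)


def posts_within_budget(budget_usd: float) -> int:
    """Number of post records a budget covers when no comments are scraped."""
    return int(budget_usd / POST_RATE_PER_1K * 1000)


# 100 posts with up to 25 comments each is at most 2,500 comments.
print(estimate_cost(posts=100, comments=100 * 25))
print(posts_within_budget(5.0))
```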
Data Source & Freshness
This actor fetches from Arctic Shift (arctic-shift.photon-reddit.com), a community-maintained Reddit archive based on historical data dumps. Data is updated continuously with an approximate 36-hour lag on engagement metrics (score, num_comments) for very recent posts. Historical data goes back years with no per-subreddit post cap.
Arctic Shift is a free service with no uptime SLA. The parse_confidence and warnings fields in every record surface any API anomalies so you can filter them downstream.
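Because of the lag, it can help to flag records whose engagement metrics may not have settled yet. The sketch below assumes `created_utc` and `scraped_at` are ISO-8601 strings as in the sample output, and treats 36 hours as a fixed cutoff even though the lag is described as approximate. The helper name is hypothetical.

```python
from datetime import datetime, timedelta

ENGAGEMENT_LAG = timedelta(hours=36)  # approximate lag stated above


def metrics_may_be_stale(record: dict, lag: timedelta = ENGAGEMENT_LAG) -> bool:
    """True when the post was younger than the lag window at scrape time."""
    created = datetime.fromisoformat(record["created_utc"])
    scraped = datetime.fromisoformat(record["scraped_at"])
    return scraped - created < lag


recent = {"created_utc": "2026-06-04T09:00:00+00:00", "scraped_at": "2026-06-05T09:00:00+00:00"}
settled = {"created_utc": "2026-05-20T14:32:11+00:00", "scraped_at": "2026-06-05T09:00:00+00:00"}
print(metrics_may_be_stale(recent), metrics_may_be_stale(settled))
```

Flagged records could be re-scraped on a later run before their `score` and `num_comments` are used for ranking.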
Use with AI Agents (MCP)
This Reddit scraper is callable as a tool by AI agents (Claude Desktop, Cursor, VS Code, n8n, LangGraph, CrewAI, or any MCP-compatible client) via Apify's hosted Model Context Protocol server.
{"mcpServers": {"apify": {"command": "npx","args": ["mcp-remote","https://mcp.apify.com/?tools=bovi/reddit-scraper","--header","Authorization: Bearer <YOUR_APIFY_TOKEN>"]}}}
Keep maxItems low (e.g. 25) when calling from agents to limit token volume.
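Beyond a low maxItems, an agent pipeline can trim each record before it enters the model context. This sketch assumes records are dictionaries with the field names from the fields table. The kept fields, the 280-character body limit, and the function name `compact_for_agent` are arbitrary illustrative choices.

```python
from typing import List, Sequence

KEEP_FIELDS = ("type", "id", "subreddit", "title", "author", "score", "permalink")


def compact_for_agent(
    records: Sequence[dict], max_records: int = 25, max_body_chars: int = 280
) -> List[dict]:
    """Keep a few fields per record and truncate long bodies."""
    compacted = []
    for record in records[:max_records]:
        slim = {key: record[key] for key in KEEP_FIELDS if key in record}
        body = record.get("body") or ""
        if len(body) > max_body_chars:
            # Mark the cut so the agent knows the text is incomplete.
            body = body[:max_body_chars].rstrip() + "..."
        slim["body"] = body
        compacted.append(slim)
    return compacted


records = [{"id": str(i), "score": i, "body": "x" * 500, "warnings": []} for i in range(40)]
slimmed = compact_for_agent(records)
print(len(slimmed), len(slimmed[0]["body"]))
```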
Frequently Asked Questions
Does this Reddit scraper need an API key? No. It uses Arctic Shift (a community Reddit archive), not the official Reddit API. No OAuth, no app registration.
Why is there a 36-hour lag? Arctic Shift syncs from Reddit data dumps continuously. Very recent posts (< 36h) may have slightly outdated score and num_comments — all other fields are accurate.
Can I get more than 1,000 posts from a subreddit? Yes. Unlike live Reddit, Arctic Shift has no 1,000-post cap. Use maxItems to control volume; the actor paginates via timestamps.
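Timestamp-based pagination in general can be sketched as follows. This is not the actor's code or Arctic Shift's API: `fetch_page` is a hypothetical callable that returns records older than a given epoch timestamp, newest first, and the archive here is an in-memory list used purely for illustration.

```python
from typing import Callable, Iterator, List, Optional


def paginate_by_timestamp(
    fetch_page: Callable[[Optional[int], int], List[dict]],
    max_items: int,
    page_size: int = 100,
) -> Iterator[dict]:
    """Yield up to max_items records, walking backwards in time page by page."""
    before: Optional[int] = None
    yielded = 0
    while yielded < max_items:
        page = fetch_page(before, min(page_size, max_items - yielded))
        if not page:
            return  # archive exhausted
        for record in page:
            yield record
            yielded += 1
        # Next page: strictly older than the oldest record seen so far.
        # Records sharing that exact timestamp would be skipped in this simple version.
        before = min(record["created"] for record in page)


# In-memory stand-in for an archive of 2,500 posts with unique timestamps.
ARCHIVE = [{"id": i, "created": 1_700_000_000 + i} for i in range(2500)]


def fake_fetch(before: Optional[int], limit: int) -> List[dict]:
    older = [r for r in ARCHIVE if before is None or r["created"] < before]
    return sorted(older, key=lambda r: r["created"], reverse=True)[:limit]


posts = list(paginate_by_timestamp(fake_fetch, max_items=1500))
print(len(posts), posts[0]["id"], posts[-1]["id"])
```

Because the cursor is a timestamp rather than a page offset, there is no fixed ceiling on how far back the walk can go.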
Is a residential proxy needed? No. This actor does not hit live Reddit endpoints, so there is no proxy cost to you.
Brand Monitoring & Incremental Scraping
Use sinceDate and Apify schedules to run this actor daily and get only new posts for ongoing brand-monitoring workflows. Set includeComments=true and a low maxCommentsPerPost for lightweight recurring runs that track sentiment changes over time.
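A daily run could build its input and de-duplicate results along these lines. This is a sketch under stated assumptions: the format of `sinceDate` is not specified in this README, so an ISO-8601 date string (`YYYY-MM-DD`) is assumed here, and the helper names and the choice of 10 comments per post are illustrative.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional, Set


def daily_run_input(
    subreddits: Iterable[str], now: Optional[datetime] = None, lookback_days: int = 1
) -> dict:
    """Input for a scheduled run that looks back a fixed number of days."""
    now = now or datetime.now(timezone.utc)
    since = (now - timedelta(days=lookback_days)).date().isoformat()
    return {
        "subreddits": list(subreddits),
        "sort": "new",
        "sinceDate": since,  # assumption: ISO date string; format not documented here
        "includeComments": True,
        "maxCommentsPerPost": 10,
    }


def only_new(records: Iterable[dict], seen_ids: Set[str]) -> List[dict]:
    """Drop records already seen in earlier runs and remember the new ones."""
    fresh = [r for r in records if r["id"] not in seen_ids]
    seen_ids.update(r["id"] for r in fresh)
    return fresh


run_input = daily_run_input(["python"], now=datetime(2026, 6, 5, 9, 0, tzinfo=timezone.utc))
seen = {"a1"}
fresh = only_new([{"id": "a1"}, {"id": "b2"}], seen)
print(run_input["sinceDate"], [r["id"] for r in fresh])
```

Persisting `seen_ids` between runs (for example in a key-value store) covers overlap when consecutive lookback windows touch the same posts.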
Not affiliated with Reddit. Data retrieved from Arctic Shift, a community-maintained public Reddit archive.
Integrations
Built for social-listening and research teams tracking communities, trends, and sentiment at scale — the JSON/dataset output drops into the tools you already run, no glue code:
- n8n / Make / Zapier — trigger a run or pipe every new dataset item into 500+ apps (Google Sheets, Airtable, Slack, HubSpot, your database) with no code.
- Webhooks — fire your own endpoint the moment a run finishes, to push results straight into your pipeline.
- MCP server — expose this actor as a tool to Claude, Cursor, or any MCP client so an AI agent can pull this data mid-conversation.
- API & SDKs — fetch the dataset as JSON, CSV, or Excel through the Apify REST API or the Python / JS SDKs.
See all Apify integrations.
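Fetching a finished run's dataset over the REST API can be sketched with the standard library alone. The example assumes the Apify API v2 dataset items endpoint and bearer-token authentication. The dataset ID `abc123` is a placeholder and the helper names are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "https://api.apify.com/v2"
SUPPORTED_FORMATS = ("json", "csv", "xlsx")  # the formats mentioned above


def dataset_items_url(dataset_id: str, fmt: str = "json", limit: Optional[int] = None) -> str:
    """Build the download URL for a dataset in the requested format."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"format must be one of {SUPPORTED_FORMATS}")
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


def download_dataset(dataset_id: str, token: str, fmt: str = "json") -> bytes:
    """Download the dataset body; performs a network call when invoked."""
    request = Request(
        dataset_items_url(dataset_id, fmt),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urlopen(request) as response:
        return response.read()


# "abc123" is a placeholder dataset ID.
print(dataset_items_url("abc123", fmt="csv", limit=10))
```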