Reddit Scraper - $0.75/1k
Pricing
from $0.75 / 1,000 items
Scrape Reddit posts and comments from any subreddit, search query, or user profile. Returns title, author, full post text, external link URL, comment bodies, subreddit, and timestamps. No login required. Pay-per-result: only $0.75 per 1,000 items.
Rating: 5.0 (1)
Developer: Ale
Maintained by Community
Actor stats
Bookmarked: 0
Total users: 14
Monthly active users: 9
Last modified: 4 days ago
Reddit Scraper
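Scrape Reddit posts and comments from any subreddit, search query, or user profile. Extract titles, post text, comment text, authors, and timestamps at scale. Vote scores and reply nesting are not exposed by the public feeds (see the notes under Data Extracted). No API key or login required.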
What It Does
Fetches posts from one or more subreddits using Reddit's public Atom feeds (/.rss). Optionally fetches comments for each post. Posts and comments are returned as separate items in the dataset.
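As a rough illustration of what reading an Atom feed involves, the sketch below parses a small inline Atom document with Python's standard library. The sample XML, the `parse_atom_entries` helper, and the output keys are assumptions made for illustration; this is not the actor's own code, and a real Reddit feed may carry different or additional elements.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace (RFC 4287).
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample feed; the real feed layout is not documented here.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>programming</title>
  <entry>
    <id>t3_abc123</id>
    <title>Example post</title>
    <author><name>/u/john_doe</name></author>
    <link href="https://www.reddit.com/r/programming/comments/abc123/example_post/"/>
    <updated>2026-04-25T10:00:00+00:00</updated>
  </entry>
</feed>"""


def parse_atom_entries(xml_text):
    """Return one dict per <entry> element in an Atom document."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        link = entry.find("atom:link", ATOM_NS)
        entries.append({
            "id": entry.findtext("atom:id", default="", namespaces=ATOM_NS),
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "author": entry.findtext("atom:author/atom:name", default="", namespaces=ATOM_NS),
            "link": link.get("href") if link is not None else None,
            "updated": entry.findtext("atom:updated", default="", namespaces=ATOM_NS),
        })
    return entries


entries = parse_atom_entries(SAMPLE_FEED)
for item in entries:
    print(item["title"], item["author"], item["link"])
```

Note that nothing in this sample carries a vote total or comment count, which is consistent with those fields coming back as 0 / null in the output.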
Use with AI Agents (MCP)
Connect this actor to any MCP-compatible AI client — Claude Desktop, Claude.ai, Cursor, VS Code, LangChain, LlamaIndex, or custom agents.
Apify MCP server URL:
https://mcp.apify.com?tools=santamaria-automations/reddit-scraper
Example prompt once connected:
"Use reddit-scraper to get the top 50 posts from r/MachineLearning this week with comments. Return results as a table showing title, author, and link."
Features
- Multi-subreddit — scrape multiple subreddits in a single run
- Comments included — fetch comments for each post (flat list, see notes below)
- Search — query Reddit-wide, sort by relevance / top / new / hot
- User profiles — pull a user's posts and comments
- Sorting options — hot, new, top, rising
- Deduplication — the same post is never returned twice
- Pagination — uses Reddit's after cursor to collect more than 100 posts
- Anti-bot resilient — TLS fingerprinted sessions, rotating proxy IPs
- Rate-limit aware — paces requests well under Reddit's public limits
- No credentials needed — uses Reddit's public Atom feeds
- Pay-per-result — only pay for items you receive
Data Extracted
Posts (type = "post")
| Field | Example |
|---|---|
| id | "abc123" |
| type | "post" |
| subreddit | "programming" |
| title | "Show HN: I built a Go-based Reddit scraper" |
| author | "john_doe" |
| text | "Full text of a self post..." |
| url | "https://github.com/example/repo" |
| score | 0 (see notes) |
| num_comments | null (see notes) |
| is_stickied | false |
| created_utc | "2026-04-25T10:00:00Z" |
| reddit_url | "https://www.reddit.com/r/programming/comments/..." |
| scraped_at | "2026-04-25T10:30:00Z" |
Comments (type = "comment")
| Field | Example |
|---|---|
| id | "xyz789" |
| type | "comment" |
| subreddit | "programming" |
| author | "helpful_user" |
| text | "Great project! Have you considered..." |
| score | 0 (see notes) |
| parent_id | null (see notes) |
| post_id | "abc123" |
| post_title | "Show HN: I built a Go-based Reddit scraper" |
| is_stickied | false |
| created_utc | "2026-04-25T11:15:00Z" |
| reddit_url | "https://www.reddit.com/r/programming/comments/.../xyz789/" |
| scraped_at | "2026-04-25T11:30:00Z" |
Notes on score, num_comments, and parent_id: Reddit's public Atom feeds do not expose vote totals, comment counts, or comment parent-IDs. These fields are returned as 0 / null. For everything else (title, author, body text, URLs, timestamps), the data is identical to what you'd see on the site.
Pricing
Pay-per-result pricing. You only pay for items you receive.
| Event | Price | Description |
|---|---|---|
| Actor start | $0.005 | One-time container startup fee |
| Item scraped | $0.75 / 1,000 | Each post or comment returned |
Examples:
- 100 posts (no comments) = $0.08 total ($0.005 + $0.075)
- 100 posts + 500 comments = $0.455 total ($0.005 + $0.45)
- 1,000 posts + 5,000 comments = $4.505 total ($0.005 + $4.50)
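The arithmetic behind these examples can be reproduced with a short sketch. The `estimate_cost` helper is a hypothetical name, and it assumes a single run with one start fee and every post or comment billed at the same per-item rate, as the pricing table states.

```python
from decimal import Decimal

# Prices taken from the pricing table above.
ACTOR_START_FEE = Decimal("0.005")               # charged once per run
PRICE_PER_ITEM = Decimal("0.75") / Decimal("1000")  # each post or comment


def estimate_cost(posts, comments=0):
    """Estimate the cost in USD of one run returning the given item counts."""
    items = posts + comments
    return ACTOR_START_FEE + PRICE_PER_ITEM * items


for posts, comments in [(100, 0), (100, 500), (1000, 5000)]:
    print(f"{posts} posts + {comments} comments: ${estimate_cost(posts, comments)}")
```

`Decimal` is used instead of floats so that fractions of a cent add up exactly.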
6x cheaper than competing Reddit scrapers ($5/1k+). No monthly fees. No minimum spend.
Input
| Field | Type | Description | Default |
|---|---|---|---|
| subreddits | string[] | Subreddit names to scrape (no r/ prefix) | ["programming"] |
| searchQuery | string | Search Reddit for matching posts (overrides subreddits) | — |
| usernames | string[] | Scrape all posts/comments from these users (no u/ prefix) | — |
| sort | string | hot, new, top, rising (or relevance for search) | hot |
| includeComments | boolean | Fetch comments for each post | false |
| commentDepth | integer | Nesting depth: 1=top-level, 2=+replies, up to 5 | 3 |
| maxCommentsPerPost | integer | Max comments per post. 0 = unlimited. | 100 |
| maxResults | integer | Max posts to return (across all subreddits). 0 = unlimited. | 100 |
| proxyConfiguration | object | Apify proxy settings | Auto |
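One way to assemble an input object from this table is sketched below. The `build_input` helper and its validation rules are assumptions derived from the table's descriptions and defaults; the actor itself may validate its input differently.

```python
import json

# Defaults copied from the Input table; fields without a default are optional.
DEFAULT_INPUT = {
    "subreddits": ["programming"],
    "sort": "hot",
    "includeComments": False,
    "commentDepth": 3,
    "maxCommentsPerPost": 100,
    "maxResults": 100,
}
OPTIONAL_FIELDS = {"searchQuery", "usernames", "proxyConfiguration"}
ALLOWED_SORTS = {"hot", "new", "top", "rising", "relevance"}


def build_input(**overrides):
    """Merge overrides into the documented defaults and sanity-check them."""
    unknown = set(overrides) - set(DEFAULT_INPUT) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    run_input = {
        **DEFAULT_INPUT,
        "subreddits": list(DEFAULT_INPUT["subreddits"]),  # avoid sharing the default list
        **overrides,
    }
    if run_input["sort"] not in ALLOWED_SORTS:
        raise ValueError(f"unsupported sort: {run_input['sort']!r}")
    if not 1 <= run_input["commentDepth"] <= 5:
        raise ValueError("commentDepth must be between 1 and 5")
    for key in ("maxCommentsPerPost", "maxResults"):
        if run_input[key] < 0:
            raise ValueError(f"{key} must be 0 (unlimited) or a positive integer")
    return run_input


print(json.dumps(build_input(subreddits=["python", "golang"], maxResults=200), indent=2))
```

Checking the input locally catches typos in field names before a run is started and the start fee is charged.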
Usage Examples
Scrape hot posts from multiple subreddits
{"subreddits": ["programming", "python", "golang"], "sort": "hot", "maxResults": 200}
Get top posts with full comment threads
{"subreddits": ["MachineLearning"], "sort": "top", "includeComments": true, "commentDepth": 3, "maxCommentsPerPost": 50, "maxResults": 100}
Scrape new posts without comments
{"subreddits": ["startups", "SaaS"], "sort": "new", "maxResults": 500}
Search Reddit
{"searchQuery": "artificial intelligence startup", "sort": "top", "maxResults": 50}
Scrape a user's activity (experimental)
{"usernames": ["AutoModerator"], "maxResults": 50}
Note: Reddit applies stricter rate limiting on user profile pages. Some users may return fewer results.
Deep comment mining from a single subreddit
{"subreddits": ["AskReddit"], "sort": "hot", "includeComments": true, "commentDepth": 5, "maxCommentsPerPost": 200, "maxResults": 20}
Output
Results are exported to the default dataset. Posts and comments are interleaved — each post is followed by its comments (if includeComments is enabled). Use the type field to filter posts vs comments.
Export to JSON, CSV, Excel, or connect via the Apify API.
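Because posts and comments share one dataset, a common first step after export is to split them by type and attach each comment to its post through post_id. The sketch below assumes the items have already been loaded as a list of dicts (for example from a JSON export); `group_comments_by_post` and the sample items are hypothetical.

```python
from collections import defaultdict


def group_comments_by_post(items):
    """Pair each post with its comments, keeping the original post order."""
    posts = {}
    comments = defaultdict(list)
    for item in items:
        if item.get("type") == "post":
            posts[item["id"]] = item
        elif item.get("type") == "comment":
            comments[item["post_id"]].append(item)
    # Comments whose post is not in the dataset are dropped here.
    return [
        {"post": post, "comments": comments.get(post_id, [])}
        for post_id, post in posts.items()
    ]


# Hypothetical sample shaped like the documented output fields.
items = [
    {"type": "post", "id": "abc123", "title": "Example post"},
    {"type": "comment", "id": "xyz789", "post_id": "abc123", "text": "First"},
    {"type": "comment", "id": "xyz790", "post_id": "abc123", "text": "Second"},
    {"type": "post", "id": "def456", "title": "Another post"},
]

threads = group_comments_by_post(items)
for thread in threads:
    print(thread["post"]["title"], len(thread["comments"]))
```

Since parent_id is null, this grouping yields a flat comment list per post rather than a reply tree.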
FAQ
Do I need a Reddit account or API key?
No. This scraper uses Reddit's public Atom feeds, which are accessible without authentication.
What is the rate limit?
The scraper paces requests at ~1 every 1.5 seconds. With proxy rotation enabled (default), you can run multiple actors in parallel without hitting limits.
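For readers building their own client, fixed-interval pacing of this kind can be sketched as follows. The `Pacer` class is a hypothetical illustration of the general technique and not the actor's implementation; the 1.5 second default mirrors the figure quoted above.

```python
import time


class Pacer:
    """Enforce a minimum gap between consecutive calls to wait()."""

    def __init__(self, min_interval=1.5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested without real delays
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first one

    def wait(self):
        """Block until at least min_interval has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `wait()` before each request keeps a single worker at or below one request per interval; the first call returns immediately.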
Can I scrape private subreddits?
No. Only public subreddits and posts visible without logging in are accessible.
How are comments structured?
Each comment is a separate output item with type: "comment". The post_id and post_title fields reference the original post. Comments come back as a flat list — the parent-comment relationship is not exposed by the Atom feed.
Why no score or num_comments?
Reddit removed access to its /.json endpoints in June 2026. The Atom (/.rss) feed is the only public surface that still works without a registered API app, and it doesn't expose vote counts. If you need scores, use Reddit's official OAuth API.
Why do I need a proxy?
A proxy isn't required, but Reddit will rate-limit a single IP after a few quick requests. Apify's auto/datacenter proxy is sufficient; no residential proxy is needed.