Pricing

from $2.00 / 1,000 results

Reddit Scraper — Posts, Comments, Search & Users (Reliable)

Scrape Reddit posts, comment trees, search results, and user activity as clean JSON — engineered for reliability: rate-limit-aware pacing, host + proxy rotation, and per-target fault isolation. HTTP-only, no browser, no login.

Developer

William Fordyce

Last modified

2 months ago

Why this scraper succeeds where others fail

Reddit aggressively defends its public endpoints: rate limits (HTTP 429), IP blocks (HTTP 403), and a JavaScript "Please wait for verification" bot wall that silently breaks plain .json scrapers. That wall is exactly why other Reddit actors fail 40%+ of their runs. This actor was engineered around those failure modes from day one:

Native anonymous identity — instead of hammering the fragile public .json pages, the actor performs the exact same anonymous handshake reddit.com runs for every logged-out visitor and reads data through Reddit's own application gateway. No account, no login, no credentials — and no JS-challenge wall.
Rate-limit-aware pacing — the actor reads Reddit's x-ratelimit-remaining / x-ratelimit-reset response headers on every request and proactively slows down before a 429 ever happens, on top of politely jittered 1–2 s request spacing.
Four-host fallback ladder — if Reddit's gateway ever misbehaves, requests automatically fail over across four hosts that serve identical data (oauth.reddit.com → www.reddit.com → old.reddit.com → api.reddit.com).
Proxy session rotation — when running with Apify residential proxies (the default), every blocked request retries from a brand-new IP session with a brand-new identity and a fresh rate-limit budget.
Exponential backoff with jitter — retries start at ~2 s and back off up to 60 s (max 5 attempts per request), so transient hiccups never become failed runs.
Per-target fault isolation — one private, banned, or misspelled subreddit never kills your run. Failures are recorded as dataType: "error" items and every other target still delivers.
HTTP-only, no headless browser — runs in 256–512 MB of memory, fast and cheap.
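
The pacing and backoff behaviour described in the list above can be sketched roughly as follows. This is an illustrative sketch, not the actor's source code: the function names, the jitter strategy, and the assumption that x-ratelimit-reset holds the number of seconds until the window resets are all hypothetical. Only the 1–2 s spacing, the ~2 s starting delay, and the 60 s cap come from the description.

```python
import random


def backoff_delay(attempt: int, base: float = 2.0, cap: float = 60.0) -> float:
    """Delay in seconds before retry number `attempt` (1-based).

    The ceiling doubles with every attempt (2, 4, 8, ...) and is capped at
    `cap`. The jitter strategy is an assumption: a random value in the upper
    half of the ceiling, so concurrent retries do not line up.
    """
    ceiling = min(cap, base * 2 ** (attempt - 1))
    return random.uniform(ceiling / 2, ceiling)


def pacing_delay(headers: dict[str, str],
                 min_spacing: float = 1.0,
                 max_spacing: float = 2.0) -> float:
    """Seconds to wait before the next request.

    Assumes header names are already lower-cased and that
    x-ratelimit-reset is the number of seconds until the window resets.
    """
    delay = random.uniform(min_spacing, max_spacing)  # jittered base spacing
    try:
        remaining = float(headers.get("x-ratelimit-remaining", ""))
        reset = float(headers.get("x-ratelimit-reset", ""))
    except ValueError:
        return delay  # headers missing or malformed: keep the plain spacing
    if remaining <= 0:
        return max(delay, reset)  # budget exhausted: wait out the window
    # Spread the remaining budget evenly over the time left in the window.
    return max(delay, reset / remaining)
```

With a maximum of 5 attempts per request, a caller would sleep for backoff_delay(1) through backoff_delay(4) between attempts, and call pacing_delay with the headers of each successful response before issuing the next request.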

What you can scrape

Mode	Input	What you get
Subreddits	`subreddits` + `sort` / `topPeriod`	Posts from hot / new / top / rising listings — a complete subreddit scraper.
Search	`searchQueries` (+ optional `searchSubreddit`)	Posts matching any keyword across all of Reddit or inside one subreddit — ideal for brand monitoring.
Users	`users` + `userDataType`	Any user's submitted posts, comments, or both.
Start URLs	`startUrls`	Paste any Reddit URL — subreddit, post, user, or search page. The type is auto-detected.
Comments	`includeComments: true`	The full comment tree of every scraped post, flattened into one item per comment — a true Reddit comments scraper.

All modes are combinable in a single run, and maxItems is split fairly across all your targets.
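
As a rough illustration of what a fair split of maxItems could look like, the sketch below divides the budget evenly and hands any remainder to the first targets. The split_budget helper and the target labels are hypothetical. The actor's actual allocation rule is not documented here and may differ, for example by redistributing budget that a small subreddit cannot use.

```python
def split_budget(max_items: int, targets: list[str]) -> dict[str, int]:
    """Divide `max_items` as evenly as possible across `targets`.

    Targets are assumed to be unique labels; duplicates would collapse
    into a single dictionary key.
    """
    if not targets:
        return {}
    base, extra = divmod(max_items, len(targets))
    # The first `extra` targets get one additional item each, so the
    # shares always add up to exactly max_items.
    return {
        target: base + (1 if index < extra else 0)
        for index, target in enumerate(targets)
    }


if __name__ == "__main__":
    print(split_budget(60, ["r/programming", "r/smallbusiness", "search:apify"]))
```

Under this scheme, a run with two subreddits, one search query, and maxItems set to 60 would give each target a budget of 20 items.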

Input

Field	Type	Default	Description
`startUrls`	array	`[]`	Reddit URLs of any type (subreddit / post / user / search) — auto-detected.
`subreddits`	array	`[]`	Subreddit names, with or without `r/`.
`sort`	enum	`new`	`hot`, `new`, `top`, `rising`.
`topPeriod`	enum	`week`	`hour`, `day`, `week`, `month`, `year`, `all` (only for `top`).
`searchQueries`	array	`[]`	Keywords/phrases to search for.
`searchSort`	enum	`relevance`	`relevance`, `hot`, `top`, `new`, `comments`.
`searchSubreddit`	string	`""`	Restrict all searches to one subreddit.
`users`	array	`[]`	Usernames, with or without `u/`.
`userDataType`	enum	`posts`	`posts`, `comments`, or `both`.
`includeComments`	boolean	`false`	Also fetch each post's comment tree.
`maxItems`	integer	`100`	Max posts/user items across all sources combined (budget is split fairly per target).
`maxCommentsPerPost`	integer	`50`	Comment cap per post when `includeComments` is on.
`proxyConfiguration`	object	Apify residential	Proxy settings. Residential strongly recommended.

Example input — monitor two subreddits and a brand keyword, with comments:

{
    "subreddits": ["programming", "smallbusiness"],
    "sort": "new",
    "searchQueries": ["apify"],
    "includeComments": true,
    "maxItems": 60,
    "maxCommentsPerPost": 20
}
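
To make the input conventions concrete, here is a minimal sketch of how names given "with or without r/" and auto-detected startUrls might be normalized. Both helpers (normalize_name, classify_url) and the URL patterns they recognize are assumptions based on Reddit's public URL layout, not the actor's actual detection logic.

```python
from urllib.parse import parse_qs, urlparse


def normalize_name(name: str) -> str:
    """Strip an optional r/ or u/ prefix from a subreddit or user name."""
    name = name.strip().strip("/")
    for prefix in ("r/", "u/"):
        if name.lower().startswith(prefix):
            return name[len(prefix):]
    return name


def classify_url(url: str) -> tuple[str, str]:
    """Return (kind, value) where kind is post, search, user, subreddit,
    or unknown. The value is the post ID, query, or name respectively."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    # /r/<subreddit>/comments/<post id>/...
    if len(parts) >= 4 and parts[0] == "r" and parts[2] == "comments":
        return ("post", parts[3])
    # /search?q=... or /r/<subreddit>/search?q=...
    if parts == ["search"] or (
        len(parts) == 3 and parts[0] == "r" and parts[2] == "search"
    ):
        return ("search", parse_qs(parsed.query).get("q", [""])[0])
    # /user/<name> or /u/<name>
    if len(parts) >= 2 and parts[0] in ("u", "user"):
        return ("user", parts[1])
    # /r/<subreddit>
    if len(parts) >= 2 and parts[0] == "r":
        return ("subreddit", parts[1])
    return ("unknown", url)
```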

Output

One dataset item per post and per comment.

Post (dataType: "post"):

{
    "dataType": "post",
    "id": "1u28egw",
    "subreddit": "programming",
    "url": "https://example.com/article",
    "title": "The new unwritten laws of software engineering",
    "author": "whiskeytown79",
    "selftext": "",
    "score": 1240,
    "upvoteRatio": 0.95,
    "numComments": 312,
    "createdAt": "2026-06-10T17:34:44.000Z",
    "flair": "Discussion",
    "isNsfw": false,
    "mediaUrls": ["https://external-preview.redd.it/..."],
    "permalink": "https://www.reddit.com/r/programming/comments/1u28egw/...",
    "fetchedAt": "2026-06-10T21:12:20.233Z",
    "commentsScraped": 20,
    "moreCommentsSkipped": 12
}

commentsScraped / moreCommentsSkipped appear when includeComments is on — moreCommentsSkipped counts the comments hiding behind Reddit's "load more comments" stubs beyond your per-post cap.

Comment (dataType: "comment") — postId, parentId, and depth let you re-assemble the full thread tree (depth: 0 = top level, where parentId === postId):

{
    "dataType": "comment",
    "id": "oqvw4be",
    "postId": "1u28egw",
    "parentId": "oqvmfr6",
    "depth": 1,
    "subreddit": "programming",
    "author": "vattenpuss",
    "body": "Oh you didn't get the memo? ...",
    "score": 21,
    "createdAt": "2026-06-10T18:02:11.000Z",
    "flair": null,
    "permalink": "https://www.reddit.com/r/programming/comments/1u28egw/.../oqvw4be/",
    "fetchedAt": "2026-06-10T21:12:24.108Z"
}
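
Since comments arrive flattened, a consumer has to rebuild the tree. The sketch below shows one way to do that using only the id and parentId fields from the comment items. The build_thread function and the added replies key are illustrative names, not part of the actor's output.

```python
def build_thread(comments: list[dict]) -> list[dict]:
    """Nest flat comment items into a tree via their parentId links.

    Returns the list of root comments; each node is a copy of the
    original item with an extra "replies" list.
    """
    nodes = {c["id"]: {**c, "replies": []} for c in comments}
    roots = []
    for node in nodes.values():
        parent = nodes.get(node["parentId"])
        if parent is not None:
            parent["replies"].append(node)
        else:
            # Either a top-level comment (parentId equals postId) or a reply
            # whose parent was not scraped, e.g. because of the per-post cap.
            roots.append(node)
    return roots
```

Treating comments with an unscraped parent as roots keeps them visible rather than dropping them; checking depth on such a node tells you whether it was truly top level.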

Error (dataType: "error") — pushed instead of crashing when a single target fails; the run continues and still succeeds:

{
    "dataType": "error",
    "target": "r/some_private_subreddit",
    "error": "Reddit refused access (HTTP 403: private)",
    "fetchedAt": "2026-06-10T21:17:48.974Z"
}
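
The per-target isolation pattern behind these error items can be sketched as a loop that catches failures per target instead of letting them abort the run. The scrape_all function and its scrape_target callback are hypothetical stand-ins; only the shape of the error item follows the example above.

```python
from datetime import datetime, timezone
from typing import Callable, Iterable


def scrape_all(targets: Iterable[str],
               scrape_target: Callable[[str], list[dict]]) -> list[dict]:
    """Scrape every target, turning a failure into an error item."""
    items: list[dict] = []
    for target in targets:
        try:
            items.extend(scrape_target(target))
        except Exception as exc:  # deliberately broad: one target must not end the run
            now = datetime.now(timezone.utc)
            items.append({
                "dataType": "error",
                "target": target,
                "error": str(exc),
                # Millisecond-precision UTC timestamp with a trailing "Z".
                "fetchedAt": now.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
            })
    return items
```

On the consuming side, filtering the dataset on dataType == "error" gives a quick list of targets that need attention, while the post and comment items from every other target remain usable.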

Pricing (pay per event)

You only pay for data you actually receive:

Event	Charged when	Suggested price
`item-scraped`	One post or comment is pushed to the dataset	$0.002

Error items are never charged. Example: 100 posts with 50 comments each = 5,100 items ≈ $10.20; 500 posts without comments ≈ $1.00.
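
The arithmetic behind those examples can be reproduced with a small helper. The estimate_cost function is a hypothetical convenience, and it assumes the suggested $0.002 price per item. It gives an upper bound, since posts with fewer comments than the cap produce fewer billable items.

```python
from decimal import Decimal

PRICE_PER_ITEM = Decimal("0.002")  # suggested price per item-scraped event


def estimate_cost(posts: int, comments_per_post: int = 0) -> Decimal:
    """Upper-bound cost in USD: every post plus every comment is one item."""
    items = posts + posts * comments_per_post
    # Decimal avoids binary floating-point rounding in the result.
    return items * PRICE_PER_ITEM


if __name__ == "__main__":
    print(estimate_cost(100, 50))  # 100 posts + 5,000 comments = 5,100 items
    print(estimate_cost(500))      # 500 posts, no comments
```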

FAQ

Is it legal to scrape Reddit? This actor only collects publicly available data — the same posts and comments anyone can read in a browser without logging in. It collects no private data and accesses no user accounts. As always, consult your own counsel for your specific use case and respect Reddit's User Agreement when republishing content.

Do I need a Reddit account, API key, or cookies? No. The actor uses the same anonymous identity Reddit's own website creates for every logged-out visitor. There is nothing to configure, nothing to expire, and no account that can be banned.

How is this different from other Reddit scrapers? Reliability. The most popular Reddit actors fail a large share of their runs because they treat Reddit's rate limits and bot-wall responses as fatal errors. This actor paces itself using Reddit's own rate-limit headers, retries with exponential backoff, rotates proxy sessions and fallback hosts automatically, and isolates per-target failures — so one bad subreddit or one throttled request never costs you a run.

Why are some comments missing? Reddit returns large threads partially, hiding deeper branches behind "load more comments" stubs. The actor scrapes up to maxCommentsPerPost comments per post and reports how many remained hidden as moreCommentsSkipped on the post item, so you always know what you got.

Can I monitor subreddits or keywords on a schedule? Yes — add the actor to an Apify Schedule (e.g. every hour with sort: "new") and connect a webhook or one of Apify's integrations (Google Sheets, Slack, Make, Zapier) for an always-on social listening pipeline.

What proxies should I use? Apify residential proxies (the default). Reddit blocks most datacenter IP ranges outright; residential sessions combined with the actor's automatic rotation deliver the reliability this actor is built for.

Tips

For monitoring use cases, sort: "new" + a schedule beats hot — you see every post once, as it appears.
Brand monitoring works best with searchQueries + includeComments: true — the sentiment usually lives in the comments.
Use searchSort: "comments" to find the most discussed posts about a topic.
Keep maxItems aligned with your schedule frequency (e.g. hourly runs rarely need more than 100 items per subreddit).
