Reddit Scraper V1 — Subreddit Feeds, Posts, Comments (4)
Scrape Reddit posts and comments by URL or subreddit name. No Reddit account or OAuth required.
Pricing: from $1.99 / 1,000 results
Developer: Red Crawler
Reddit Scraper — Subreddit Feeds, Posts, Comments
Scrape Reddit posts and comments by URL or subreddit name. Four self-contained endpoints — pull a subreddit's feed, a single post's full payload, a post's full comment tree, or a single comment's metadata. No Reddit account, OAuth, or proxy required.
Pick the endpoint, fill the matching section, hit Start.
Endpoints at a glance
| # | Endpoint | Records returned | Best for |
|---|---|---|---|
| 1 | Scrape Posts | up to 1000 posts (subreddit feed) | niche monitoring, daily snapshots, RSS-style feeds |
| 2 | Post Detail | 1 record (one post) | refreshing a single post, importing a thread |
| 3 | Scrape Comments | up to 5000 (or uncapped) | sentiment, archives, megathread research |
| 4 | Comment Detail | 1 record (one comment) | quoting, refreshing one comment |
Every endpoint accepts URLs, prefixed fullnames, or raw IDs:
| Entity | Examples |
|---|---|
| post | https://www.reddit.com/r/Wordpress/comments/1s4a4j6/ · t3_1s4a4j6 · 1s4a4j6 |
| comment | https://www.reddit.com/r/.../comment/lwbnv0t/ · t1_lwbnv0t · lwbnv0t |
| subreddit | AskReddit · r/AskReddit · /r/AskReddit · full subreddit URL |
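Client-side, the accepted identifier forms can be reduced to bare IDs or names before building the actor input. A minimal sketch, the helper name `normalize_reddit_id` is ours and not part of the actor:

```python
import re

def normalize_reddit_id(value: str) -> str:
    """Reduce any accepted form (URL, t3_/t1_ fullname, raw ID,
    subreddit name with or without r/) to the bare ID or name."""
    value = value.strip().rstrip("/")
    ids = re.findall(r"/comments?/(\w+)", value)
    if ids:
        return ids[-1]                 # comment URLs: the last segment wins
    if value.startswith(("t3_", "t1_")):
        return value.split("_", 1)[1]  # strip the fullname prefix
    m = re.search(r"(?:^|/)r/(\w+)", value)
    if m:
        return m.group(1)              # r/Name, /r/Name, full subreddit URL
    return value                       # already a raw ID or bare name

print(normalize_reddit_id("t3_1s4a4j6"))  # -> 1s4a4j6
```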
What you can fetch
1. Scrape Posts — subreddit feed
Pulls a subreddit's post feed and streams pages so records appear in the dataset within seconds. Pages are fetched in batches of 100 and stitched together up to your limit.
Inputs
| Field | Type | Default | Notes |
|---|---|---|---|
| subreddit | string | AskReddit | Subreddit name (without r/). |
| sort | enum | hot | best / hot / new / top / rising / controversial. |
| time | enum | (none) | Only used when sort is top / controversial. hour … all. |
| limit | int | 25 | 1 – 1000. |
Returns per post — Reddit ID, fullname, title, body / selftext, author, subreddit, score, ups / downs / upvote ratio, comment count, crosspost count, created + edited timestamps, permalink, external URL, domain, post-type flags (is_self, is_video, over_18, spoiler, locked, stickied, pinned, archived), distinguished status, removal category, link & author flair, thumbnail, media (images / video / gallery), awards.
Use it when — niche monitoring, daily community snapshots, content syndication (r/programming hot → RSS), bulk research, competitor watching.
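As a sketch, a Scrape Posts input built from the fields above might look like this in the actor's JSON input editor (the key that selects the endpoint in the "What to fetch" dropdown is not documented here, so it is omitted):

```json
{
  "subreddit": "programming",
  "sort": "top",
  "time": "week",
  "limit": 200
}
```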
2. Post Detail
Full payload of a single post.
Inputs
| Field | Type | Notes |
|---|---|---|
| post | string | URL, t3_ fullname, or raw ID. |
Returns — same rich post record as Scrape Posts.
Use it when — single-post deep dive, refreshing one record after an edit, importing a single thread.
3. Scrape Comments — post comment tree
Comments under a single post, with control over how the tree is traversed.
Inputs
| Field | Type | Default | Notes |
|---|---|---|---|
| post | string | (required) | URL, t3_, or raw ID. |
| sort | enum | top | best / top / new / controversial / old / qa. |
| mode | enum | custom | custom (capped), top_level (top-level only), all (uncapped). |
| count | int | 100 | 1 – 5000. Used by custom mode. |
Returns per comment — ID, fullname, parent post / parent comment IDs, author, body (markdown + HTML), score / ups / downs / controversiality, created + edited timestamps, permalink, OP flag (is_submitter), depth, stickied / distinguished / locked / archived / score-hidden flags, subreddit, awards.
Use it when — sentiment analysis, comment archives, support-ticket mining, debate / megathread research, training data.
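A sketch of a Scrape Comments input using the fields above (again omitting the undocumented endpoint-selector key):

```json
{
  "post": "t3_1s4a4j6",
  "sort": "top",
  "mode": "custom",
  "count": 500
}
```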
4. Comment Detail
Full payload of a single comment.
Inputs
| Field | Type | Notes |
|---|---|---|
| comment | string | URL, t1_ fullname, or raw ID. |
Returns — same rich comment record as Scrape Comments.
Use it when — pulling a quoted comment, refreshing one record after edits, citation tooling.
How to run
- Pick an endpoint in the "What to fetch" dropdown.
- Open the matching section and fill its fields. Each section is independent — fields outside your chosen section are ignored.
- Click Start.
The default subreddit is AskReddit and the default test post is a public r/Wordpress thread, so the actor runs out of the box.
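For programmatic runs, the same input can be posted to Apify's generic run-sync-get-dataset-items endpoint, which starts a run and returns the dataset items in the response. A minimal sketch; the actor ID below is a placeholder, substitute the real one from the actor's API tab:

```python
import json
from urllib.parse import urlencode

ACTOR_ID = "red.crawler~reddit-scraper"  # placeholder, not the real actor ID

def build_run_request(run_input: dict, token: str) -> tuple:
    """Build (url, body) for Apify's run-sync-get-dataset-items endpoint,
    which starts the actor and returns dataset items in the response."""
    url = ("https://api.apify.com/v2/acts/" + ACTOR_ID
           + "/run-sync-get-dataset-items?" + urlencode({"token": token}))
    return url, json.dumps(run_input).encode()

url, body = build_run_request(
    {"subreddit": "AskReddit", "sort": "hot", "limit": 25}, "MY_TOKEN"
)
print(url.split("?")[0])
```

POST the body to the URL with a JSON content type; the response is the run's dataset in the format you request.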
Output
Results are pushed to the actor's default dataset. View as a table or download as JSON / CSV / Excel / XML.
| Endpoint | Rows pushed |
|---|---|
| Scrape Posts | up to limit posts |
| Post Detail | 1 record |
| Scrape Comments | up to count (or uncapped if mode=all) |
| Comment Detail | 1 record |
Every record carries an endpoint field. Most useful columns (id, title, score, created, etc.) are placed first so the Table view is readable without scrolling.
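Downstream, the endpoint field makes it easy to split a mixed export. A sketch over illustrative records; the endpoint values shown are assumptions, check your own dataset for the exact strings:

```python
from collections import defaultdict

# Illustrative records of the kind a dataset export contains.
records = [
    {"endpoint": "scrape_posts", "id": "1s4a4j6", "title": "Release thread", "score": 421},
    {"endpoint": "scrape_comments", "id": "lwbnv0t", "body": "Nice writeup", "score": 37},
]

by_endpoint = defaultdict(list)
for rec in records:
    by_endpoint[rec["endpoint"]].append(rec)  # bucket rows by source endpoint

print(sorted(by_endpoint))
```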
Status & error reference
Run status (Apify-side, shown on the run page)
| Apify UI cue | Status | Apify message | Meaning | What to do |
|---|---|---|---|---|
| green check | SUCCEEDED | "Actor succeeded with N results in the dataset" | Run finished. Some or zero results pushed. | Open the dataset. |
| red exclamation | FAILED | "The Actor process failed…" | Validation error or upstream Reddit fault. | Check the run log. You are NOT charged. |
| red clock | TIMED-OUT | "The Actor timed out…" | Run exceeded its timeout. | Re-run; consider lowering limit or using mode=custom. |
| red square outline | ABORTED | "The Actor process was aborted…" | You stopped the run manually. | No charge for unpushed results. |
Common in-run conditions (visible in run log)
| Condition | Cause | Result |
|---|---|---|
| Empty result set | Subreddit empty / banned / private. | Run SUCCEEDED, 0 records, no charge. |
| Subreddit feed cap | Asked for more than ~1000 posts. | Run SUCCEEDED, capped at Reddit's pagination limit. |
| Removed post stub | Post was removed; metadata still partial. | Run SUCCEEDED, returns stub with removed_by_category. |
| qa sort fallback | qa sort outside QA-mode subs. | Run SUCCEEDED, falls back to top. |
| Validation error: post required | Missing post field on Detail / Comments. | Run FAILED immediately, no charge. |
Common edge cases
- Removed / deleted posts return whatever metadata Reddit still exposes — often a stub with removed_by_category.
- Private / quarantined subreddits return zero records.
- Subreddit feed cap — Reddit caps subreddit feed pagination at ~1000 unique posts. A higher limit won't return more.
- Comments all mode is uncapped — long threads (10k+ comments) hit Reddit's tree size limit before our cap.
- Comment qa sort — only meaningful in QA-mode subreddits; falls back to top elsewhere.
- NSFW content — fully supported; the over_18 flag tells you if a post is age-gated.
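A consumer can screen out removed-post stubs by checking removed_by_category. A minimal sketch with illustrative records; is_removed_stub is our helper name:

```python
def is_removed_stub(post: dict) -> bool:
    """A removed/deleted post comes back as a partial record whose
    removed_by_category field is set (per the edge cases above)."""
    return bool(post.get("removed_by_category"))

posts = [
    {"id": "aaa111", "title": "Live post", "removed_by_category": None},
    {"id": "bbb222", "title": "[removed]", "removed_by_category": "moderator"},
]
live = [p for p in posts if not is_removed_stub(p)]
print([p["id"] for p in live])  # -> ['aaa111']
```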
Why this actor is fast
- Speed — 1–3 seconds per call, end-to-end. Pure HTTP to Reddit's API. No browser to boot, no Playwright / Selenium / Puppeteer overhead. Competing browser-based scrapers typically take 15–60 seconds per call.
- Reliability — zero browser flakiness. No headless-Chromium crashes. No JS-render timeouts. No captcha pages. No surprise mid-run failures from a browser quirk.
- Footprint — under 100 MB RAM per run. Most browser-based scrapers need 1–4 GB. We're a thin async dispatcher — Reddit auth, proxy rotation, retry, and GraphQL handling all happen off-actor on our backend.
Pricing
Pay-per-result. You're only charged for records actually pushed to the dataset.
| Outcome | Charged? |
|---|---|
| SUCCEEDED with results | Yes — per record pushed. |
| SUCCEEDED with zero records | No. |
| FAILED (validation / upstream) | No. |
| ABORTED | Only for records already pushed before you stopped. |
See the actor's Pricing tab for the current per-result rate.
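At the listed starting rate, cost scales linearly with records pushed. A quick sketch; RATE_PER_1000 hard-codes the listing's starting price, and the Pricing tab has the current rate:

```python
from decimal import Decimal

RATE_PER_1000 = Decimal("1.99")  # starting rate from this listing

def estimated_cost(results_pushed: int) -> Decimal:
    """Pay-per-result: only records actually pushed to the dataset are billed."""
    return results_pushed * RATE_PER_1000 / 1000

print(estimated_cost(2500))  # 2500 records at $1.99/1000 -> 4.975
```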