Pricing

Pay per event

Reddit Subreddit Scraper

Scrape and download posts from any subreddit — hot, new, top, rising, or controversial — export to JSON or CSV. We rotate Firefox/Safari fingerprints, route through residential proxies, and retry on Reddit's 429s. Rows: title, author, score, upvote ratio, comments, NSFW, URL, selftext, posted-at.

Developer

DevilScrapes

Last modified: a month ago


🎯 What this scrapes

Reddit is one of the more aggressively guarded sources on the web: 429 storms, fingerprint checks, IP throttling, and a rate-limited OAuth tier that small teams can't absorb. This Reddit subreddit scraper handles all of that for you. We rotate Firefox and Safari TLS fingerprints (Chrome impersonation gets 403'd from datacenter IPs), switch to a fresh residential proxy session on every block, and walk the subreddits you pick, writing one typed row per post. Self-text is preserved as Markdown, so downstream pipelines don't need to re-parse anything. You bring the subreddit list; we handle the rest.

🔥 What we handle for you

🛡️ Browser fingerprint rotation — curl-cffi impersonates real Firefox / Safari TLS handshakes so the target sees a genuine browser, not a Python script.
🌐 Residential proxy rotation via Apify Proxy — fresh session and exit IP on every block or rate-limit response.
🔁 Retries with exponential backoff on 408 / 429 / 5xx — up to 5 attempts per page, Retry-After header honoured.
🧱 Rate-limit-aware pacing — when Reddit pushes back, we slow down and resume instead of letting the run die.
🧊 Clean, typed dataset rows — Pydantic-validated, ISO-8601 timestamps, stable IDs, JSON / CSV / Excel export straight from the Apify Console.
💰 Pay-Per-Event pricing: a small `actor-start` fee per run plus a per-result fee for each item that lands in your dataset. No subscription, no minimum.
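
The retry behaviour in the list above can be sketched roughly as follows. This is a minimal illustration, not the Actor's actual code: `backoff_delay`, `fetch_with_retries`, the 1-second base delay, and the 60-second cap are assumptions. Only the retryable status codes, the 5-attempt limit, and the Retry-After handling come from the list.

```python
import time
from typing import Callable, Mapping, Optional, Tuple

RETRYABLE = {408, 429} | set(range(500, 600))  # statuses named in the list above
MAX_ATTEMPTS = 5                               # attempts per page, per the list above

# A response is modelled as (status, headers, body) so the sketch stays
# independent of any particular HTTP library.
Response = Tuple[int, Mapping[str, str], str]


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait after failed attempt number `attempt` (1-based)."""
    # Only the delta-seconds form of Retry-After is handled here;
    # the HTTP-date form falls through to plain exponential backoff.
    if retry_after is not None and retry_after.strip().isdigit():
        return min(float(retry_after), cap)
    return min(base * 2 ** (attempt - 1), cap)  # 1s, 2s, 4s, 8s, ...


def fetch_with_retries(fetch: Callable[[], Response],
                       sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `fetch` until it returns a non-retryable status or attempts run out."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        status, headers, body = fetch()
        if status not in RETRYABLE or attempt == MAX_ATTEMPTS:
            return status, headers, body
        sleep(backoff_delay(attempt, headers.get("Retry-After")))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the function easy to exercise in tests without real waiting. A production version would typically also rotate the proxy session between attempts.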

💡 Use cases

Community monitoring — diff r/<your-product> daily and surface hot threads to your Slack or dashboard before they blow up.
Content discovery — pull the top 100 posts from r/MachineLearning weekly to feed a research digest or newsletter.
NLP corpus building — sweep dozens of hobby subreddits for a labelled training dataset; combine with the sibling comment-scraper for full thread context.
Brand mention scanning — schedule hourly runs on brand-adjacent subreddits and pipe results into your alerting stack.
Journalism and OSINT — export public-subreddit posts for investigative research or trend analysis, no Reddit OAuth required.
Sentiment baselines — store post score and comment counts over time to build a sentiment time-series for any topic.
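
For the monitoring and alerting use cases, successive runs overlap, so a consumer usually wants only the posts it has not seen yet. A minimal sketch, assuming rows shaped like the output rows documented on this page (keyed by `id`); `new_posts` is a hypothetical helper, not part of the Actor:

```python
from typing import Iterable, List, Set


def new_posts(seen_ids: Set[str], rows: Iterable[dict]) -> List[dict]:
    """Return rows whose `id` has not been seen before and record them as seen."""
    fresh = []
    for row in rows:
        if row["id"] not in seen_ids:
            seen_ids.add(row["id"])
            fresh.append(row)
    return fresh
```

Persisting `seen_ids` between runs (in a file or key-value store) is left out of the sketch.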

⚙️ How to use it

Click Try for free at the top of the page.
Enter one or more subreddit names (without the r/ prefix), pick a listing mode, and set a result cap.
Click Start. Output streams into the run's dataset in real time.
Export from Storage → Dataset as JSON, CSV, or Excel — or pull via the Apify API.
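
Pulling results via the Apify API can look roughly like the sketch below, which uses only the Python standard library against the dataset items endpoint. The helper names are hypothetical, and the dataset ID and API token are values you supply from your own run and account.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, fmt: str = "json", limit: int = 0) -> str:
    """Build the URL of the dataset items endpoint for one dataset."""
    params = {"format": fmt, "clean": "true"}
    if limit > 0:
        params["limit"] = str(limit)
    return f"{API_BASE}/datasets/{dataset_id}/items?{urllib.parse.urlencode(params)}"


def fetch_items(dataset_id: str, token: str) -> list:
    """Download the items of a run's dataset as a list of dicts."""
    request = urllib.request.Request(
        dataset_items_url(dataset_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Changing `fmt` to `csv` returns the same data as CSV text, in which case the body should be read as text rather than parsed as JSON.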

📥 Input

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| `subreddits` | `array` | yes | `["programming"]` | List of subreddit names (without the `r/` prefix). One per line. Multiple subreddits are scraped in sequence. |
| `mode` | `string` | no | `"hot"` | Which Reddit listing to read: `hot`, `new`, `top`, `rising`, or `controversial`. |
| `timeFilter` | `string` | no | `"day"` | Applies only to `top` and `controversial` modes: `hour`, `day`, `week`, `month`, `year`, or `all`. |
| `maxResults` | `integer` | no | `100` | Maximum posts to return per subreddit. Reddit caps any listing at 1 000 items. Set `0` to scrape until exhausted (still capped at 1 000). |
| `includeSelftext` | `boolean` | no | `true` | When `true`, the post body (Markdown) is included for text posts. |
| `proxyConfiguration` | `object` | no | `{"useApifyProxy": true}` | Apify Proxy config. Reddit rate-limits hot IPs aggressively; residential proxy rotation is the first mitigation we apply on blocks. |

Example input

{
  "subreddits": [
    "programming"
  ],
  "mode": "hot",
  "maxResults": 3,
  "includeSelftext": false,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
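
Before starting a run, it can help to check an input object against the documented fields client-side. The sketch below is a hypothetical pre-flight helper (`normalise_input` is not part of the Actor); the allowed values and defaults are taken from the input table, while stripping a leading `r/` and resolving `0` to the 1 000 cap are conveniences assumed here.

```python
MODES = {"hot", "new", "top", "rising", "controversial"}
TIME_FILTERS = {"hour", "day", "week", "month", "year", "all"}
LISTING_CAP = 1000  # Reddit's per-listing limit, as stated in the input table


def normalise_input(raw: dict) -> dict:
    """Apply the documented defaults and reject values the table does not allow."""
    names = []
    for name in raw.get("subreddits", []):
        name = name.strip()
        if name.lower().startswith("r/"):
            name = name[2:]  # the field expects names without the prefix
        if name:
            names.append(name)
    if not names:
        raise ValueError("subreddits is required and must not be empty")

    mode = raw.get("mode", "hot")
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    time_filter = raw.get("timeFilter", "day")
    if time_filter not in TIME_FILTERS:
        raise ValueError(f"unknown timeFilter: {time_filter!r}")

    max_results = int(raw.get("maxResults", 100))
    if max_results < 0:
        raise ValueError("maxResults must be 0 or a positive integer")
    # 0 means "until exhausted", which is still capped at 1 000 per listing.
    effective = LISTING_CAP if max_results == 0 else min(max_results, LISTING_CAP)

    return {
        "subreddits": names,
        "mode": mode,
        "timeFilter": time_filter,
        "maxResults": effective,
        "includeSelftext": bool(raw.get("includeSelftext", True)),
        "proxyConfiguration": raw.get("proxyConfiguration", {"useApifyProxy": True}),
    }
```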

📤 Output

Each run produces a subreddit posts dataset — one item per post, schema-validated and ready to query.

| Field | Type | Notes |
| --- | --- | --- |
| `id` | `string` | Reddit post fullname (e.g. `t3_abc123`). |
| `post_id` | `string` | Short id (the `abc123` part). |
| `subreddit` | `string` | Subreddit name (without `r/`). |
| `title` | `string` | Post title. |
| `author` | `string` | Author username, or `[deleted]`. |
| `url` | `string` | Outbound URL (or the Reddit permalink for self-posts). |
| `permalink` | `string` | Canonical Reddit permalink (`https://reddit.com/r/.../comments/...`). |
| `selftext` | `string \| null` | Self-post body (Markdown). Null when `includeSelftext` is false or the post is a link post. |
| `score` | `integer` | Upvotes minus downvotes. |
| `upvote_ratio` | `number` | Approximate ratio (0.0 – 1.0). |
| `num_comments` | `integer` | Comment count. |
| `over_18` | `boolean` | NSFW flag. |
| `spoiler` | `boolean` | Marked as spoiler. |
| `stickied` | `boolean` | Pinned by mods. |
| `locked` | `boolean` | Comments locked. |
| `post_hint` | `string \| null` | Reddit's content classification (`image`, `link`, `video`, `self`, etc.). |
| `flair` | `string \| null` | Post flair text. |
| `created_utc` | `integer` | Unix timestamp (seconds) of when the post was created. |
| `posted_at` | `string` | ISO-8601 UTC timestamp derived from `created_utc`. |
| `scraped_at` | `string` | When this row was recorded by the Actor. |

Example output

{
  "id": "t3_1ab2c3d",
  "post_id": "1ab2c3d",
  "subreddit": "programming",
  "title": "An honest critique of the new Rust runtime",
  "author": "u_rustacean",
  "url": "https://example.com/blog/rust-runtime",
  "permalink": "https://reddit.com/r/programming/comments/1ab2c3d/an_honest_critique",
  "selftext": null,
  "score": 1283,
  "upvote_ratio": 0.94,
  "num_comments": 312,
  "over_18": false,
  "spoiler": false,
  "stickied": false,
  "locked": false,
  "post_hint": "link",
  "flair": null,
  "created_utc": 1778875200,
  "posted_at": "2026-05-15T20:00:00+00:00",
  "scraped_at": "2026-05-15T20:05:12+00:00"
}
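
Because `posted_at` is derived from `created_utc`, a consumer can recompute one from the other as a sanity check. A minimal sketch, assuming the Actor formats timestamps with an explicit `+00:00` offset as in the example row; both helper names are hypothetical:

```python
from datetime import datetime, timezone


def posted_at_from(created_utc: int) -> str:
    """ISO-8601 UTC string for a Unix timestamp given in seconds."""
    return datetime.fromtimestamp(created_utc, tz=timezone.utc).isoformat()


def is_consistent(row: dict) -> bool:
    """True when `posted_at` matches the value derived from `created_utc`."""
    return posted_at_from(row["created_utc"]) == row["posted_at"]


print(posted_at_from(1778875200))  # 2026-05-15T20:00:00+00:00
```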

💰 Pricing

Pay-Per-Event — you pay only when these events fire:

| Event | USD | What it is |
| --- | --- | --- |
| `actor-start` | $0.005 | One-off warm-up charge per run |
| `result` | $0.001 | Per dataset item written |

Example: a single run that returns 1 000 results costs $0.005 + 1 000 × $0.001 = $1.005, or roughly $1.00. No subscription, no minimum, no card required to start; Apify gives every new account $5 of free credit.
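
The same arithmetic can be expressed as a small estimator, which is handy when budgeting scheduled runs. The two rates come from the pricing table; `estimate_cost` is a hypothetical helper, and `Decimal` is used so that cent fractions do not pick up floating-point noise.

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.005")  # charged once per run
RESULT_USD = Decimal("0.001")       # charged per dataset item written


def estimate_cost(results: int, runs: int = 1) -> Decimal:
    """Estimated USD cost for `results` items spread over `runs` runs."""
    return ACTOR_START_USD * runs + RESULT_USD * results


print(estimate_cost(1000))          # 1.005
print(estimate_cost(2400, runs=24)) # 2.520
```

The second call models hourly runs of 100 results each over one day: 24 start fees plus 2 400 results.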

🚧 Limitations

Reddit's JSON endpoint caps any listing at 1 000 items, including pagination. Deeper post history needs an archive source. Pushshift was long the standard one, but public access to it has been restricted since 2023, and it would in any case be a separate Actor (request one if needed).
This Actor scrapes public subreddits only — private or quarantined subreddits are not accessible without credentials, which we do not support.
Comments are a separate Actor — Reddit threads fan out arbitrarily deep. See the reddit-post-comments-scraper sibling for full comment trees.
Real-time firehose coverage is not possible with public listing endpoints. For near-real-time monitoring, schedule the Actor at 5-15 minute intervals.
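
The 1 000-item cap interacts with pagination as sketched below: pages are followed through an `after` cursor until the cursor runs out or the cap is reached. This is an illustration under assumptions, not the Actor's code; `fetch_page` is a hypothetical callable that returns one page of posts together with the next cursor.

```python
from typing import Callable, List, Optional, Tuple

LISTING_CAP = 1000  # per-listing limit described in the limitations above

# fetch_page(after) -> (posts, next_after); next_after is None on the last page.
FetchPage = Callable[[Optional[str]], Tuple[List[dict], Optional[str]]]


def walk_listing(fetch_page: FetchPage, max_results: int = LISTING_CAP) -> List[dict]:
    """Collect up to `max_results` posts (0 means "until exhausted")."""
    limit = min(max_results, LISTING_CAP) if max_results > 0 else LISTING_CAP
    posts: List[dict] = []
    after: Optional[str] = None
    while len(posts) < limit:
        page, after = fetch_page(after)
        posts.extend(page)
        if not page or after is None:
            break  # listing exhausted before the limit was reached
    return posts[:limit]
```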

❓ FAQ

Is this against Reddit's TOS?

Public read-only access via the .json listing endpoints has been the community standard for over a decade. We do not impersonate users, vote, comment, or access any private data. For commercial-scale crawling, Reddit prefers their Data API — sign up at reddit.com/dev/api. We recommend reviewing Reddit's User Agreement for your specific use case.

Is this a Reddit API alternative for 2026?

Yes. This Actor is frequently used as a Reddit API alternative when teams need structured subreddit data without managing Reddit OAuth, handling rate-limit responses, or going through the approval process attached to the OAuth tier. You get the same public post data with none of the credential plumbing.

How do I scrape Reddit without an API key?

You need no Reddit developer account or OAuth credentials to use this Actor, just an Apify account (the free tier is enough). The Actor accesses the same public listing data any logged-out browser can read, with fingerprint rotation and residential proxy rotation applied to reduce the chance of blocks from the first run.

Why am I seeing incomplete results?

Reddit actively rate-limits high-frequency requests. We apply exponential backoff and rotate proxy sessions on every block, but very aggressive runs (a large maxResults across many subreddits in quick succession) can still return partial results. Lower the result cap per run and schedule several smaller runs for more reliable coverage.
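
Splitting one large job into several smaller runs can be done mechanically. A minimal sketch, assuming you start each yielded input as its own run; `batched_inputs` and its default batch size of 5 are hypothetical choices, not recommendations from the Actor:

```python
from typing import Iterator, List


def batched_inputs(subreddits: List[str], per_run: int = 5,
                   max_results: int = 100) -> Iterator[dict]:
    """Yield one Actor input per batch of `per_run` subreddits."""
    if per_run < 1:
        raise ValueError("per_run must be at least 1")
    for start in range(0, len(subreddits), per_run):
        yield {
            "subreddits": subreddits[start:start + per_run],
            "maxResults": max_results,
        }
```

Each extra run adds one `actor-start` charge, so smaller batches trade a slightly higher cost for more reliable coverage.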

How do I get top posts for a specific time range?

Set mode to top and use the timeFilter input to choose hour, day, week, month, year, or all. This maps directly to Reddit's own top-posts ranking for the chosen window.
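
To make the mapping concrete, the sketch below builds the public listing URL that corresponds to a given mode and time filter. That the Actor requests exactly this URL shape is an assumption; `listing_url` is a hypothetical helper, while `t` and `limit` are Reddit's own listing query parameters.

```python
from urllib.parse import quote, urlencode

TIMED_MODES = {"top", "controversial"}  # the only modes where timeFilter applies


def listing_url(subreddit: str, mode: str = "hot",
                time_filter: str = "day", limit: int = 100) -> str:
    """Public .json listing URL for one subreddit, mode, and time window."""
    params = {"limit": min(limit, 100)}  # Reddit serves at most 100 posts per page
    if mode in TIMED_MODES:
        params["t"] = time_filter
    return f"https://www.reddit.com/r/{quote(subreddit)}/{mode}.json?{urlencode(params)}"


print(listing_url("MachineLearning", "top", "week"))
# https://www.reddit.com/r/MachineLearning/top.json?limit=100&t=week
```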

Do you scrape comments?

Not in this Actor — comment trees fan out arbitrarily and require different pagination logic. See the reddit-post-comments-scraper sibling Actor for full thread extraction.

Why does the URL match the permalink for some posts?

For self-posts (text posts), Reddit's own url field points back to the post itself rather than an outbound link. This is Reddit's data, not a scraping artefact.

Can I export as CSV or Excel?

Yes. Once the run completes, go to Storage → Dataset in the Apify Console and use the Export button to download JSON, CSV, JSONL, XML, or Excel. You can also fetch the dataset programmatically via the Apify API.
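
If you fetch JSON items programmatically and want CSV locally with only selected columns, the standard library is enough. A minimal sketch, assuming rows shaped like the output rows documented on this page; `rows_to_csv` is a hypothetical helper:

```python
import csv
import io
from typing import Iterable, Sequence


def rows_to_csv(rows: Iterable[dict], fields: Sequence[str]) -> str:
    """Serialise the chosen fields of each row as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Missing keys and null values both become empty cells.
        writer.writerow({name: row.get(name, "") for name in fields})
    return buffer.getvalue()
```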

💬 Your feedback

Spotted a bug, hit an unusual edge case, or need an extra field? Open an issue on the Actor's Issues tab in the Apify Console — we ship fixes weekly and read every report.

Reddit User Scraper

devilscrapes/reddit-user-scraper

Scrape any Reddit user's submissions or comments by username — no login — and export to JSON or CSV. A Reddit user downloader returning title, body, subreddit, score, upvotes, timestamp, and permalink per post. We rotate fingerprints and retry on Reddit's 429s.

DevilScrapes

Reddit Scraper — Posts & Comments by Subreddit or Search

hichemdev/reddit-scraper

Scrape Reddit posts and comments from any subreddit or search query: title, author, score, upvote ratio, text, and metadata. No login or API key.

Hichem Ben Moussa

Reddit Subreddit Posts Scraper

xtracto/reddit-subreddit-posts-scraper

Get posts from any public subreddit by sort (hot/new/top/rising/controversial) and time filter. Bulk-paginated.

Farhan Febrian Nauval

Reddit Subreddit Posts Scraper. No Login

seemuapps/reddit-subreddit-posts-scraper

Scrape posts from any public subreddit. Title, author, score, comment count, body text, link, flair, and timestamp. Filter by hot, new, top, rising, or controversial. No login.

Andrew

Reddit Posts Scraper: Full Text, Scores & Awards

scrapers_lat/reddit-posts-scraper

Scrape Reddit posts from any subreddit, search query or post URL. Extract title, selftext, author, score, upvote ratio, comment count, awards, flair and gallery images. Export to JSON, CSV or Excel. No login or API key.

Scrapers Lat

Reddit Subreddit Scraper – Export Posts from Any Subreddit

endspec/reddit-instant-subreddit-scraper

Scrape the newest, hot, top or rising posts from any public subreddit. Export post title, body text, author, score, comment count, timestamp, permalink and URL to JSON, CSV or Excel. No login, no Reddit API key. Pay only per post returned.

EndSpec

Reddit Subreddit Scraper

myagizm/reddit-subreddit-scraper

Scrape posts from any subreddit as structured JSON — new/hot/top/rising, with text and media. No login, no API key.

MYM

Reddit Posts Scraper

scrapesmith/reddit-posts-scraper

Scrape posts from any subreddit on Reddit. Get titles, body text, images, videos, upvotes, comment counts, flair, author info, crossposts, and thumbnails. Sort by hot, new, top, rising, or controversial. Filter by date and NSFW. Deep scrape all sorts at once. No login or cookies needed.

Scrape Smith

Reddit Subreddit Posts Scraper | Fast Bulk Feed Export

clearpath/reddit-subreddit-posts-scraper

Scrape 1,000+ posts per subreddit in seconds. Hot, new, top, rising, controversial feeds with time filters. Optional comment fetching (up to 1,000 per post). Bulk subreddit input via names, URLs, or CSV.

ClearPath

Reddit Posts & Comments Scraper

scrapers_lat/reddit-scraper

Scrape Reddit posts and comments from any subreddit or search query. Extract title, author, score, number of comments, subreddit, body text, flair, permalink and timestamps. Export to JSON, CSV or Excel.

Scrapers Lat
