🚀 Reddit Scraper — every post, comment & thread, as clean JSON
Pull structured Reddit data at speed — posts, comments, scores, flairs, awards, media, timestamps. No login. No code. No babysitting.
🏠 Subreddits · 🔍 Keyword search · 👤 User submissions/comments · 🔗 Custom URLs — all four sources, one input form.
⚡️ Why this scraper
- 🎯 50+ fields per post — full title and body, score breakdown, upvote ratio, flair, awards, removal status, media URLs, edit timestamps. Nothing dropped on the floor.
- 💬 Comment threads on demand — flip one switch and get the full comment tree per post, threaded via `parent_id` and `depth` (see the sketch after this list).
- 🚄 Fast — ~3 posts/second steady-state on default settings; ~250 ms median per detail fetch.
- 🧠 Smart pagination — stops the moment your `Max items` budget is reached. Never over-fetches, never wastes Apify Compute Units.
- 🔁 Incremental mode — pass a `since` timestamp and only get posts newer than your last run. Perfect for daily monitoring jobs.
- 🛡️ Built-in failure budget — if Reddit starts pushing back (challenges, hard 4xx), the actor aborts cleanly instead of burning through your CU on a broken extractor.
- 📊 Three export formats out of the box — JSON, CSV, Excel. Direct download links from the run page.
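The `comments` array arrives flat in breadth-first order, so re-threading it takes only a few lines of plain Python. A minimal sketch using the `id` and `parent_id` fields shown in the sample output below, plus the fact that BFS order lists parents before children:

```python
def build_comment_tree(post):
    """Nest a post's flat `comments` list into reply trees.

    Assumes BFS ordering (parents before children), which is how
    this actor emits comments.
    """
    by_fullname = {}  # "t1_<id>" -> comment node
    roots = []        # top-level comments (their parent is the post itself)
    for c in post.get("comments", []):
        node = {**c, "replies": []}
        by_fullname["t1_" + c["id"]] = node
        parent = by_fullname.get(c["parent_id"])
        if parent is None:
            roots.append(node)  # parent_id is "t3_..." (the post), not a comment
        else:
            parent["replies"].append(node)
    return roots
```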
🚀 Quick start
- Click Try for free (top-right). No code, no API key.
- Pick a search type — Subreddit, Search, User, or paste your own URLs.
- Hit Start and let it run.
- Download as JSON / CSV / Excel from the run page.
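If you'd rather skip the UI, the same run works through the official `apify-client` Python package. A minimal sketch — the actor ID below is a placeholder; copy the real one from this page:

```python
from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")

# Placeholder actor ID -- replace with the ID shown on this page.
run = client.actor("always-prime/reddit-scraper").call(run_input={
    "searchType": "subreddit",
    "subreddits": ["python"],
    "sortBy": "new",
    "maxItems": 50,
    "scrapeComments": False,
})

# Stream the results straight from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["title"], item["score"])
```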
📥 Input
| Field | Type | Description |
|---|---|---|
| What to scrape (`searchType`) | enum | `subreddit` · `search` · `user` · `urls` |
| Subreddits (`subreddits`) | string list | e.g. `python`, `programming` (no `r/` prefix) |
| Search query (`query`) | string | Keywords. Reddit operators work: `author:`, `subreddit:`, `self:yes`, `flair:`. |
| Users (`users`) | string list | Usernames to scrape (no `u/` prefix) |
| User content type (`userContent`) | enum | `submitted` (posts) or `comments` |
| Sort by (`sortBy`) | enum | `hot` · `new` · `top` · `rising` · `controversial` · `relevance` · `comments` |
| Time window (`time`) | enum | `hour` · `day` · `week` · `month` · `year` · `all` (only matters for `top`/`controversial`) |
| Max items (`maxItems`) | int | Stop after N posts. `0` = unlimited. Default `50`. |
| Scrape comments (`scrapeComments`) | bool | Fetch the comment tree for every post. Default off (cheaper for indexing). |
| Max comments per post (`commentDepth`) | int | Cap on comments per post (BFS). Default `200`. |
| Only posts newer than (`since`) | datetime | ISO 8601 cutoff for incremental runs. |
| Concurrency (`concurrency`) | int | Parallel fetches. Default `5`, max `25`. |
| Start URLs (`startUrls`) | string list | Advanced override — paste any reddit.com URLs and ignore the search-type builder. |
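For reference, here is a complete input object for a keyword search scoped to two subreddits — every value below is illustrative, not a default:

```json
{
  "searchType": "search",
  "query": "pycon",
  "subreddits": ["python", "programming"],
  "sortBy": "top",
  "time": "month",
  "maxItems": 200,
  "scrapeComments": true,
  "commentDepth": 500,
  "since": "2026-05-01T00:00:00Z",
  "concurrency": 10
}
```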
📦 Sample output
One record per post — flat, JSON-friendly, ready to load into BigQuery / Postgres / pandas.
{"id": "1t3x7ba","fullname": "t3_1t3x7ba","url": "https://www.reddit.com/r/Python/comments/1t3x7ba/whos_going_to_pycon_us_next_week/","subreddit": "Python","subreddit_prefixed": "r/Python","subreddit_id": "t5_2qh0y","title": "Who's going to PyCon US next week?","selftext": "Me ✋ I hope to see a good number of you all in Long Beach, too! ...","is_self": true,"domain": "self.Python","post_hint": "self","link_url": null,"author": "Loren-PSF","author_fullname": "t2_so0s40st","author_flair_text": ":pythonLogo: Python Software Foundation Staff","distinguished": null,"score": 46,"ups": 46,"upvote_ratio": 0.91,"num_comments": 35,"num_crossposts": 0,"total_awards_received": 0,"gilded": 0,"over_18": false,"spoiler": false,"locked": false,"stickied": true,"archived": false,"is_video": false,"is_original_content": false,"link_flair_text": "Discussion","link_flair_css_class": "discussion","link_flair_background_color": "#f50057","thumbnail": null,"preview_image_url": "https://external-preview.redd.it/FBtD3iI-OdRHdmfJbVushiwzLeMcmgTx-Ff3FnwUUg0.jpeg","video_url": null,"removed_by_category": null,"removal_reason": null,"created_at": "2026-05-04T22:40:29+00:00","edited_at": null,"scraped_at": "2026-05-09T13:43:47+00:00","comments": [{"id": "myz2pn1","parent_id": "t3_1t3x7ba","depth": 0,"author": "vintagegeek","body": "I'll be there with bells on. Looking forward to meeting people!","score": 19,"is_submitter": false,"stickied": false,"permalink": "https://www.reddit.com/r/Python/comments/1t3x7ba/.../myz2pn1/","created_at": "2026-05-04T23:01:14+00:00","edited_at": null}],"comments_count_scraped": 35}
💡 Use cases
| Who | What for |
|---|---|
| 📈 Market researchers | Track sentiment, competitor mentions and product feedback across niche subreddits. |
| 🤖 AI / ML teams | Build training corpora from focused subreddits — clean text, threading preserved. |
| 📰 Journalists & analysts | Monitor breaking-story subreddits and surface trending discussions for coverage. |
| 💼 Brand / community managers | Find unanswered support questions about your product across Reddit, on a daily cron. |
| 🏷️ Recruiters & talent intel | Pull discussions in tech-job subreddits to track skill demand and salary chatter. |
| 🧑‍🔬 Academic researchers | Public-discourse datasets for sociolinguistics, network analysis, opinion mining. |
🧰 Tips & tricks
- 🪶 Index-first, hydrate later. Run with `scrapeComments: false` and `maxItems: 0` to cheaply enumerate everything, then do a second run with `startUrls` and `scrapeComments: true` on only the posts you care about.
- ⏱️ Daily diffs. Save the timestamp of your last successful run, then pass it as `since` next time. The actor short-circuits old posts before fetching them (see the sketch after this list).
- 🎛️ Subreddit-scoped search. Set `searchType: search`, fill `query`, and add subreddits to `subreddits` — the actor automatically scopes the search to those subreddits.
- 🔗 Mix custom URLs. Drop any `reddit.com/...` URL into `startUrls` (a thread, a multireddit, a sort variant) — the actor strips/appends `.json` itself.
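Here is the daily-diff pattern from the list above as a minimal sketch with `apify-client`; the state file name and actor ID are placeholders:

```python
import json
from datetime import datetime, timezone
from pathlib import Path

from apify_client import ApifyClient

STATE = Path("last_run.json")  # placeholder state file
client = ApifyClient("<YOUR_APIFY_TOKEN>")

# Use the previous run's start time as the `since` cutoff, if we have one.
since = json.loads(STATE.read_text())["since"] if STATE.exists() else None
started = datetime.now(timezone.utc).isoformat()

run_input = {"searchType": "subreddit", "subreddits": ["python"], "sortBy": "new"}
if since:
    run_input["since"] = since

# Placeholder actor ID -- replace with the ID shown on this page.
client.actor("always-prime/reddit-scraper").call(run_input=run_input)

# Persist this run's start time for the next diff.
STATE.write_text(json.dumps({"since": started}))
```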
❓ FAQ
**Does it need a Reddit account?** No.

**What about the new Reddit API limits?** This actor doesn't use Reddit's Data API, so the post-2023 commercial pricing tiers don't apply.

**Can I scrape NSFW subreddits?** Yes. NSFW posts are returned with `over_18: true` so you can filter downstream (see the sketch at the end of this FAQ).

**Will it get all comments on a huge thread?** Up to your `commentDepth` cap (default 200, max 5000), breadth-first across the tree. For Reddit's truly massive megathreads (>10K comments), Reddit itself paginates and not every comment is reachable in one fetch — that's a Reddit limitation, not the scraper's.

**What if a post is deleted while scraping?** Deleted posts come through with `author: "[deleted]"`, `selftext: "[deleted]"`, and `removed_by_category: "deleted"`. They're not skipped — you get the metadata Reddit still surfaces.

**How fresh is the data?** Real-time. Each record carries a `scraped_at` UTC timestamp.
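For the downstream filtering mentioned above, a minimal sketch that drops NSFW and removed/deleted posts from a list of records, using the `over_18` and `removed_by_category` fields from the sample output:

```python
def filter_live_sfw(items):
    """Keep only posts that are neither NSFW nor removed/deleted."""
    return [
        p for p in items
        if not p["over_18"] and p["removed_by_category"] is None
    ]
```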
📅 Changelog
0.1 (initial release)
- Subreddit, search, user, and start-URL modes
- Configurable comment-tree scraping with depth cap
- Incremental `since` filter, `maxItems` cap, dedup, failure budget
- JSON / CSV / Excel exports
⚖️ Legal
This scraper accesses Reddit through public, non-authenticated requests. Reddit's robots.txt disallows automated crawling, and Reddit's User Agreement and Public Content Policy restrict automated/commercial use of Reddit content. By using this scraper you take on responsibility for the legality of your specific use case in your jurisdiction (including GDPR / CCPA where applicable). The scraper does not bypass authentication, paywalls, or technical access controls. Use it for research, journalism, internal analytics, ML/AI training datasets, or other lawful purposes — and confirm that those purposes are compatible with Reddit's policies and any applicable law before running large-scale jobs. Personal data scraped from Reddit (usernames, comment bodies, flair) may constitute PII under GDPR even though usernames are pseudonymous; treat the output dataset accordingly.