Reddit Pulse
Pricing
from $1.50 / 1,000 posts scraped
Developer
Henil Mehta
Maintained by Community
Actor stats: 1 bookmark, 2 total users, 1 monthly active user, last modified 2 days ago
Reddit Scraper — SaaS Idea Finder, Brand Monitor & Subreddit Data Extractor
Extract Reddit posts, comments, users, and subreddit metadata at scale — and turn them into structured signals. Find your next SaaS idea, monitor brand mentions, mine pain points, or feed an AI / RAG pipeline. No Reddit API key, no login, no developer registration.
Built for buyers who want answers, not just rows. Most Reddit scrapers give you raw data. This one ships with two purpose-built modes — SaaS idea-mining and brand mention monitoring — that pre-filter, classify, and score posts so you can act on them in minutes, not hours.
What does this Reddit Scraper do?
This actor crawls Reddit's public JSON endpoints (the same data Reddit's official API exposes, without the OAuth tax) and writes structured rows to an Apify dataset. It scrapes:
- Posts — title, score, upvote ratio, comments count, flair, NSFW flag, thumbnail, full body, permalink
- Comments — full body, score, author, depth, reply count, parent relationship, OP flag
- Users — every post and comment by a username, with metadata
- Subreddit rules — the full ruleset of any community (kind, description, violation reason, priority)
- Subreddit metadata — subscribers, active users, description, banner, icon, creation date, NSFW flag, submission type
- Search results — keyword search across all of Reddit, or restricted to a subreddit list
It runs in five modes, switched by a single input field:
| Mode | What it does | When to use it |
|---|---|---|
| Browse | Pulls posts from a subreddit by sort (Hot / New / Top / Rising) | Generic monitoring, content audits |
| Search | Reddit-wide keyword search with relevance / new / top / comments sort | Finding posts about a topic anywhere |
| SaaS idea-mining | Filters posts to those matching pain phrases ("I wish there was…"), scores demand, classifies intent | Indie hackers, product validation |
| Mention / brand monitor | Tracks multiple keywords side-by-side with sentiment classification | Marketing teams, competitor tracking |
| User profile | Scrapes posts + comments of any username | Influencer audits, lead research |
All modes write to the same dataset, with named views (Overview, Posts, Comments, Ideas, Mentions, Subreddit info, Subreddit rules) so you can switch column presets in the Apify Console without rerunning.
Who is this Reddit Scraper for?
- Indie hackers & founders — find SaaS ideas with real demand signals (intent tier + demand score)
- Marketing & growth teams — monitor brand and competitor mentions across all of Reddit, with sentiment
- AI / ML engineers — assemble RAG and training datasets with structured intent + topic metadata
- Sales teams — find leads asking for tools in your category (buying-intent filter)
- Researchers & journalists — bulk-export discussions, comment trees, and subreddit rules for analysis
- Content creators — surface trending questions and content gaps in your niche
You don't need to write code. You don't need a Reddit developer account. You don't need to manage proxies. Paste an input JSON, click run.
What data can I extract from Reddit?
| Field | Posts | Comments | Users | Communities |
|---|---|---|---|---|
| ID, title, body | ✅ | ✅ (body) | ✅ | ✅ (description) |
| Author, score, upvote ratio | ✅ | ✅ | n/a | n/a |
| Comment count, num replies | ✅ | ✅ | n/a | n/a |
| Creation timestamp | ✅ | ✅ | ✅ | ✅ |
| Permalink, URL | ✅ | ✅ | ✅ | ✅ |
| Flair, NSFW flag, thumbnail | ✅ | n/a | n/a | n/a |
| Reply depth, parent ID | n/a | ✅ | n/a | n/a |
| Removed / deleted detection | ✅ | ✅ | n/a | n/a |
| Subscriber count, active users | n/a | n/a | n/a | ✅ |
| Banner, icon, primary color | n/a | n/a | n/a | ✅ |
| Submission rules (any/link/self) | n/a | n/a | n/a | ✅ |
| Full rules list per community | n/a | n/a | n/a | ✅ |
| Intent tier (idea-mining mode) | ✅ | n/a | n/a | n/a |
| Demand score (idea-mining mode) | ✅ | n/a | n/a | n/a |
| Sentiment (mention monitor) | ✅ | n/a | n/a | n/a |
How much does it cost to scrape Reddit?
Pay only for the results you receive — no subscription, no platform tax.
| Event | Price | Charged when |
|---|---|---|
| Post scraped | $0.0015 | Every post row written to the dataset |
| Comment scraped | $0.0005 | Every comment row (including nested replies) |
| Subreddit ruleset scraped | $0.0003 | Once per subreddit when includeRules is enabled |
| Actor start | $0.00005 | Once per run (first 5 seconds free) |
Example cost calculations
| Run | Cost |
|---|---|
| 1,000 posts, no comments | $1.50 |
| 1,000 posts + 10,000 comments | $6.50 |
| 100 posts + 5,000 comments + 50 subreddit rulesets | $2.67 |
| 50 mention-monitor terms × 25 posts each (1,250 posts) | $1.88 |
| 10 user profiles × 25 posts + 25 comments each (500 rows) | $0.50 |
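The figures in the table above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function name `estimate_cost` is hypothetical, the per-event prices are copied from the pricing table, and the one-time actor start fee ($0.00005) is left out because it does not change the result at cent precision.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-event prices in USD, copied from the pricing table.
PRICES = {
    "post": Decimal("0.0015"),
    "comment": Decimal("0.0005"),
    "ruleset": Decimal("0.0003"),
}


def estimate_cost(posts=0, comments=0, rulesets=0):
    """Return the estimated event charges for one run, rounded to cents."""
    total = (
        posts * PRICES["post"]
        + comments * PRICES["comment"]
        + rulesets * PRICES["ruleset"]
    )
    # Decimal avoids binary floating-point surprises when rounding half cents.
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(estimate_cost(posts=1000, comments=10000))
    print(estimate_cost(posts=100, comments=5000, rulesets=50))
```

Apify platform usage outside these events, if any applies to your plan, is not modeled here.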
Why per-event instead of flat per-row? Comments are much cheaper than posts on our pricing, so comment-heavy workloads (sentiment, lead-gen, training data) cost roughly 6× less here than on scrapers that charge a flat $0.003+ per row.
Apify's Free plan gives you $5/month in platform credits, enough to scrape roughly 3,300 posts (at $0.0015 each) before paying anything.
How to scrape Reddit with this actor
- Sign in to Apify (free, no credit card)
- Open this actor's page and click Try for free or Start
- Fill in the input — at minimum, one of:
  - `subreddits` (a list of subreddit names) for browse mode
  - `searchQuery` for search mode
  - `ideaMiningMode: true` for SaaS idea mining (uses a curated subreddit list automatically)
  - `mentionMonitorMode: true` + `mentionTerms` for brand monitoring
  - `users` (a list of usernames) for user profile mode
- Click Save & Start
- Watch the run log; when complete, open the dataset tab, switch to the view you want, and export JSON / CSV / Excel
The whole flow takes under 60 seconds for a first run.
Input examples
Find SaaS ideas with high demand
{"ideaMiningMode": true,"painPhrasePack": "saas","sort": "new","maxPosts": 100,"excludeRemoved": true}
Uses the curated default subreddit list (SaaS, Entrepreneur, startups, SideProject, indiehackers, SomebodyMakeThis, AppIdeas, microsaas, smallbusiness). Outputs only posts matching pain phrases like "I wish there was…", "looking for a tool…", "would pay for…", with intent tier and demand score.
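If you plan to use `painPhrasePack: "custom"`, it can help to try candidate phrases against a few sample posts locally first. The sketch below is an assumption about how such matching could work (a case-insensitive substring check over title and body); it is not the actor's actual matching logic, and `match_pain_phrase` is a hypothetical helper name.

```python
def match_pain_phrase(post, phrases):
    """Return the first phrase found in the post's title or body, else None."""
    # Assumption: posts carry "title" and "selftext" as in the output examples.
    text = f"{post.get('title', '')} {post.get('selftext', '')}".lower()
    for phrase in phrases:
        if phrase.lower() in text:
            return phrase
    return None


if __name__ == "__main__":
    sample = {"title": "Looking for a tool that helps me track competitor pricing"}
    print(match_pain_phrase(sample, ["I wish there was", "looking for a tool"]))
```

A check like this is only a quick way to see which of your phrases would fire; the intent tier and demand score still come from the actor.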
Monitor brand mentions across Reddit
{"mentionMonitorMode": true,"mentionTerms": ["Notion", "Coda", "Obsidian", "Roam Research"],"sentimentClassification": true,"searchSort": "new","maxPosts": 50}
Tracks four competitors side-by-side. Each row is tagged with matchedTerm and sentiment (positive/negative/neutral/mixed). Schedule daily for ongoing competitor intelligence.
Browse a subreddit with full nested comment trees
{"subreddits": ["MachineLearning", "datascience"],"sort": "top","timeRange": "week","maxPosts": 50,"includeComments": true,"maxCommentsPerPost": 100,"commentDepth": 4,"minCommentScore": 1}
Pulls 50 top posts of the week, walks reply trees up to 4 levels deep, skips downvoted comments.
Audit a Reddit user
{"users": ["spez", "kn0thing"],"userContentType": "both","maxItemsPerUser": 100}
Pulls 100 most recent posts and 100 most recent comments from each user. Use for influencer research, account-takeover monitoring, or sales lead enrichment.
Search a specific niche
{"subreddits": ["SaaS", "Entrepreneur", "SideProject"],"searchQuery": "looking for a tool that","searchRestrictToSubreddits": true,"searchSort": "new","maxPosts": 25}
Search restricted to your subreddit list — perfect for finding buyer-intent posts in a specific community.
Export subreddit rules + metadata
{"subreddits": ["python", "rust", "golang"],"maxPosts": 0,"includeRules": true,"includeSubredditInfo": true}
No posts, just the community ruleset + about-page data. Useful before scheduling automated submissions or compliance audits.
Output examples
Example post (browse / search mode)
{"type": "post","subreddit": "MachineLearning","id": "1abc234","title": "[D] What papers stood out this week?","author": "researcher_42","score": 1247,"upvoteRatio": 0.96,"numComments": 183,"createdAt": "2026-05-12T08:14:22.000Z","url": "https://arxiv.org/abs/2401.12345","permalink": "https://www.reddit.com/r/MachineLearning/comments/1abc234/...","selftext": "I've been reading...","flair": "Discussion","over18": false,"domain": "arxiv.org","isRemoved": false,"isDeleted": false,"removedBy": null}
Example post (SaaS idea-mining mode)
{"type": "post","subreddit": "SaaS","title": "Looking for a tool that helps me track competitor pricing","author": "indiebuilder","score": 47,"numComments": 23,"intentTier": "request","painPhrase": "looking for a tool","demandScore": 64.3,"permalink": "https://www.reddit.com/r/SaaS/comments/...","createdAt": "2026-05-15T11:02:00.000Z"}
Example post (mention monitor mode)
{"type": "post","subreddit": "productivity","title": "Switched from Notion to Obsidian, my honest experience","matchedTerm": "Obsidian","sentiment": "positive","score": 312,"numComments": 89,"permalink": "https://www.reddit.com/r/productivity/comments/..."}
Example comment (nested)
{"type": "comment","subreddit": "MachineLearning","postId": "1abc234","postTitle": "[D] What papers stood out this week?","commentId": "j12abcd","author": "ml_researcher","body": "The Mamba paper completely changed how I think about state-space models...","score": 156,"depth": 1,"replyCount": 3,"isSubmitter": false,"parentId": "t1_j11xyz","createdAt": "2026-05-12T09:45:00.000Z","isRemoved": false}
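Comment rows are flat, so rebuilding the reply tree is a downstream step. The following is a minimal sketch, assuming `parentId` uses Reddit's fullname convention (`t1_<id>` for a parent comment, `t3_<id>` for the post) and that `commentId` holds the bare ID, as in the example row. `build_comment_tree` is a hypothetical helper, not part of the actor.

```python
from collections import defaultdict


def build_comment_tree(rows):
    """Group comment rows under their parent comment.

    Returns (top_level, children): top_level lists rows whose parent comment
    is not among the exported rows; children maps a commentId to its replies.
    """
    comments = [r for r in rows if r.get("type") == "comment"]
    known_ids = {c["commentId"] for c in comments}
    children = defaultdict(list)
    top_level = []
    for c in comments:
        parent = c.get("parentId") or ""
        # Assumption: "t1_" prefixes a comment ID, anything else is the post.
        parent_comment_id = parent[3:] if parent.startswith("t1_") else None
        if parent_comment_id in known_ids:
            children[parent_comment_id].append(c)
        else:
            # Parent is the post itself, or was filtered out of the export.
            top_level.append(c)
    return top_level, dict(children)
```

Replies whose parent was dropped (for example by `minCommentScore` or `maxCommentsPerPost`) are treated as top-level here so that no row is lost.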
Example subreddit rules
{"type": "rules","subreddit": "MachineLearning","fetchedAt": "2026-05-16T13:27:53.651Z","rulesCount": 6,"rules": [{"kind": "link","shortName": "Be respectful","description": "No personal attacks, hate speech, or harassment.","violationReason": "Disrespectful behavior","priority": 0,"createdAt": "2022-05-19T17:26:33.000Z"}],"siteRules": ["Spam", "Personal and confidential information"]}
Example subreddit metadata
{"type": "subreddit","subreddit": "Python","displayName": "Python","subscribers": 1479405,"activeUserCount": 4203,"publicDescription": "News about the Python programming language.","isNsfw": false,"lang": "en","submissionType": "self","allowVideos": true,"allowImages": true,"allowPolls": false,"bannerImg": "https://styles.redditmedia.com/...","iconImg": "https://styles.redditmedia.com/...","primaryColor": "#3776ab","createdAt": "2008-01-25T03:15:11.000Z","url": "https://www.reddit.com/r/Python/"}
Input parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `subreddits` | string[] | [] | Subreddit names without r/ |
| `sort` | enum | hot | hot, new, top, rising |
| `timeRange` | enum | day | For sort=top: hour, day, week, month, year, all |
| `maxPosts` | int | 25 | Per subreddit / per search / per term (1–1000) |
| `includeComments` | bool | false | Fetch comments for each post |
| `maxCommentsPerPost` | int | 50 | Cap on comments per post (across all depths) |
| `commentDepth` | int | 3 | Max reply nesting depth (1=top-level only, up to 10) |
| `minCommentScore` | int | -1000 | Skip comments below this score |
| `includeRules` | bool | false | Export each subreddit's full ruleset |
| `includeSubredditInfo` | bool | false | Export each subreddit's about page |
| `searchQuery` | string | — | Run in search mode |
| `searchSort` | enum | relevance | relevance, new, top, comments, hot |
| `searchRestrictToSubreddits` | bool | false | Search inside subreddits only |
| `excludeRemoved` | bool | false | Skip removed/deleted posts and comments |
| `nsfwFilter` | enum | include | include, exclude, only |
| `ideaMiningMode` | bool | false | Activate SaaS idea-mining mode |
| `painPhrasePack` | enum | saas | saas, leads, feature-request, custom |
| `customPainPhrases` | string[] | — | Used when painPhrasePack=custom |
| `users` | string[] | — | Reddit usernames to scrape |
| `userContentType` | enum | both | posts, comments, both |
| `maxItemsPerUser` | int | 25 | Cap per user (1–1000) |
| `mentionMonitorMode` | bool | false | Activate mention monitor mode |
| `mentionTerms` | string[] | — | Keywords to track |
| `sentimentClassification` | bool | false | Tag posts with positive/negative/neutral/mixed |
| `proxyConfiguration` | object | residential | Apify Proxy config (residential recommended) |
Dataset views in Apify Console
Open the dataset tab in the Apify Console and switch between these column presets without rerunning:
- Overview — type, subreddit, title, author, score, created, permalink (mixed)
- Posts — title, author, score, upvote%, comments, flair, NSFW, created, permalink
- Comments — postTitle, depth, author, body, score, replies, OP flag
- Ideas — title, intent tier, matched phrase, demand score, comments, link
- Mentions — matched term, sentiment, subreddit, title, score, permalink
- Subreddit info — displayName, subscribers, active, NSFW, language, description
- Subreddit rules — subreddit, rulesCount, rules array, site-wide rules
Views filter columns, not rows. Sort by the type column to group post / comment / rules / subreddit rows in any view.
Use cases
Find your next SaaS idea
Use ideaMiningMode with painPhrasePack: "saas" on r/SaaS, r/Entrepreneur, r/SideProject, r/IndieHackers. Sort the dataset by demandScore descending. The top 10 rows are validated pain points with real engagement — each one is a potential product.
Monitor brand and competitor mentions
Schedule a daily run with mentionMonitorMode + your brand + 3 competitor names + sentimentClassification. Pipe results to Slack via Apify integrations. You'll know within 24h when a competitor takes a reputation hit.
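Once a mention-monitor run has finished, a per-term sentiment tally is often the first thing worth looking at. This is a small sketch over exported rows, assuming each row carries `matchedTerm` and `sentiment` as in the mention-monitor output example; `sentiment_by_term` is a hypothetical helper name.

```python
from collections import Counter, defaultdict


def sentiment_by_term(rows):
    """Count sentiment labels per tracked term."""
    summary = defaultdict(Counter)
    for row in rows:
        term = row.get("matchedTerm")
        sentiment = row.get("sentiment")
        # Rows without both fields (e.g. non-mention rows) are skipped.
        if term and sentiment:
            summary[term][sentiment] += 1
    return {term: dict(counts) for term, counts in summary.items()}
```

Comparing these counts between consecutive daily runs is one simple way to spot a shift toward negative mentions.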
Lead generation
Use painPhrasePack: "leads" (or custom phrases like "looking for an agency", "willing to pay") to find buyer-intent posts in your category. Filter to intentTier: "buying" for the warmest leads.
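The filter-and-sort step mentioned in the two use cases above can be done on an exported dataset with a short helper. This is a sketch under the assumption that idea-mining rows carry `intentTier` and `demandScore` as shown in the output example; `top_ideas` is a hypothetical name.

```python
def top_ideas(rows, intent_tier=None, limit=10):
    """Return up to `limit` idea-mining posts, highest demandScore first."""
    posts = [
        r for r in rows
        if r.get("type") == "post" and "demandScore" in r
    ]
    if intent_tier is not None:
        # Optional filter, e.g. intent_tier="buying" for the warmest leads.
        posts = [r for r in posts if r.get("intentTier") == intent_tier]
    return sorted(posts, key=lambda r: r["demandScore"], reverse=True)[:limit]
```

Calling `top_ideas(rows)` gives the ranked shortlist; `top_ideas(rows, intent_tier="buying")` narrows it to buying-intent posts.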
AI / RAG training data
Combine searchQuery with includeComments: true and high commentDepth to assemble topic-specific datasets. Each row already has structured metadata (subreddit, score, depth, parent) ready for vector embeddings.
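Before embedding, rows usually need to be reduced to a text field plus metadata. The sketch below shows one possible shape, assuming the field names from the output examples (`title`, `selftext`, `body`, `isRemoved`, `isDeleted`); the `to_documents` helper and the `{"text", "metadata"}` layout are illustrative choices, not something the actor prescribes.

```python
def to_documents(rows):
    """Convert post and comment rows into {"text", "metadata"} records."""
    docs = []
    for r in rows:
        # Removed or deleted rows carry no usable text.
        if r.get("isRemoved") or r.get("isDeleted"):
            continue
        if r.get("type") == "post":
            text = f"{r.get('title', '')}\n\n{r.get('selftext', '')}".strip()
            meta = {"kind": "post", "id": r.get("id")}
        elif r.get("type") == "comment":
            text = (r.get("body") or "").strip()
            meta = {
                "kind": "comment",
                "id": r.get("commentId"),
                "postId": r.get("postId"),
                "depth": r.get("depth"),
                "parentId": r.get("parentId"),
            }
        else:
            continue  # rules and subreddit rows are not discussion text
        if not text:
            continue
        meta.update({"subreddit": r.get("subreddit"), "score": r.get("score")})
        docs.append({"text": text, "metadata": meta})
    return docs
```

The resulting records can be passed to whichever embedding or chunking step your pipeline uses.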
Compliance & community moderation audits
Use includeRules: true and includeSubredditInfo: true to export the full posting guidelines + submission types for every subreddit in a list. Useful before automated outreach campaigns or content syndication.
Influencer & user research
Use users: ["username1", "username2"] to pull complete histories. Combine with sentiment classification for brand advocacy mapping.
Tips for scaling and Reddit's limits
Reddit's 1,000-item limit
Reddit's public JSON caps any single listing (subreddit posts, search results, user history) at ~1,000 items. To get more:
- Combine sort modes: Hot + New + Top often return different posts
- Use time-range slicing: `sort=top` + `timeRange=hour`, then `day`, then `week` covers different windows
- Search with multiple keywords: each query has its own 1,000 ceiling
- Schedule incremental runs: scrape recent posts daily and deduplicate by post `id` in your downstream pipeline
The 1,000 cap does not apply to comments — you can pull complete comment trees from any single post.
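The time-slicing and deduplication workarounds can be sketched as two small helpers. Both names (`sliced_inputs`, `merge_runs`) are hypothetical; the input fields are the ones documented in the parameter table, and the merge assumes post rows carry `type` and `id` as in the output examples.

```python
def sliced_inputs(subreddits, time_ranges=("hour", "day", "week"), max_posts=1000):
    """Build one actor input per time range for sort=top slicing."""
    return [
        {
            "subreddits": list(subreddits),
            "sort": "top",
            "timeRange": time_range,
            "maxPosts": max_posts,
        }
        for time_range in time_ranges
    ]


def merge_runs(*runs):
    """Merge post rows from several runs, keeping the first row seen per id."""
    seen = {}
    for rows in runs:
        for row in rows:
            if row.get("type") != "post":
                continue
            # setdefault keeps the earliest copy; later duplicates are ignored.
            seen.setdefault(row["id"], row)
    return list(seen.values())
```

Keeping the first copy is an arbitrary choice; if you want the freshest score for each post, overwrite instead of using `setdefault`.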
Cost optimization
- Set `excludeRemoved: true` to skip dead rows (saves per-event charges)
- Set `minCommentScore: 1` to skip downvoted noise
- Lower `commentDepth` to 1–2 if you only care about top-level discussion
- Use `nsfwFilter: "exclude"` for B2B / safe-content use cases
Integration
- API: run from any code, with `runActor` and webhook callbacks
- Schedule: Apify's native scheduler runs this on cron
- Webhooks: push completed runs to your stack instantly
- Integrations: Apify connects to Google Sheets, Airtable, Slack, Discord, Make, Zapier, and more
FAQ
Is scraping Reddit legal?
Scraping public Reddit data is generally permissible — Reddit's content is publicly accessible. This actor only fetches public JSON endpoints that anyone with a browser can view. We do not bypass private subreddits, modmail, or any auth-gated content. You are responsible for complying with Reddit's Terms of Service and your local data-protection laws (e.g., handling personal data under GDPR).
Do I need a Reddit API key or developer account?
No. This actor uses Reddit's public JSON endpoints (the same data the website renders), so no OAuth, no app registration, no rate-limit token. Especially useful after Reddit's 2023 API pricing changes pushed the official API out of reach for most use cases.
How does this compare to other Reddit scrapers on the marketplace?
- Per-content-type pricing — comments cost $0.0005 here vs flat per-row pricing elsewhere; if your use case is comment-heavy, you'll pay roughly 6× less
- Nested comment trees with depth and score filters, not just top-level
- Two purpose-built modes (idea-mining, mention monitor) you won't find on generic scrapers
- Removed-content detection so you can filter or audit moderator removals
- Seven dataset views instead of one flat output table
Can it scrape private or quarantined subreddits?
No. Private subreddits require Reddit login + invite. Quarantined subreddits require a logged-in opt-in click. This actor only accesses publicly visible data.
How fresh is the data?
Real-time at scrape time. The actor hits Reddit's live JSON; there's no cache or staging.
Can I scrape more than 1,000 posts from a subreddit?
Not in a single listing — that's Reddit's platform limit. See the "Tips for scaling" section above for workarounds (sort combinations, time slicing, incremental runs).
What about rate limits?
The actor uses Apify's residential proxy rotation by default and randomizes user agents. With default settings you can scrape thousands of posts per run reliably. If you hit transient errors, lower maxConcurrency in the code or your input.
Can I get notified when a new mention shows up?
Yes — schedule the actor with mentionMonitorMode to run hourly or daily, then add a webhook in the actor's integrations tab. Send the run-finished event to Slack, Zapier, or your own endpoint and react instantly.
Does it work for sentiment analysis at scale?
The built-in sentimentClassification is a fast keyword heuristic suitable for filtering and dashboards. For high-stakes sentiment grading, pipe the raw body / selftext to an LLM downstream; the actor's output already includes structured metadata (subreddit, score, depth, intent tier) that gives the LLM useful context.
How do I export to CSV / Excel?
After a run, open the dataset tab in the Apify Console and click Export. CSV, Excel, JSON, JSONL, XML, RSS, and HTML are all supported natively by Apify.
What output schema should I expect?
Every row has a type discriminator: post, comment, rules, or subreddit. The full field schema is documented in the actor's output schema tab in the Apify Console — or see the JSON examples above.
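Because all row kinds share one dataset, a common first step after export is to split rows on the `type` discriminator. A minimal sketch, with `split_by_type` as a hypothetical helper name:

```python
from collections import defaultdict


def split_by_type(rows):
    """Group dataset rows by their "type" field."""
    groups = defaultdict(list)
    for row in rows:
        # Rows without a type are kept under "unknown" rather than dropped.
        groups[row.get("type", "unknown")].append(row)
    return dict(groups)
```

With a JSON export loaded into `rows`, `split_by_type(rows)["comment"]` would then hold only the comment rows, provided the run produced any.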
Can I run this as an API?
Yes. Apify exposes every actor as an API endpoint — call POST /v2/acts/{actorId}/runs with your input as JSON, then poll or webhook for the dataset URL. No SDK required.
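As a rough illustration of that call using only the Python standard library, the sketch below builds the POST request and, separately, sends it. The actor ID and token are placeholders you must supply; passing the token as a Bearer header and reading `data.defaultDatasetId` from the response are assumptions based on Apify's general API conventions, so check the Apify API reference before relying on them.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com"


def build_run_request(actor_id, token, actor_input):
    """Build (but do not send) the request that starts an actor run."""
    url = f"{API_BASE}/v2/acts/{actor_id}/runs"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: API token sent as a Bearer token.
            "Authorization": f"Bearer {token}",
        },
    )


def start_run(request):
    """Send the request and return the run's default dataset ID (network call)."""
    with urllib.request.urlopen(request) as response:
        run = json.load(response)["data"]
    return run["defaultDatasetId"]
```

A typical use would be `start_run(build_run_request(actor_id, token, {"subreddits": ["python"], "maxPosts": 25}))`, followed by polling the run or waiting for a webhook before reading the dataset.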
Changelog
- v0.10: Added subreddit metadata (`includeSubredditInfo`), mention monitor mode (`mentionMonitorMode`, `mentionTerms`, `sentimentClassification`)
- v0.8: User profile scraping (`users`, `userContentType`, `maxItemsPerUser`)
- v0.7: SaaS idea-mining mode (`ideaMiningMode`, `painPhrasePack`, demand scoring, intent tiers)
- v0.6: NSFW filter (`nsfwFilter`)
- v0.5: Removed / deleted post + comment detection (`isRemoved`, `isDeleted`, `removedBy`, `excludeRemoved`)
- v0.4: Nested comment tree with depth + min-score filters (`commentDepth`, `minCommentScore`)
- v0.3: Reddit-wide search (`searchQuery`, `searchSort`, `searchRestrictToSubreddits`)
- v0.2: Subreddit rules export (`includeRules`)
- v0.1: Initial release: posts, top-level comments, bulk subreddit scraping
Support
- Open an issue on this actor's Issues tab in the Apify Console
- Issue response target: under 24 hours
- For custom modifications or higher-volume contracts, contact via Apify Console messaging