Pricing

from $1.50 / 1,000 posts scraped

Reddit Pulse

Extract Reddit posts, comments, users, and subreddit metadata at scale — and turn them into structured signals. Find your next SaaS idea, monitor brand mentions, mine pain points, or feed an AI / RAG pipeline. No Reddit API key, no login, no developer registration.

Developer

Henil Mehta

Reddit Scraper — SaaS Idea Finder, Brand Monitor & Subreddit Data Extractor

Built for buyers who want answers, not just rows. Most Reddit scrapers give you raw data. This one ships with two purpose-built modes — SaaS idea-mining and brand mention monitoring — that pre-filter, classify, and score posts so you can act on them in minutes, not hours.

What does this Reddit Scraper do?

This actor crawls Reddit's public JSON endpoints (the same data Reddit's official API exposes, without the OAuth tax) and writes structured rows to an Apify dataset. It scrapes:

Posts — title, score, upvote ratio, comments count, flair, NSFW flag, thumbnail, full body, permalink
Comments — full body, score, author, depth, reply count, parent relationship, OP flag
Users — every post and comment by a username, with metadata
Subreddit rules — the full ruleset of any community (kind, description, violation reason, priority)
Subreddit metadata — subscribers, active users, description, banner, icon, creation date, NSFW flag, submission type
Search results — keyword search across all of Reddit, or restricted to a subreddit list

It runs in five modes, switched by a single input field:

Mode	What it does	When to use it
Browse	Pulls posts from a subreddit by sort (Hot / New / Top / Rising)	Generic monitoring, content audits
Search	Reddit-wide keyword search with relevance / hot / new / top / comments sort	Finding posts about a topic anywhere
SaaS idea-mining	Filters posts to those matching pain phrases ("I wish there was…"), scores demand, classifies intent	Indie hackers, product validation
Mention / brand monitor	Tracks multiple keywords side-by-side with sentiment classification	Marketing teams, competitor tracking
User profile	Scrapes posts + comments of any username	Influencer audits, lead research

All modes write to the same dataset, with named views (Overview, Posts, Comments, Ideas, Mentions, Subreddit info, Subreddit rules) so you can switch column presets in the Apify Console without rerunning.

Who is this Reddit Scraper for?

Indie hackers & founders — find SaaS ideas with real demand signals (intent tier + demand score)
Marketing & growth teams — monitor brand and competitor mentions across all of Reddit, with sentiment
AI / ML engineers — assemble RAG and training datasets with structured intent + topic metadata
Sales teams — find leads asking for tools in your category (buying-intent filter)
Researchers & journalists — bulk-export discussions, comment trees, and subreddit rules for analysis
Content creators — surface trending questions and content gaps in your niche

You don't need to write code. You don't need a Reddit developer account. You don't need to manage proxies. Paste an input JSON, click run.

What data can I extract from Reddit?

Field	Posts	Comments	Users	Communities
ID, title, body	✅	✅ (body)	✅	✅ (description)
Author, score, upvote ratio	✅	✅	n/a	n/a
Comment count, num replies	✅	✅	n/a	n/a
Creation timestamp	✅	✅	✅	✅
Permalink, URL	✅	✅	✅	✅
Flair, NSFW flag, thumbnail	✅	n/a	n/a	n/a
Reply depth, parent ID	n/a	✅	n/a	n/a
Removed / deleted detection	✅	✅	n/a	n/a
Subscriber count, active users	n/a	n/a	n/a	✅
Banner, icon, primary color	n/a	n/a	n/a	✅
Submission rules (any/link/self)	n/a	n/a	n/a	✅
Full rules list per community	n/a	n/a	n/a	✅
Intent tier (idea-mining mode)	✅	n/a	n/a	n/a
Demand score (idea-mining mode)	✅	n/a	n/a	n/a
Sentiment (mention monitor)	✅	n/a	n/a	n/a

How much does it cost to scrape Reddit?

Pay only for the results you receive — no subscription, no platform tax.

Event	Price	Charged when
Post scraped	$0.0015	Every post row written to the dataset
Comment scraped	$0.0005	Every comment row (including nested replies)
Subreddit ruleset scraped	$0.0003	Once per subreddit when `includeRules` is enabled
Actor start	$0.00005	Once per run (first 5 seconds free)

Example cost calculations

Run	Cost
1,000 posts, no comments	$1.50
1,000 posts + 10,000 comments	$6.50
100 posts + 5,000 comments + 50 subreddit rulesets	$2.67
50 mention-monitor terms × 25 posts each (1,250 posts)	$1.88
10 user profiles × 25 posts + 25 comments each (500 rows)	$0.50

Why per-event instead of flat per-row? Comments are much cheaper than posts on our pricing, so comment-heavy workloads (sentiment, lead-gen, training data) cost roughly 6× less here than on scrapers that charge a flat $0.003+ per row.

Apify's Free plan gives you $5/month in platform credits, enough for roughly 3,300 posts at $0.0015 each before paying anything.
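The example costs above are plain multiplication over the event prices. A minimal sketch for budgeting a run, assuming the four events in the pricing table are the only charges; `estimate_cost` is a hypothetical helper, not something the actor ships.

```python
from decimal import Decimal, ROUND_HALF_UP

# Event prices copied from the pricing table above (USD per event).
PRICES = {
    "post": Decimal("0.0015"),
    "comment": Decimal("0.0005"),
    "ruleset": Decimal("0.0003"),
    "start": Decimal("0.00005"),
}


def estimate_cost(posts=0, comments=0, rulesets=0, runs=1):
    """Return the estimated charge in USD, rounded to the nearest cent."""
    total = (
        posts * PRICES["post"]
        + comments * PRICES["comment"]
        + rulesets * PRICES["ruleset"]
        + runs * PRICES["start"]
    )
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(estimate_cost(posts=1000, comments=10000))  # 6.50
print(estimate_cost(posts=100, comments=5000, rulesets=50))  # 2.67
```

Decimal arithmetic is used instead of floats so that sub-cent prices add up exactly before rounding.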

How to scrape Reddit with this actor

1. Sign in to Apify (free, no credit card).
2. Open this actor's page and click Try for free or Start.
3. Fill in the input. At minimum, set one of:
   - subreddits (a list of subreddit names) for browse mode
   - searchQuery for search mode
   - ideaMiningMode: true for SaaS idea mining (uses a curated subreddit list automatically)
   - mentionMonitorMode: true + mentionTerms for brand monitoring
   - users (a list of usernames) for user profile mode
4. Click Save & Start.
5. Watch the run log; when the run completes, open the dataset tab, switch to the view you want, and export JSON / CSV / Excel.

The whole flow takes under 60 seconds for a first run.

Input examples

Find SaaS ideas with high demand

{
  "ideaMiningMode": true,
  "painPhrasePack": "saas",
  "sort": "new",
  "maxPosts": 100,
  "excludeRemoved": true
}

Uses the curated default subreddit list (SaaS, Entrepreneur, startups, SideProject, indiehackers, SomebodyMakeThis, AppIdeas, microsaas, smallbusiness). Outputs only posts matching pain phrases like "I wish there was…", "looking for a tool…", "would pay for…", with intent tier and demand score.

Monitor brand mentions across Reddit

{
  "mentionMonitorMode": true,
  "mentionTerms": ["Notion", "Coda", "Obsidian", "Roam Research"],
  "sentimentClassification": true,
  "searchSort": "new",
  "maxPosts": 50
}

Tracks four competitors side-by-side. Each row is tagged with matchedTerm and sentiment (positive/negative/neutral/mixed). Schedule daily for ongoing competitor intelligence.

Browse a subreddit with full nested comment trees

{
  "subreddits": ["MachineLearning", "datascience"],
  "sort": "top",
  "timeRange": "week",
  "maxPosts": 50,
  "includeComments": true,
  "maxCommentsPerPost": 100,
  "commentDepth": 4,
  "minCommentScore": 1
}

Pulls the top 50 posts of the week from each subreddit (maxPosts applies per subreddit, so up to 100 posts here), walks reply trees up to 4 levels deep, and skips comments scoring below 1.

Audit a Reddit user

{
  "users": ["spez", "kn0thing"],
  "userContentType": "both",
  "maxItemsPerUser": 100
}

Pulls the 100 most recent posts and the 100 most recent comments from each user. Use for influencer research, account-takeover monitoring, or sales lead enrichment.

Search a specific niche

{
  "subreddits": ["SaaS", "Entrepreneur", "SideProject"],
  "searchQuery": "looking for a tool that",
  "searchRestrictToSubreddits": true,
  "searchSort": "new",
  "maxPosts": 25
}

Search restricted to your subreddit list — perfect for finding buyer-intent posts in a specific community.

Export subreddit rules + metadata

{
  "subreddits": ["python", "rust", "golang"],
  "maxPosts": 0,
  "includeRules": true,
  "includeSubredditInfo": true
}

No posts, just the community ruleset + about-page data. Useful before scheduling automated submissions or compliance audits.

Output examples

Example post (browse / search mode)

{
  "type": "post",
  "subreddit": "MachineLearning",
  "id": "1abc234",
  "title": "[D] What papers stood out this week?",
  "author": "researcher_42",
  "score": 1247,
  "upvoteRatio": 0.96,
  "numComments": 183,
  "createdAt": "2026-05-12T08:14:22.000Z",
  "url": "https://arxiv.org/abs/2401.12345",
  "permalink": "https://www.reddit.com/r/MachineLearning/comments/1abc234/...",
  "selftext": "I've been reading...",
  "flair": "Discussion",
  "over18": false,
  "domain": "arxiv.org",
  "isRemoved": false,
  "isDeleted": false,
  "removedBy": null
}

Example post (SaaS idea-mining mode)

{
  "type": "post",
  "subreddit": "SaaS",
  "title": "Looking for a tool that helps me track competitor pricing",
  "author": "indiebuilder",
  "score": 47,
  "numComments": 23,
  "intentTier": "request",
  "painPhrase": "looking for a tool",
  "demandScore": 64.3,
  "permalink": "https://www.reddit.com/r/SaaS/comments/...",
  "createdAt": "2026-05-15T11:02:00.000Z"
}

Example post (mention monitor mode)

{
  "type": "post",
  "subreddit": "productivity",
  "title": "Switched from Notion to Obsidian, my honest experience",
  "matchedTerm": "Obsidian",
  "sentiment": "positive",
  "score": 312,
  "numComments": 89,
  "permalink": "https://www.reddit.com/r/productivity/comments/..."
}

Example comment (nested)

{
  "type": "comment",
  "subreddit": "MachineLearning",
  "postId": "1abc234",
  "postTitle": "[D] What papers stood out this week?",
  "commentId": "j12abcd",
  "author": "ml_researcher",
  "body": "The Mamba paper completely changed how I think about state-space models...",
  "score": 156,
  "depth": 1,
  "replyCount": 3,
  "isSubmitter": false,
  "parentId": "t1_j11xyz",
  "createdAt": "2026-05-12T09:45:00.000Z",
  "isRemoved": false
}
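Comment rows arrive flat, so a reply tree has to be rebuilt downstream from commentId and parentId. A minimal sketch, assuming (as in the example above) that parentId carries Reddit's "t1_" prefix when the parent is a comment and some other prefix such as "t3_" when the parent is the post; `build_comment_tree` is a hypothetical helper.

```python
def build_comment_tree(rows):
    """Nest flat comment rows under their parent comments."""
    nodes = {
        row["commentId"]: {**row, "replies": []}
        for row in rows
        if row.get("type") == "comment"
    }
    roots = []
    for node in nodes.values():
        parent_id = node.get("parentId") or ""
        # Assumption: "t1_<id>" points at another comment; anything else is the post.
        parent = nodes.get(parent_id[3:]) if parent_id.startswith("t1_") else None
        if parent is not None:
            parent["replies"].append(node)
        else:
            # Top-level comment, or a reply whose parent was filtered out of the export.
            roots.append(node)
    return roots


rows = [
    {"type": "comment", "commentId": "a1", "parentId": "t3_1abc234", "body": "top-level"},
    {"type": "comment", "commentId": "b2", "parentId": "t1_a1", "body": "reply"},
]
tree = build_comment_tree(rows)
print(tree[0]["replies"][0]["body"])  # reply
```

Replies whose parent is missing (for example, cut by minCommentScore or maxCommentsPerPost) are kept as roots rather than dropped.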

Example subreddit rules

{
  "type": "rules",
  "subreddit": "MachineLearning",
  "fetchedAt": "2026-05-16T13:27:53.651Z",
  "rulesCount": 6,
  "rules": [
    {
      "kind": "link",
      "shortName": "Be respectful",
      "description": "No personal attacks, hate speech, or harassment.",
      "violationReason": "Disrespectful behavior",
      "priority": 0,
      "createdAt": "2022-05-19T17:26:33.000Z"
    }
  ],
  "siteRules": ["Spam", "Personal and confidential information"]
}

Example subreddit metadata

{
  "type": "subreddit",
  "subreddit": "Python",
  "displayName": "Python",
  "subscribers": 1479405,
  "activeUserCount": 4203,
  "publicDescription": "News about the Python programming language.",
  "isNsfw": false,
  "lang": "en",
  "submissionType": "self",
  "allowVideos": true,
  "allowImages": true,
  "allowPolls": false,
  "bannerImg": "https://styles.redditmedia.com/...",
  "iconImg": "https://styles.redditmedia.com/...",
  "primaryColor": "#3776ab",
  "createdAt": "2008-01-25T03:15:11.000Z",
  "url": "https://www.reddit.com/r/Python/"
}

Input parameters

Parameter	Type	Default	Description
`subreddits`	string[]	`[]`	Subreddit names without `r/`
`sort`	enum	`hot`	`hot`, `new`, `top`, `rising`
`timeRange`	enum	`day`	For `sort=top`: `hour`, `day`, `week`, `month`, `year`, `all`
`maxPosts`	int	`25`	Per subreddit / per search / per term (1–1000; the rules-only example above sets `0` to skip posts)
`includeComments`	bool	`false`	Fetch comments for each post
`maxCommentsPerPost`	int	`50`	Cap on comments per post (across all depths)
`commentDepth`	int	`3`	Max reply nesting depth (1=top-level only, up to 10)
`minCommentScore`	int	`-1000`	Skip comments below this score
`includeRules`	bool	`false`	Export each subreddit's full ruleset
`includeSubredditInfo`	bool	`false`	Export each subreddit's about page
`searchQuery`	string	—	Run in search mode
`searchSort`	enum	`relevance`	`relevance`, `new`, `top`, `comments`, `hot`
`searchRestrictToSubreddits`	bool	`false`	Search inside `subreddits` only
`excludeRemoved`	bool	`false`	Skip removed/deleted posts and comments
`nsfwFilter`	enum	`include`	`include`, `exclude`, `only`
`ideaMiningMode`	bool	`false`	Activate SaaS idea-mining mode
`painPhrasePack`	enum	`saas`	`saas`, `leads`, `feature-request`, `custom`
`customPainPhrases`	string[]	—	Used when `painPhrasePack=custom`
`users`	string[]	—	Reddit usernames to scrape
`userContentType`	enum	`both`	`posts`, `comments`, `both`
`maxItemsPerUser`	int	`25`	Cap per user (1–1000)
`mentionMonitorMode`	bool	`false`	Activate mention monitor mode
`mentionTerms`	string[]	—	Keywords to track
`sentimentClassification`	bool	`false`	Tag posts with positive/negative/neutral/mixed
`proxyConfiguration`	object	residential	Apify Proxy config — residential recommended
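Before starting a paid run, it can help to check an input locally against the table above. A rough sketch, assuming the enum values and ranges listed in the table are exhaustive; `check_input` is a hypothetical helper, and the actor's own validation may differ. maxPosts is deliberately left out because one example above uses 0.

```python
# Allowed values and ranges copied from the parameter table above.
ENUMS = {
    "sort": {"hot", "new", "top", "rising"},
    "timeRange": {"hour", "day", "week", "month", "year", "all"},
    "searchSort": {"relevance", "new", "top", "comments", "hot"},
    "nsfwFilter": {"include", "exclude", "only"},
    "painPhrasePack": {"saas", "leads", "feature-request", "custom"},
    "userContentType": {"posts", "comments", "both"},
}
RANGES = {"commentDepth": (1, 10), "maxItemsPerUser": (1, 1000)}


def check_input(actor_input):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key, allowed in ENUMS.items():
        if key in actor_input and actor_input[key] not in allowed:
            problems.append(f"{key}: {actor_input[key]!r} is not one of {sorted(allowed)}")
    for key, (low, high) in RANGES.items():
        if key in actor_input and not low <= actor_input[key] <= high:
            problems.append(f"{key}: {actor_input[key]!r} is outside {low}..{high}")
    # Assumption: these companion fields are needed for the mode to do anything useful.
    if actor_input.get("painPhrasePack") == "custom" and not actor_input.get("customPainPhrases"):
        problems.append("customPainPhrases should be set when painPhrasePack is 'custom'")
    if actor_input.get("mentionMonitorMode") and not actor_input.get("mentionTerms"):
        problems.append("mentionTerms should be set when mentionMonitorMode is true")
    return problems


print(check_input({"sort": "top", "timeRange": "week", "commentDepth": 4}))  # []
```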

Dataset views in Apify Console

Open the dataset tab in the Apify Console and switch between these column presets without rerunning:

Overview — type, subreddit, title, author, score, created, permalink (mixed)
Posts — title, author, score, upvote%, comments, flair, NSFW, created, permalink
Comments — postTitle, depth, author, body, score, replies, OP flag
Ideas — title, intent tier, matched phrase, demand score, comments, link
Mentions — matched term, sentiment, subreddit, title, score, permalink
Subreddit info — displayName, subscribers, active, NSFW, language, description
Subreddit rules — subreddit, rulesCount, rules array, site-wide rules

Views filter columns, not rows. Sort by the type column to group post / comment / rules / subreddit rows in any view.

Use cases

Find your next SaaS idea

Use ideaMiningMode with painPhrasePack: "saas" on r/SaaS, r/Entrepreneur, r/SideProject, r/IndieHackers. Sort the dataset by demandScore descending. The top 10 rows are validated pain points with real engagement — each one is a potential product.
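If you export the dataset as JSON, the same ranking takes a few lines of code. A small sketch, assuming rows shaped like the idea-mining example above; `top_ideas` is a hypothetical helper.

```python
def top_ideas(rows, limit=10):
    """Return the highest-demand idea posts, best first."""
    scored = [
        row for row in rows
        if row.get("type") == "post" and row.get("demandScore") is not None
    ]
    return sorted(scored, key=lambda row: row["demandScore"], reverse=True)[:limit]


rows = [
    {"type": "post", "title": "Looking for a tool that tracks pricing", "demandScore": 64.3},
    {"type": "post", "title": "I wish there was a simpler CRM", "demandScore": 71.0},
    {"type": "post", "title": "Weekly feedback thread"},  # no score, so it is skipped
]
for row in top_ideas(rows, limit=2):
    print(row["demandScore"], row["title"])
```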

Monitor brand and competitor mentions

Schedule a daily run with mentionMonitorMode + your brand + 3 competitor names + sentimentClassification. Pipe results to Slack via Apify integrations. You'll know within 24h when a competitor takes a reputation hit.
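For a daily digest, the rows can be reduced to one sentiment tally per tracked term before they reach Slack. A minimal sketch, assuming rows carry the matchedTerm and sentiment fields shown in the mention monitor example; `sentiment_by_term` is a hypothetical helper.

```python
from collections import Counter, defaultdict


def sentiment_by_term(rows):
    """Count sentiment labels per matched term."""
    tally = defaultdict(Counter)
    for row in rows:
        term, sentiment = row.get("matchedTerm"), row.get("sentiment")
        if term and sentiment:
            tally[term][sentiment] += 1
    return {term: dict(counts) for term, counts in tally.items()}


rows = [
    {"matchedTerm": "Obsidian", "sentiment": "positive"},
    {"matchedTerm": "Obsidian", "sentiment": "negative"},
    {"matchedTerm": "Notion", "sentiment": "positive"},
]
print(sentiment_by_term(rows))
```

Comparing today's tally with yesterday's is one simple way to spot a sudden rise in negative mentions.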

Lead generation

Use painPhrasePack: "leads" (or custom phrases like "looking for an agency", "willing to pay") to find buyer-intent posts in your category. Filter to intentTier: "buying" for the warmest leads.

AI / RAG training data

Combine searchQuery with includeComments: true and high commentDepth to assemble topic-specific datasets. Each row already has structured metadata (subreddit, score, depth, parent) ready for vector embeddings.
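Most vector stores want a text field plus a flat metadata dict per record. A sketch of that reshaping, assuming rows shaped like the post and comment examples above; `to_documents` and the id, text, and metadata key names are illustrative choices, not part of the actor's output.

```python
METADATA_KEYS = ("subreddit", "score", "depth", "parentId", "postId", "permalink", "createdAt")


def to_documents(rows):
    """Turn post and comment rows into {id, text, metadata} records."""
    docs = []
    for row in rows:
        if row.get("type") == "comment":
            text = row.get("body") or ""
            doc_id = f"comment:{row.get('commentId')}"
        elif row.get("type") == "post":
            parts = (row.get("title"), row.get("selftext"))
            text = "\n\n".join(part for part in parts if part)
            doc_id = f"post:{row.get('id') or row.get('permalink')}"
        else:
            continue  # rules and subreddit rows carry no discussion text
        if not text.strip():
            continue
        metadata = {key: row[key] for key in METADATA_KEYS if key in row}
        docs.append({"id": doc_id, "text": text, "metadata": metadata})
    return docs


docs = to_documents([
    {"type": "post", "id": "1abc234", "title": "[D] What papers stood out?",
     "selftext": "I've been reading...", "subreddit": "MachineLearning", "score": 1247},
])
print(docs[0]["id"], docs[0]["metadata"])
```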

Compliance & community moderation audits

Use includeRules: true and includeSubredditInfo: true to export the full posting guidelines + submission types for every subreddit in a list. Useful before automated outreach campaigns or content syndication.

Influencer & user research

Use users: ["username1", "username2"] to pull each user's posts and comments, up to maxItemsPerUser per user and Reddit's ~1,000-item listing cap. Combine with sentiment classification for brand advocacy mapping.

Tips for scaling and Reddit's limits

Reddit's 1,000-item limit

Reddit's public JSON caps any single listing (subreddit posts, search results, user history) at ~1,000 items. To get more:

Combine sort modes — Hot + New + Top often returns different posts
Use time-range slicing — sort=top + timeRange=hour then day then week covers different windows
Search with multiple keywords — each query has its own 1,000 ceiling
Schedule incremental runs — scrape recent posts daily and deduplicate by post id in your downstream pipeline

The 1,000 cap does not apply to comments — you can pull complete comment trees from any single post.
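The workarounds above amount to running several inputs and merging the results by post id. A minimal sketch, assuming each run's rows are already loaded as lists of dicts; `listing_variants` and `merge_posts` are hypothetical helpers.

```python
def listing_variants(subreddits, max_posts=1000):
    """Yield one actor input per sort mode and per top time range."""
    base = {"subreddits": list(subreddits), "maxPosts": max_posts}
    for sort in ("hot", "new", "rising"):
        yield {**base, "sort": sort}
    for time_range in ("hour", "day", "week", "month", "year", "all"):
        yield {**base, "sort": "top", "timeRange": time_range}


def merge_posts(*runs):
    """Deduplicate post rows by id; the latest copy wins so scores stay current."""
    merged = {}
    for rows in runs:
        for row in rows:
            if row.get("type") == "post" and row.get("id"):
                merged[row["id"]] = row
    return list(merged.values())


for actor_input in listing_variants(["python"]):
    print(actor_input)
```

Each variant is billed per post scraped, so overlapping listings cost money even though the duplicates are discarded afterwards.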

Cost optimization

Set excludeRemoved: true to skip dead rows (saves per-event charges)
Set minCommentScore: 1 to skip downvoted noise
Lower commentDepth to 1–2 if you only care about top-level discussion
Use nsfwFilter: "exclude" for B2B / safe-content use cases

Integration

API — run from any code, with runActor and webhook callbacks
Schedule — Apify's native scheduler runs this on cron
Webhooks — push completed runs to your stack instantly
Integrations — Apify connects to Google Sheets, Airtable, Slack, Discord, Make, Zapier, and more

FAQ

Is scraping Reddit legal?

Scraping public Reddit data is generally permissible — Reddit's content is publicly accessible. This actor only fetches public JSON endpoints that anyone with a browser can view. We do not bypass private subreddits, modmail, or any auth-gated content. You are responsible for complying with Reddit's Terms of Service and your local data-protection laws (e.g., handling personal data under GDPR).

Do I need a Reddit API key or developer account?

No. This actor uses Reddit's public JSON endpoints (the same data the website renders), so no OAuth, no app registration, no rate-limit token. Especially useful after Reddit's 2023 API pricing changes pushed the official API out of reach for most use cases.

How does this compare to other Reddit scrapers on the marketplace?

Per-content-type pricing — comments cost $0.0005 here vs flat per-row pricing elsewhere; if your use case is comment-heavy, you'll pay roughly 6× less
Nested comment trees with depth and score filters, not just top-level
Two purpose-built modes (idea-mining, mention monitor) you won't find on generic scrapers
Removed-content detection so you can filter or audit moderator removals
Seven dataset views instead of one flat output table

Can it scrape private or quarantined subreddits?

No. Private subreddits require Reddit login + invite. Quarantined subreddits require a logged-in opt-in click. This actor only accesses publicly visible data.

How fresh is the data?

Real-time at scrape time. The actor hits Reddit's live JSON; there's no cache or staging.

Can I scrape more than 1,000 posts from a subreddit?

Not in a single listing — that's Reddit's platform limit. See the "Tips for scaling" section above for workarounds (sort combinations, time slicing, incremental runs).

What about rate limits?

The actor uses Apify's residential proxy rotation by default and randomizes user agents. With default settings you can scrape thousands of posts per run reliably. If you hit transient errors, lower maxConcurrency in the code or your input.

Can I get notified when a new mention shows up?

Yes — schedule the actor with mentionMonitorMode to run hourly or daily, then add a webhook in the actor's integrations tab. Send the run-finished event to Slack, Zapier, or your own endpoint and react instantly.
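On the receiving end, your endpoint mainly needs the dataset ID of the finished run. A sketch of that step, assuming the webhook uses Apify's default payload template (an eventType field plus the run object under resource); `dataset_url_from_webhook` is a hypothetical helper, and a custom payload template would need different keys.

```python
import json
from urllib.parse import urlencode


def dataset_url_from_webhook(payload_bytes, fmt="json"):
    """Return the dataset items URL for a succeeded run, or None otherwise."""
    payload = json.loads(payload_bytes)
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    dataset_id = (payload.get("resource") or {}).get("defaultDatasetId")
    if not dataset_id:
        return None
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


body = json.dumps({
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"defaultDatasetId": "abc123"},  # made-up ID for illustration
}).encode("utf-8")
print(dataset_url_from_webhook(body))
```

Fetching that URL still requires your Apify API token unless the dataset has been shared publicly.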

Does it work for sentiment analysis at scale?

The built-in sentimentClassification is a fast keyword-heuristic suitable for filtering and dashboards. For high-stakes sentiment grading, pipe the raw body / selftext to an LLM downstream — the actor outputs already include structured metadata (subreddit, score, depth, intent tier) that improves LLM accuracy.
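To make the trade-off concrete, here is a toy keyword heuristic. It is not the actor's classifier: the word lists and the `keyword_sentiment` function are invented for illustration, and only the four output labels come from this page.

```python
import re

# Illustrative word lists only; a real classifier would use far larger ones.
POSITIVE = {"love", "great", "fast", "recommend", "reliable"}
NEGATIVE = {"hate", "slow", "buggy", "broken", "expensive"}


def keyword_sentiment(text):
    """Label text as positive, negative, mixed, or neutral by keyword presence."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    has_positive = bool(words & POSITIVE)
    has_negative = bool(words & NEGATIVE)
    if has_positive and has_negative:
        return "mixed"
    if has_positive:
        return "positive"
    if has_negative:
        return "negative"
    return "neutral"


print(keyword_sentiment("I love it but sync is slow"))  # mixed
print(keyword_sentiment("Not great at all"))  # positive (negation is missed)
```

The second call shows why a heuristic like this suits filtering and dashboards better than high-stakes grading: negation and sarcasm pass straight through.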

How do I export to CSV / Excel?

After a run, open the dataset tab in the Apify Console and click Export. CSV, Excel, JSON, JSONL, XML, RSS, and HTML are all supported natively by Apify.

What output schema should I expect?

Every row has a type discriminator: post, comment, rules, or subreddit. The full field schema is documented in the actor's output schema tab in the Apify Console — or see the JSON examples above.
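Because all row kinds share one dataset, downstream code usually starts by splitting on type. A small sketch, assuming a JSONL export with one row per line; `split_by_type` is a hypothetical helper.

```python
import json

KNOWN_TYPES = ("post", "comment", "rules", "subreddit")


def split_by_type(jsonl_text):
    """Group JSONL rows by their type discriminator."""
    groups = {name: [] for name in KNOWN_TYPES}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the export
        row = json.loads(line)
        groups.setdefault(row.get("type", "unknown"), []).append(row)
    return groups


export = '{"type": "post", "id": "1abc234"}\n{"type": "comment", "commentId": "j12abcd"}\n'
groups = split_by_type(export)
print({name: len(rows) for name, rows in groups.items()})
```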

Can I run this as an API?

Yes. Apify exposes every actor as an API endpoint: send POST https://api.apify.com/v2/acts/{actorId}/runs with your input as the JSON body and your API token, then poll the run or register a webhook to learn when the dataset is ready. No SDK required.
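A sketch of that call using only the Python standard library. The actor ID and token below are placeholders (this page does not state the actor's ID), and `build_run_request` is a hypothetical helper that builds the request without sending it.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, token, actor_input):
    """Build (but do not send) the POST request that starts an actor run."""
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_run_request(
    "username~actor-name",  # placeholder; copy the real ID from the actor's API tab
    "YOUR_APIFY_TOKEN",  # placeholder token
    {"ideaMiningMode": True, "painPhrasePack": "saas", "maxPosts": 100},
)
print(request.get_method(), request.full_url)

# To actually start the run (needs a real token and network access):
# with urllib.request.urlopen(request) as response:
#     run = json.load(response)["data"]
#     print(run["id"], run["defaultDatasetId"])
```

Starting a run returns immediately, so the dataset is only complete once the run's status reaches SUCCEEDED.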

Changelog

v0.10 — Added subreddit metadata (includeSubredditInfo), mention monitor mode (mentionMonitorMode, mentionTerms, sentimentClassification)
v0.8 — User profile scraping (users, userContentType, maxItemsPerUser)
v0.7 — SaaS idea-mining mode (ideaMiningMode, painPhrasePack, demand scoring, intent tiers)
v0.6 — NSFW filter (nsfwFilter)
v0.5 — Removed / deleted post + comment detection (isRemoved, isDeleted, removedBy, excludeRemoved)
v0.4 — Nested comment tree with depth + min-score filters (commentDepth, minCommentScore)
v0.3 — Reddit-wide search (searchQuery, searchSort, searchRestrictToSubreddits)
v0.2 — Subreddit rules export (includeRules)
v0.1 — Initial release: posts, top-level comments, bulk subreddit scraping

Support

Open an issue on this actor's Issues tab in the Apify Console
Issue response target: under 24 hours
For custom modifications or higher-volume contracts, contact via Apify Console messaging
