Reddit Pulse

Pricing: from $1.50 / 1,000 posts scraped

Extract Reddit posts, comments, users, and subreddit metadata at scale — and turn them into structured signals. Find your next SaaS idea, monitor brand mentions, mine pain points, or feed an AI / RAG pipeline. No Reddit API key, no login, no developer registration.

Rating: 0.0 (0)

Developer: Henil Mehta

Maintained by Community

Actor stats

  • Bookmarked: 1
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 2 days ago

Reddit Scraper — SaaS Idea Finder, Brand Monitor & Subreddit Data Extractor

Built for buyers who want answers, not just rows. Most Reddit scrapers give you raw data. This one ships with two purpose-built modes — SaaS idea-mining and brand mention monitoring — that pre-filter, classify, and score posts so you can act on them in minutes, not hours.


What does this Reddit Scraper do?

This actor crawls Reddit's public JSON endpoints (the same data Reddit's official API exposes, without the OAuth tax) and writes structured rows to an Apify dataset. It scrapes:

  • Posts — title, score, upvote ratio, comments count, flair, NSFW flag, thumbnail, full body, permalink
  • Comments — full body, score, author, depth, reply count, parent relationship, OP flag
  • Users — every post and comment by a username, with metadata
  • Subreddit rules — the full ruleset of any community (kind, description, violation reason, priority)
  • Subreddit metadata — subscribers, active users, description, banner, icon, creation date, NSFW flag, submission type
  • Search results — keyword search across all of Reddit, or restricted to a subreddit list

It runs in five modes, selected by which input fields you set:

| Mode | What it does | When to use it |
| --- | --- | --- |
| Browse | Pulls posts from a subreddit by sort (Hot / New / Top / Rising) | Generic monitoring, content audits |
| Search | Reddit-wide keyword search with relevance / new / top / comments sort | Finding posts about a topic anywhere |
| SaaS idea-mining | Filters posts to those matching pain phrases ("I wish there was…"), scores demand, classifies intent | Indie hackers, product validation |
| Mention / brand monitor | Tracks multiple keywords side-by-side with sentiment classification | Marketing teams, competitor tracking |
| User profile | Scrapes posts + comments of any username | Influencer audits, lead research |

All modes write to the same dataset, with named views (Overview, Posts, Comments, Ideas, Mentions, Subreddit info, Subreddit rules) so you can switch column presets in the Apify Console without rerunning.


Who is this Reddit Scraper for?

  • Indie hackers & founders — find SaaS ideas with real demand signals (intent tier + demand score)
  • Marketing & growth teams — monitor brand and competitor mentions across all of Reddit, with sentiment
  • AI / ML engineers — assemble RAG and training datasets with structured intent + topic metadata
  • Sales teams — find leads asking for tools in your category (buying-intent filter)
  • Researchers & journalists — bulk-export discussions, comment trees, and subreddit rules for analysis
  • Content creators — surface trending questions and content gaps in your niche

You don't need to write code. You don't need a Reddit developer account. You don't need to manage proxies. Paste an input JSON, click run.


What data can I extract from Reddit?

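| Field | Posts | Comments | Users | Communities |
| --- | --- | --- | --- | --- |
| ID, title, body | ✅ | ✅ (body) | ✅ | ✅ (description) |
| Author, score, upvote ratio | ✅ | ✅ | n/a | n/a |
| Comment count, num replies | ✅ | ✅ | n/a | n/a |
| Creation timestamp | ✅ | ✅ | ✅ | ✅ |
| Permalink, URL | ✅ | ✅ | ✅ | ✅ |
| Flair, NSFW flag, thumbnail | ✅ | n/a | n/a | n/a |
| Reply depth, parent ID | n/a | ✅ | n/a | n/a |
| Removed / deleted detection | ✅ | ✅ | n/a | n/a |
| Subscriber count, active users | n/a | n/a | n/a | ✅ |
| Banner, icon, primary color | n/a | n/a | n/a | ✅ |
| Submission rules (any/link/self) | n/a | n/a | n/a | ✅ |
| Full rules list per community | n/a | n/a | n/a | ✅ |
| Intent tier (idea-mining mode) | ✅ | n/a | n/a | n/a |
| Demand score (idea-mining mode) | ✅ | n/a | n/a | n/a |
| Sentiment (mention monitor) | ✅ | n/a | n/a | n/a |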

How much does it cost to scrape Reddit?

Pay only for the results you receive — no subscription, no platform tax.

| Event | Price | Charged when |
| --- | --- | --- |
| Post scraped | $0.0015 | Every post row written to the dataset |
| Comment scraped | $0.0005 | Every comment row (including nested replies) |
| Subreddit ruleset scraped | $0.0003 | Once per subreddit when includeRules is enabled |
| Actor start | $0.00005 | Once per run (first 5 seconds free) |

Example cost calculations

| Run | Cost |
| --- | --- |
| 1,000 posts, no comments | $1.50 |
| 1,000 posts + 10,000 comments | $6.50 |
| 100 posts + 5,000 comments + 50 subreddit rulesets | $2.67 |
| 50 mention-monitor terms × 25 posts each (1,250 posts) | $1.88 |
| 10 user profiles × 25 posts + 25 comments each (500 rows) | $0.50 |
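
The example costs follow directly from the per-event prices. A minimal Python sketch of the arithmetic, assuming the prices listed in this section and one Actor start charge per run; `estimate_cost` is an illustrative helper, not something the actor ships:

```python
from decimal import Decimal

# Per-event prices in USD, copied from the pricing table in this section.
PRICE_PER_EVENT = {
    "post": Decimal("0.0015"),
    "comment": Decimal("0.0005"),
    "ruleset": Decimal("0.0003"),
    "actor_start": Decimal("0.00005"),
}


def estimate_cost(posts=0, comments=0, rulesets=0, runs=1):
    """Return the estimated charge in USD, rounded to whole cents."""
    total = (
        posts * PRICE_PER_EVENT["post"]
        + comments * PRICE_PER_EVENT["comment"]
        + rulesets * PRICE_PER_EVENT["ruleset"]
        + runs * PRICE_PER_EVENT["actor_start"]
    )
    return total.quantize(Decimal("0.01"))


print(estimate_cost(posts=1000))                              # 1.50
print(estimate_cost(posts=1000, comments=10000))              # 6.50
print(estimate_cost(posts=100, comments=5000, rulesets=50))   # 2.67
```

Decimal arithmetic is used instead of floats so that fractions of a cent add up exactly before rounding.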

Why per-event instead of flat per-row? Comments are much cheaper than posts on our pricing, so comment-heavy workloads (sentiment, lead-gen, training data) cost roughly 6× less here than on scrapers that charge a flat $0.003+ per row.

Apify's Free plan gives you $5/month in platform credits, enough to scrape ~3,000 posts before paying anything.


How to scrape Reddit with this actor

  1. Sign in to Apify (free, no credit card)
  2. Open this actor's page and click Try for free or Start
  3. Fill in the input — at minimum, one of:
    • subreddits (a list of subreddit names) for browse mode
    • searchQuery for search mode
    • ideaMiningMode: true for SaaS idea mining (uses a curated subreddit list automatically)
    • mentionMonitorMode: true + mentionTerms for brand monitoring
    • users (a list of usernames) for user profile mode
  4. Click Save & Start
  5. Watch the run log; when complete, open the dataset tab, switch to the view you want, and export JSON / CSV / Excel

The whole flow takes under 60 seconds for a first run.


Input examples

Find SaaS ideas with high demand

{
  "ideaMiningMode": true,
  "painPhrasePack": "saas",
  "sort": "new",
  "maxPosts": 100,
  "excludeRemoved": true
}

Uses the curated default subreddit list (SaaS, Entrepreneur, startups, SideProject, indiehackers, SomebodyMakeThis, AppIdeas, microsaas, smallbusiness). Outputs only posts matching pain phrases like "I wish there was…", "looking for a tool…", "would pay for…", with intent tier and demand score.

Monitor brand mentions across Reddit

{
  "mentionMonitorMode": true,
  "mentionTerms": ["Notion", "Coda", "Obsidian", "Roam Research"],
  "sentimentClassification": true,
  "searchSort": "new",
  "maxPosts": 50
}

Tracks four competitors side-by-side. Each row is tagged with matchedTerm and sentiment (positive/negative/neutral/mixed). Schedule daily for ongoing competitor intelligence.
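
Once a mention-monitor run has finished, the `matchedTerm` and `sentiment` fields can be rolled up into a per-competitor summary. The sketch below assumes the rows have already been exported from the dataset as a list of dictionaries; `sentiment_by_term` is a hypothetical helper name:

```python
from collections import Counter, defaultdict


def sentiment_by_term(rows):
    """Count sentiment labels per matched term, skipping rows that lack either field."""
    counts = defaultdict(Counter)
    for row in rows:
        term = row.get("matchedTerm")
        sentiment = row.get("sentiment")
        if term and sentiment:
            counts[term][sentiment] += 1
    return {term: dict(counter) for term, counter in counts.items()}


# Sample rows shaped like the mention-monitor output (values are made up).
rows = [
    {"type": "post", "matchedTerm": "Obsidian", "sentiment": "positive"},
    {"type": "post", "matchedTerm": "Obsidian", "sentiment": "negative"},
    {"type": "post", "matchedTerm": "Notion", "sentiment": "mixed"},
    {"type": "post", "matchedTerm": "Obsidian", "sentiment": "positive"},
]
print(sentiment_by_term(rows))
# {'Obsidian': {'positive': 2, 'negative': 1}, 'Notion': {'mixed': 1}}
```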

Browse a subreddit with full nested comment trees

{
  "subreddits": ["MachineLearning", "datascience"],
  "sort": "top",
  "timeRange": "week",
  "maxPosts": 50,
  "includeComments": true,
  "maxCommentsPerPost": 100,
  "commentDepth": 4,
  "minCommentScore": 1
}

Pulls 50 top posts of the week, walks reply trees up to 4 levels deep, skips downvoted comments.

Audit a Reddit user

{
  "users": ["spez", "kn0thing"],
  "userContentType": "both",
  "maxItemsPerUser": 100
}

Pulls 100 most recent posts and 100 most recent comments from each user. Use for influencer research, account-takeover monitoring, or sales lead enrichment.

Search a specific niche

{
  "subreddits": ["SaaS", "Entrepreneur", "SideProject"],
  "searchQuery": "looking for a tool that",
  "searchRestrictToSubreddits": true,
  "searchSort": "new",
  "maxPosts": 25
}

Search restricted to your subreddit list — perfect for finding buyer-intent posts in a specific community.

Export subreddit rules + metadata

{
  "subreddits": ["python", "rust", "golang"],
  "maxPosts": 0,
  "includeRules": true,
  "includeSubredditInfo": true
}

No posts, just the community ruleset + about-page data. Useful before scheduling automated submissions or compliance audits.


Output examples

Example post (browse / search mode)

{
  "type": "post",
  "subreddit": "MachineLearning",
  "id": "1abc234",
  "title": "[D] What papers stood out this week?",
  "author": "researcher_42",
  "score": 1247,
  "upvoteRatio": 0.96,
  "numComments": 183,
  "createdAt": "2026-05-12T08:14:22.000Z",
  "url": "https://arxiv.org/abs/2401.12345",
  "permalink": "https://www.reddit.com/r/MachineLearning/comments/1abc234/...",
  "selftext": "I've been reading...",
  "flair": "Discussion",
  "over18": false,
  "domain": "arxiv.org",
  "isRemoved": false,
  "isDeleted": false,
  "removedBy": null
}

Example post (SaaS idea-mining mode)

{
  "type": "post",
  "subreddit": "SaaS",
  "title": "Looking for a tool that helps me track competitor pricing",
  "author": "indiebuilder",
  "score": 47,
  "numComments": 23,
  "intentTier": "request",
  "painPhrase": "looking for a tool",
  "demandScore": 64.3,
  "permalink": "https://www.reddit.com/r/SaaS/comments/...",
  "createdAt": "2026-05-15T11:02:00.000Z"
}

Example post (mention monitor mode)

{
  "type": "post",
  "subreddit": "productivity",
  "title": "Switched from Notion to Obsidian, my honest experience",
  "matchedTerm": "Obsidian",
  "sentiment": "positive",
  "score": 312,
  "numComments": 89,
  "permalink": "https://www.reddit.com/r/productivity/comments/..."
}

Example comment (nested)

{
  "type": "comment",
  "subreddit": "MachineLearning",
  "postId": "1abc234",
  "postTitle": "[D] What papers stood out this week?",
  "commentId": "j12abcd",
  "author": "ml_researcher",
  "body": "The Mamba paper completely changed how I think about state-space models...",
  "score": 156,
  "depth": 1,
  "replyCount": 3,
  "isSubmitter": false,
  "parentId": "t1_j11xyz",
  "createdAt": "2026-05-12T09:45:00.000Z",
  "isRemoved": false
}
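
Comment rows arrive flat, one per comment, so rebuilding the reply tree is left to the consumer. A minimal sketch of one way to do it, assuming (as in the example above) that `commentId` is a bare ID while `parentId` is a Reddit fullname prefixed with `t1_` for a parent comment or `t3_` for the post itself; `build_comment_tree` is a hypothetical helper:

```python
def build_comment_tree(comment_rows):
    """Nest flat comment rows under their parents using parentId."""
    # Copy each row and give it an empty list to collect its replies.
    nodes = {row["commentId"]: {**row, "replies": []} for row in comment_rows}
    roots = []
    for node in nodes.values():
        parent_id = node.get("parentId") or ""
        # Only "t1_" parents are comments; "t3_" parents are the post itself.
        parent_key = parent_id[3:] if parent_id.startswith("t1_") else None
        parent = nodes.get(parent_key)
        if parent is not None:
            parent["replies"].append(node)
        else:
            # Top-level comment, or a reply whose parent was not scraped
            # (for example, filtered out by depth or score limits).
            roots.append(node)
    return roots
```

Replies whose parent is missing from the dataset are kept as roots rather than dropped, so no scraped comment is lost.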

Example subreddit rules

{
  "type": "rules",
  "subreddit": "MachineLearning",
  "fetchedAt": "2026-05-16T13:27:53.651Z",
  "rulesCount": 6,
  "rules": [
    {
      "kind": "link",
      "shortName": "Be respectful",
      "description": "No personal attacks, hate speech, or harassment.",
      "violationReason": "Disrespectful behavior",
      "priority": 0,
      "createdAt": "2022-05-19T17:26:33.000Z"
    }
  ],
  "siteRules": ["Spam", "Personal and confidential information"]
}

Example subreddit metadata

{
  "type": "subreddit",
  "subreddit": "Python",
  "displayName": "Python",
  "subscribers": 1479405,
  "activeUserCount": 4203,
  "publicDescription": "News about the Python programming language.",
  "isNsfw": false,
  "lang": "en",
  "submissionType": "self",
  "allowVideos": true,
  "allowImages": true,
  "allowPolls": false,
  "bannerImg": "https://styles.redditmedia.com/...",
  "iconImg": "https://styles.redditmedia.com/...",
  "primaryColor": "#3776ab",
  "createdAt": "2008-01-25T03:15:11.000Z",
  "url": "https://www.reddit.com/r/Python/"
}

Input parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| subreddits | string[] | [] | Subreddit names without r/ |
| sort | enum | hot | hot, new, top, rising |
| timeRange | enum | day | For sort=top: hour, day, week, month, year, all |
| maxPosts | int | 25 | Per subreddit / per search / per term (1–1000) |
| includeComments | bool | false | Fetch comments for each post |
| maxCommentsPerPost | int | 50 | Cap on comments per post (across all depths) |
| commentDepth | int | 3 | Max reply nesting depth (1=top-level only, up to 10) |
| minCommentScore | int | -1000 | Skip comments below this score |
| includeRules | bool | false | Export each subreddit's full ruleset |
| includeSubredditInfo | bool | false | Export each subreddit's about page |
| searchQuery | string | | Run in search mode |
| searchSort | enum | relevance | relevance, new, top, comments, hot |
| searchRestrictToSubreddits | bool | false | Search inside subreddits only |
| excludeRemoved | bool | false | Skip removed/deleted posts and comments |
| nsfwFilter | enum | include | include, exclude, only |
| ideaMiningMode | bool | false | Activate SaaS idea-mining mode |
| painPhrasePack | enum | saas | saas, leads, feature-request, custom |
| customPainPhrases | string[] | | Used when painPhrasePack=custom |
| users | string[] | | Reddit usernames to scrape |
| userContentType | enum | both | posts, comments, both |
| maxItemsPerUser | int | 25 | Cap per user (1–1000) |
| mentionMonitorMode | bool | false | Activate mention monitor mode |
| mentionTerms | string[] | | Keywords to track |
| sentimentClassification | bool | false | Tag posts with positive/negative/neutral/mixed |
| proxyConfiguration | object | residential | Apify Proxy config; residential recommended |

Dataset views in Apify Console

Open the dataset tab in the Apify Console and switch between these column presets without rerunning:

  • Overview — type, subreddit, title, author, score, created, permalink (mixed)
  • Posts — title, author, score, upvote%, comments, flair, NSFW, created, permalink
  • Comments — postTitle, depth, author, body, score, replies, OP flag
  • Ideas — title, intent tier, matched phrase, demand score, comments, link
  • Mentions — matched term, sentiment, subreddit, title, score, permalink
  • Subreddit info — displayName, subscribers, active, NSFW, language, description
  • Subreddit rules — subreddit, rulesCount, rules array, site-wide rules

Views filter columns, not rows. Sort by the type column to group post / comment / rules / subreddit rows in any view.


Use cases

Find your next SaaS idea

Use ideaMiningMode with painPhrasePack: "saas" on r/SaaS, r/Entrepreneur, r/SideProject, r/IndieHackers. Sort the dataset by demandScore descending. The top 10 rows are validated pain points with real engagement — each one is a potential product.
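
If you export the dataset and work on it in code rather than in the Console, the same ranking can be sketched as follows. This assumes rows shaped like the idea-mining example earlier on this page; `top_ideas` and its parameters are illustrative names only:

```python
def top_ideas(rows, limit=10, intent_tier=None):
    """Return the highest-scoring idea rows, optionally restricted to one intent tier."""
    ideas = [
        row
        for row in rows
        if row.get("type") == "post"
        and row.get("demandScore") is not None
        and (intent_tier is None or row.get("intentTier") == intent_tier)
    ]
    # Highest demand score first; ties keep their original dataset order.
    return sorted(ideas, key=lambda row: row["demandScore"], reverse=True)[:limit]
```

Passing `intent_tier="buying"` narrows the same ranking to buyer-intent rows, which is the filter the lead-generation use case relies on.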

Monitor brand and competitor mentions

Schedule a daily run with mentionMonitorMode + your brand + 3 competitor names + sentimentClassification. Pipe results to Slack via Apify integrations. You'll know within 24h when a competitor takes a reputation hit.

Lead generation

Use painPhrasePack: "leads" (or custom phrases like "looking for an agency", "willing to pay") to find buyer-intent posts in your category. Filter to intentTier: "buying" for the warmest leads.

AI / RAG training data

Combine searchQuery with includeComments: true and high commentDepth to assemble topic-specific datasets. Each row already has structured metadata (subreddit, score, depth, parent) ready for vector embeddings.
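
As a rough illustration of the hand-off to an embedding pipeline, the sketch below turns post and comment rows into text-plus-metadata records. The field names come from the output examples on this page; the record shape (`id`, `text`, `metadata`) and the helper name `to_rag_documents` are assumptions, so adapt them to whatever your vector store expects:

```python
# Row fields carried over as metadata when present (names from the output examples).
METADATA_KEYS = ("type", "subreddit", "score", "depth", "parentId", "permalink", "createdAt")


def to_rag_documents(rows):
    """Convert post and comment rows into {"id", "text", "metadata"} records."""
    docs = []
    for row in rows:
        kind = row.get("type")
        if kind == "post":
            parts = [row.get("title") or "", row.get("selftext") or ""]
            text = "\n\n".join(part.strip() for part in parts if part.strip())
            doc_id = row.get("id")
        elif kind == "comment":
            text = (row.get("body") or "").strip()
            doc_id = row.get("commentId")
        else:
            continue  # rules and subreddit rows carry no discussion text
        if not text:
            continue  # nothing to embed
        metadata = {key: row[key] for key in METADATA_KEYS if row.get(key) is not None}
        docs.append({"id": doc_id, "text": text, "metadata": metadata})
    return docs
```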

Compliance & community moderation audits

Use includeRules: true and includeSubredditInfo: true to export the full posting guidelines + submission types for every subreddit in a list. Useful before automated outreach campaigns or content syndication.

Influencer & user research

Use users: ["username1", "username2"] to pull complete histories. Combine with sentiment classification for brand advocacy mapping.


Tips for scaling and Reddit's limits

Reddit's 1,000-item limit

Reddit's public JSON caps any single listing (subreddit posts, search results, user history) at ~1,000 items. To get more:

  • Combine sort modes — Hot + New + Top often returns different posts
  • Use time-range slicing: sort=top with timeRange=hour, then day, then week, covers different windows
  • Search with multiple keywords — each query has its own 1,000 ceiling
  • Schedule incremental runs — scrape recent posts daily and deduplicate by post id in your downstream pipeline

The 1,000 cap does not apply to comments — you can pull complete comment trees from any single post.
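
The sort and time-range workarounds can be combined mechanically: generate one input per sort / time-range pair, run each, then merge the results and drop repeats. A minimal sketch, using the input field names from this page; `listing_variants` and `dedupe_rows` are hypothetical helpers, and how you launch the runs is up to you:

```python
def listing_variants(subreddits, max_posts=1000):
    """Build one actor input per sort / time-range combination."""
    variants = [{"sort": sort} for sort in ("hot", "new", "rising")]
    variants += [
        {"sort": "top", "timeRange": time_range}
        for time_range in ("hour", "day", "week", "month", "year", "all")
    ]
    return [
        {"subreddits": list(subreddits), "maxPosts": max_posts, **variant}
        for variant in variants
    ]


def dedupe_rows(rows):
    """Keep the first row seen for each (type, id) pair across merged runs."""
    seen = set()
    unique = []
    for row in rows:
        row_id = row.get("id") or row.get("commentId")
        key = (row.get("type"), row_id)
        if row_id is not None and key in seen:
            continue
        seen.add(key)
        unique.append(row)
    return unique
```

Every variant is billed per row written, so the overlap between listings is paid for before it is deduplicated; narrow the variant list if cost matters more than coverage.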

Cost optimization

  • Set excludeRemoved: true to skip dead rows (saves per-event charges)
  • Set minCommentScore: 1 to skip downvoted noise
  • Lower commentDepth to 1–2 if you only care about top-level discussion
  • Use nsfwFilter: "exclude" for B2B / safe-content use cases

Integration

  • API — run from any code, with runActor and webhook callbacks
  • Schedule — Apify's native scheduler runs this on cron
  • Webhooks — push completed runs to your stack instantly
  • Integrations — Apify connects to Google Sheets, Airtable, Slack, Discord, Make, Zapier, and more

FAQ

Is it legal to scrape Reddit?

Scraping public Reddit data is generally permissible — Reddit's content is publicly accessible. This actor only fetches public JSON endpoints that anyone with a browser can view. We do not bypass private subreddits, modmail, or any auth-gated content. You are responsible for complying with Reddit's Terms of Service and your local data-protection laws (e.g., handling personal data under GDPR).

Do I need a Reddit API key or developer account?

No. This actor uses Reddit's public JSON endpoints (the same data the website renders), so no OAuth, no app registration, no rate-limit token. Especially useful after Reddit's 2023 API pricing changes pushed the official API out of reach for most use cases.

How does this compare to other Reddit scrapers on the marketplace?

  • Per-content-type pricing — comments cost $0.0005 here vs flat per-row pricing elsewhere; if your use case is comment-heavy, you'll pay roughly 6× less
  • Nested comment trees with depth and score filters, not just top-level
  • Two purpose-built modes (idea-mining, mention monitor) you won't find on generic scrapers
  • Removed-content detection so you can filter or audit moderator removals
  • Seven dataset views instead of one flat output table

Can it scrape private or quarantined subreddits?

No. Private subreddits require Reddit login + invite. Quarantined subreddits require a logged-in opt-in click. This actor only accesses publicly visible data.

How fresh is the data?

Real-time at scrape time. The actor hits Reddit's live JSON; there's no cache or staging.

Can I scrape more than 1,000 posts from a subreddit?

Not in a single listing — that's Reddit's platform limit. See the "Tips for scaling" section above for workarounds (sort combinations, time slicing, incremental runs).

What about rate limits?

The actor uses Apify's residential proxy rotation by default and randomizes user agents. With default settings you can scrape thousands of posts per run reliably. If you hit transient errors, lower maxConcurrency in the code or your input.

Can I get notified when a new mention shows up?

Yes — schedule the actor with mentionMonitorMode to run hourly or daily, then add a webhook in the actor's integrations tab. Send the run-finished event to Slack, Zapier, or your own endpoint and react instantly.

Does it work for sentiment analysis at scale?

The built-in sentimentClassification is a fast keyword-heuristic suitable for filtering and dashboards. For high-stakes sentiment grading, pipe the raw body / selftext to an LLM downstream — the actor outputs already include structured metadata (subreddit, score, depth, intent tier) that improves LLM accuracy.

How do I export to CSV / Excel?

After a run, open the dataset tab in the Apify Console and click Export. CSV, Excel, JSON, JSONL, XML, RSS, and HTML are all supported natively by Apify.

What output schema should I expect?

Every row has a type discriminator: post, comment, rules, or subreddit. The full field schema is documented in the actor's output schema tab in the Apify Console — or see the JSON examples above.
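
Because all row kinds share one dataset, downstream code usually starts by splitting on `type`. A small sketch, assuming a JSONL export (one JSON object per line); `load_jsonl` and `split_by_type` are illustrative helper names:

```python
import json
from collections import defaultdict


def load_jsonl(text):
    """Parse a JSONL export into a list of row dictionaries, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def split_by_type(rows):
    """Group rows by their type discriminator (post, comment, rules, subreddit)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row.get("type", "unknown")].append(row)
    return dict(groups)


export = """
{"type": "post", "id": "1abc234", "title": "[D] What papers stood out this week?"}
{"type": "comment", "commentId": "j12abcd", "postId": "1abc234"}
{"type": "rules", "subreddit": "MachineLearning", "rulesCount": 6}
"""
groups = split_by_type(load_jsonl(export))
print(sorted(groups))  # ['comment', 'post', 'rules']
```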

Can I run this as an API?

Yes. Apify exposes every actor as an API endpoint — call POST /v2/acts/{actorId}/runs with your input as JSON, then poll or webhook for the dataset URL. No SDK required.
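
A minimal standard-library sketch of that call is shown below. The endpoint path is the one quoted above; the `https://api.apify.com` base URL and the Bearer-token header are standard Apify API usage, while the actor ID and token are placeholders you must supply yourself. `build_run_request` and `start_run` are hypothetical helper names:

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, token, actor_input):
    """Build (but do not send) the POST request that starts an actor run."""
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def start_run(actor_id, token, actor_input):
    """Send the request and return the run object from the response's "data" field."""
    request = build_run_request(actor_id, token, actor_input)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["data"]


# Example (placeholders, not real values):
# run = start_run("<username>~<actor-name>", "<APIFY_TOKEN>",
#                 {"subreddits": ["python"], "maxPosts": 25})
# print(run["defaultDatasetId"])
```

The returned run object carries a `defaultDatasetId`; once the run has finished, the rows can be fetched from `GET /v2/datasets/{datasetId}/items`.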


Changelog

  • v0.10 — Added subreddit metadata (includeSubredditInfo), mention monitor mode (mentionMonitorMode, mentionTerms, sentimentClassification)
  • v0.8 — User profile scraping (users, userContentType, maxItemsPerUser)
  • v0.7 — SaaS idea-mining mode (ideaMiningMode, painPhrasePack, demand scoring, intent tiers)
  • v0.6 — NSFW filter (nsfwFilter)
  • v0.5 — Removed / deleted post + comment detection (isRemoved, isDeleted, removedBy, excludeRemoved)
  • v0.4 — Nested comment tree with depth + min-score filters (commentDepth, minCommentScore)
  • v0.3 — Reddit-wide search (searchQuery, searchSort, searchRestrictToSubreddits)
  • v0.2 — Subreddit rules export (includeRules)
  • v0.1 — Initial release: posts, top-level comments, bulk subreddit scraping

Support

  • Open an issue on this actor's Issues tab in the Apify Console
  • Issue response target: under 24 hours
  • For custom modifications or higher-volume contracts, contact via Apify Console messaging