# Reddit Pulse (`dizilus/reddit-pulse`) Actor

Extract Reddit posts, comments, users, and subreddit metadata at scale — and turn them into structured signals. Find your next SaaS idea, monitor brand mentions, mine pain points, or feed an AI / RAG pipeline. No Reddit API key, no login, no developer registration.

- **URL**: https://apify.com/dizilus/reddit-pulse.md
- **Developed by:** [Henil Mehta](https://apify.com/dizilus) (community)
- **Categories:** Social media, Lead generation, Automation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded, 1 bookmark
- **User rating**: No ratings yet

## Pricing

From $1.50 / 1,000 posts scraped

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# MacOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Reddit Scraper — SaaS Idea Finder, Brand Monitor & Subreddit Data Extractor

Extract Reddit posts, comments, users, and subreddit metadata at scale — and turn them into structured signals. Find your next SaaS idea, monitor brand mentions, mine pain points, or feed an AI / RAG pipeline. No Reddit API key, no login, no developer registration.

> **Built for buyers who want answers, not just rows.** Most Reddit scrapers give you raw data. This one ships with two purpose-built modes — **SaaS idea-mining** and **brand mention monitoring** — that pre-filter, classify, and score posts so you can act on them in minutes, not hours.

---

### What does this Reddit Scraper do?

This actor crawls Reddit's public JSON endpoints (the same data Reddit's official API exposes, without the OAuth tax) and writes structured rows to an Apify dataset. It scrapes:

- **Posts** — title, score, upvote ratio, comments count, flair, NSFW flag, thumbnail, full body, permalink
- **Comments** — full body, score, author, depth, reply count, parent relationship, OP flag
- **Users** — every post and comment by a username, with metadata
- **Subreddit rules** — the full ruleset of any community (kind, description, violation reason, priority)
- **Subreddit metadata** — subscribers, active users, description, banner, icon, creation date, NSFW flag, submission type
- **Search results** — keyword search across all of Reddit, or restricted to a subreddit list

It runs in five **modes**, selected by which input fields you set:

| Mode | What it does | When to use it |
|---|---|---|
| **Browse** | Pulls posts from a subreddit by sort (Hot / New / Top / Rising) | Generic monitoring, content audits |
| **Search** | Reddit-wide keyword search with relevance / new / top / comments sort | Finding posts about a topic anywhere |
| **SaaS idea-mining** | Filters posts to those matching pain phrases ("I wish there was…"), scores demand, classifies intent | Indie hackers, product validation |
| **Mention / brand monitor** | Tracks multiple keywords side-by-side with sentiment classification | Marketing teams, competitor tracking |
| **User profile** | Scrapes posts + comments of any username | Influencer audits, lead research |

All modes write to the **same dataset**, with named **views** (`Overview`, `Posts`, `Comments`, `Ideas`, `Mentions`, `Subreddit info`, `Subreddit rules`) so you can switch column presets in the Apify Console without rerunning.

---

### Who is this Reddit Scraper for?

- **Indie hackers & founders** — find SaaS ideas with real demand signals (intent tier + demand score)
- **Marketing & growth teams** — monitor brand and competitor mentions across all of Reddit, with sentiment
- **AI / ML engineers** — assemble RAG and training datasets with structured intent + topic metadata
- **Sales teams** — find leads asking for tools in your category (buying-intent filter)
- **Researchers & journalists** — bulk-export discussions, comment trees, and subreddit rules for analysis
- **Content creators** — surface trending questions and content gaps in your niche

You don't need to write code. You don't need a Reddit developer account. You don't need to manage proxies. Paste an input JSON, click run.

---

### What data can I extract from Reddit?

| Field | Posts | Comments | Users | Communities |
|---|---|---|---|---|
| ID, title, body | ✅ | ✅ (body) | ✅ | ✅ (description) |
| Author, score, upvote ratio | ✅ | ✅ | n/a | n/a |
| Comment count, num replies | ✅ | ✅ | n/a | n/a |
| Creation timestamp | ✅ | ✅ | ✅ | ✅ |
| Permalink, URL | ✅ | ✅ | ✅ | ✅ |
| Flair, NSFW flag, thumbnail | ✅ | n/a | n/a | n/a |
| Reply depth, parent ID | n/a | ✅ | n/a | n/a |
| Removed / deleted detection | ✅ | ✅ | n/a | n/a |
| Subscriber count, active users | n/a | n/a | n/a | ✅ |
| Banner, icon, primary color | n/a | n/a | n/a | ✅ |
| Submission rules (any/link/self) | n/a | n/a | n/a | ✅ |
| Full rules list per community | n/a | n/a | n/a | ✅ |
| Intent tier (idea-mining mode) | ✅ | n/a | n/a | n/a |
| Demand score (idea-mining mode) | ✅ | n/a | n/a | n/a |
| Sentiment (mention monitor) | ✅ | n/a | n/a | n/a |

---

### How much does it cost to scrape Reddit?

**Pay only for the results you receive — no subscription, no platform tax.**

| Event | Price | Charged when |
|---|---|---|
| Post scraped | **$0.0015** | Every post row written to the dataset |
| Comment scraped | **$0.0005** | Every comment row (including nested replies) |
| Subreddit ruleset scraped | **$0.0003** | Once per subreddit when `includeRules` is enabled |
| Actor start | $0.00005 | Once per run (first 5 seconds free) |

#### Example cost calculations

| Run | Cost |
|---|---|
| 1,000 posts, no comments | **$1.50** |
| 1,000 posts + 10,000 comments | **$6.50** |
| 100 posts + 5,000 comments + 50 subreddit rulesets | **$2.67** |
| 50 mention-monitor terms × 25 posts each (1,250 posts) | **$1.88** |
| 10 user profiles × 25 posts + 25 comments each (500 rows) | **$0.50** |

**Why per-event instead of flat per-row?** Comments are much cheaper than posts on our pricing, so comment-heavy workloads (sentiment, lead-gen, training data) cost roughly **6× less here** than on scrapers that charge a flat $0.003+ per row.

Apify's Free plan gives you **$5/month in platform credits**, enough to scrape ~3,000 posts before paying anything.
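
The example costs above can be reproduced with simple arithmetic. This is a minimal sketch, assuming the per-event prices in the pricing table are current; `estimate_cost` is a hypothetical helper (not part of the actor), and Apify platform usage is not included:

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-event prices in USD, copied from the pricing table above.
PRICES = {
    "post": Decimal("0.0015"),
    "comment": Decimal("0.0005"),
    "ruleset": Decimal("0.0003"),
    "start": Decimal("0.00005"),
}


def estimate_cost(posts=0, comments=0, rulesets=0, runs=1):
    """Estimate per-event charges in USD, rounded to the nearest cent."""
    total = (
        posts * PRICES["post"]
        + comments * PRICES["comment"]
        + rulesets * PRICES["ruleset"]
        + runs * PRICES["start"]
    )
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(estimate_cost(posts=1000))                             # 1.50
print(estimate_cost(posts=1000, comments=10000))             # 6.50
print(estimate_cost(posts=100, comments=5000, rulesets=50))  # 2.67
```

`Decimal` is used instead of floats so that sub-cent prices add up exactly before rounding.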

---

### How to scrape Reddit with this actor

1. **Sign in to Apify** (free, no credit card)
2. Open this actor's page and click **Try for free** or **Start**
3. Fill in the input — at minimum, one of:
   - `subreddits` (a list of subreddit names) for **browse mode**
   - `searchQuery` for **search mode**
   - `ideaMiningMode: true` for **SaaS idea mining** (uses a curated subreddit list automatically)
   - `mentionMonitorMode: true` + `mentionTerms` for **brand monitoring**
   - `users` (a list of usernames) for **user profile mode**
4. Click **Save & Start**
5. Watch the run log; when complete, open the dataset tab, switch to the view you want, and export JSON / CSV / Excel

The whole flow takes under 60 seconds for a first run.

---

### Input examples

#### Find SaaS ideas with high demand

```json
{
  "ideaMiningMode": true,
  "painPhrasePack": "saas",
  "sort": "new",
  "maxPosts": 100,
  "excludeRemoved": true
}
````

Uses the curated default subreddit list (`SaaS`, `Entrepreneur`, `startups`, `SideProject`, `indiehackers`, `SomebodyMakeThis`, `AppIdeas`, `microsaas`, `smallbusiness`). Outputs only posts matching pain phrases like *"I wish there was…"*, *"looking for a tool…"*, *"would pay for…"*, with intent tier and demand score.

#### Monitor brand mentions across Reddit

```json
{
  "mentionMonitorMode": true,
  "mentionTerms": ["Notion", "Coda", "Obsidian", "Roam Research"],
  "sentimentClassification": true,
  "searchSort": "new",
  "maxPosts": 50
}
```

Tracks four competitors side-by-side. Each row is tagged with `matchedTerm` and `sentiment` (positive/negative/neutral/mixed). Schedule daily for ongoing competitor intelligence.
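
Once a run finishes, the tagged rows can be rolled up per competitor. This is a minimal sketch, assuming rows carry the `matchedTerm` and `sentiment` fields shown in the output examples; `summarize_mentions` is a hypothetical downstream helper, not something the actor provides:

```python
from collections import Counter, defaultdict


def summarize_mentions(rows):
    """Count sentiment labels per matched term across post rows."""
    summary = defaultdict(Counter)
    for row in rows:
        term = row.get("matchedTerm")
        if row.get("type") != "post" or term is None:
            continue
        # Rows without a sentiment label (classification disabled) count as "unknown".
        summary[term][row.get("sentiment", "unknown")] += 1
    return {term: dict(counts) for term, counts in summary.items()}


sample_rows = [
    {"type": "post", "matchedTerm": "Obsidian", "sentiment": "positive"},
    {"type": "post", "matchedTerm": "Obsidian", "sentiment": "negative"},
    {"type": "post", "matchedTerm": "Notion", "sentiment": "neutral"},
]
print(summarize_mentions(sample_rows))
```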

#### Browse a subreddit with full nested comment trees

```json
{
  "subreddits": ["MachineLearning", "datascience"],
  "sort": "top",
  "timeRange": "week",
  "maxPosts": 50,
  "includeComments": true,
  "maxCommentsPerPost": 100,
  "commentDepth": 4,
  "minCommentScore": 1
}
```

Pulls up to 50 top posts of the week from each subreddit, walks reply trees up to 4 levels deep, and skips comments with a score below 1.

#### Audit a Reddit user

```json
{
  "users": ["spez", "kn0thing"],
  "userContentType": "both",
  "maxItemsPerUser": 100
}
```

Pulls the 100 most recent posts and the 100 most recent comments from each user. Use for influencer research, account-takeover monitoring, or sales lead enrichment.

#### Search a specific niche

```json
{
  "subreddits": ["SaaS", "Entrepreneur", "SideProject"],
  "searchQuery": "looking for a tool that",
  "searchRestrictToSubreddits": true,
  "searchSort": "new",
  "maxPosts": 25
}
```

Search restricted to your subreddit list — perfect for finding buyer-intent posts in a specific community.

#### Export subreddit rules + metadata

```json
{
  "subreddits": ["python", "rust", "golang"],
  "maxPosts": 0,
  "includeRules": true,
  "includeSubredditInfo": true
}
```

No posts, just the community ruleset + about-page data. Useful before scheduling automated submissions or compliance audits.

***

### Output examples

#### Example post (browse / search mode)

```json
{
  "type": "post",
  "subreddit": "MachineLearning",
  "id": "1abc234",
  "title": "[D] What papers stood out this week?",
  "author": "researcher_42",
  "score": 1247,
  "upvoteRatio": 0.96,
  "numComments": 183,
  "createdAt": "2026-05-12T08:14:22.000Z",
  "url": "https://arxiv.org/abs/2401.12345",
  "permalink": "https://www.reddit.com/r/MachineLearning/comments/1abc234/...",
  "selftext": "I've been reading...",
  "flair": "Discussion",
  "over18": false,
  "domain": "arxiv.org",
  "isRemoved": false,
  "isDeleted": false,
  "removedBy": null
}
```

#### Example post (SaaS idea-mining mode)

```json
{
  "type": "post",
  "subreddit": "SaaS",
  "title": "Looking for a tool that helps me track competitor pricing",
  "author": "indiebuilder",
  "score": 47,
  "numComments": 23,
  "intentTier": "request",
  "painPhrase": "looking for a tool",
  "demandScore": 64.3,
  "permalink": "https://www.reddit.com/r/SaaS/comments/...",
  "createdAt": "2026-05-15T11:02:00.000Z"
}
```

#### Example post (mention monitor mode)

```json
{
  "type": "post",
  "subreddit": "productivity",
  "title": "Switched from Notion to Obsidian, my honest experience",
  "matchedTerm": "Obsidian",
  "sentiment": "positive",
  "score": 312,
  "numComments": 89,
  "permalink": "https://www.reddit.com/r/productivity/comments/..."
}
```

#### Example comment (nested)

```json
{
  "type": "comment",
  "subreddit": "MachineLearning",
  "postId": "1abc234",
  "postTitle": "[D] What papers stood out this week?",
  "commentId": "j12abcd",
  "author": "ml_researcher",
  "body": "The Mamba paper completely changed how I think about state-space models...",
  "score": 156,
  "depth": 1,
  "replyCount": 3,
  "isSubmitter": false,
  "parentId": "t1_j11xyz",
  "createdAt": "2026-05-12T09:45:00.000Z",
  "isRemoved": false
}
```
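
Because comment rows are flat, the reply tree has to be rebuilt downstream from `commentId` and `parentId`. This is a minimal sketch, assuming `parentId` follows Reddit's "fullname" convention seen in the example (`t1_` prefix for a parent comment, `t3_` for the post itself); `build_comment_tree` is a hypothetical helper:

```python
def build_comment_tree(comments):
    """Nest flat comment rows under their parents using commentId / parentId."""
    nodes = {c["commentId"]: {**c, "replies": []} for c in comments}
    roots = []
    for node in nodes.values():
        parent_id = node.get("parentId") or ""
        # Assumption: "t1_<id>" points at a comment, anything else at the post.
        kind, _, bare_id = parent_id.partition("_")
        parent = nodes.get(bare_id) if kind == "t1" else None
        if parent is not None:
            parent["replies"].append(node)
        else:
            # Top-level comments, plus replies whose parent was not scraped.
            roots.append(node)
    return roots


flat = [
    {"commentId": "c1", "parentId": "t3_1abc234", "body": "Top-level"},
    {"commentId": "c2", "parentId": "t1_c1", "body": "Reply"},
    {"commentId": "c3", "parentId": "t1_c2", "body": "Nested reply"},
]
tree = build_comment_tree(flat)
print(tree[0]["replies"][0]["replies"][0]["body"])  # Nested reply
```

Replies whose parent is missing (for example, cut off by `maxCommentsPerPost` or `minCommentScore`) are kept as roots rather than dropped.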

#### Example subreddit rules

```json
{
  "type": "rules",
  "subreddit": "MachineLearning",
  "fetchedAt": "2026-05-16T13:27:53.651Z",
  "rulesCount": 6,
  "rules": [
    {
      "kind": "link",
      "shortName": "Be respectful",
      "description": "No personal attacks, hate speech, or harassment.",
      "violationReason": "Disrespectful behavior",
      "priority": 0,
      "createdAt": "2022-05-19T17:26:33.000Z"
    }
  ],
  "siteRules": ["Spam", "Personal and confidential information"]
}
```

#### Example subreddit metadata

```json
{
  "type": "subreddit",
  "subreddit": "Python",
  "displayName": "Python",
  "subscribers": 1479405,
  "activeUserCount": 4203,
  "publicDescription": "News about the Python programming language.",
  "isNsfw": false,
  "lang": "en",
  "submissionType": "self",
  "allowVideos": true,
  "allowImages": true,
  "allowPolls": false,
  "bannerImg": "https://styles.redditmedia.com/...",
  "iconImg": "https://styles.redditmedia.com/...",
  "primaryColor": "#3776ab",
  "createdAt": "2008-01-25T03:15:11.000Z",
  "url": "https://www.reddit.com/r/Python/"
}
```

***

### Input parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `subreddits` | string\[] | `[]` | Subreddit names without `r/` |
| `sort` | enum | `hot` | `hot`, `new`, `top`, `rising` |
| `timeRange` | enum | `day` | For `sort=top`: `hour`, `day`, `week`, `month`, `year`, `all` |
| `maxPosts` | int | `25` | Per subreddit / per search / per term (1–1000) |
| `includeComments` | bool | `false` | Fetch comments for each post |
| `maxCommentsPerPost` | int | `50` | Cap on comments per post (across all depths) |
| `commentDepth` | int | `3` | Max reply nesting depth (1=top-level only, up to 10) |
| `minCommentScore` | int | `-1000` | Skip comments below this score |
| `includeRules` | bool | `false` | Export each subreddit's full ruleset |
| `includeSubredditInfo` | bool | `false` | Export each subreddit's about page |
| `searchQuery` | string | — | Run in search mode |
| `searchSort` | enum | `relevance` | `relevance`, `new`, `top`, `comments`, `hot` |
| `searchRestrictToSubreddits` | bool | `false` | Search inside `subreddits` only |
| `excludeRemoved` | bool | `false` | Skip removed/deleted posts and comments |
| `nsfwFilter` | enum | `include` | `include`, `exclude`, `only` |
| `ideaMiningMode` | bool | `false` | Activate SaaS idea-mining mode |
| `painPhrasePack` | enum | `saas` | `saas`, `leads`, `feature-request`, `custom` |
| `customPainPhrases` | string\[] | — | Used when `painPhrasePack=custom` |
| `users` | string\[] | — | Reddit usernames to scrape |
| `userContentType` | enum | `both` | `posts`, `comments`, `both` |
| `maxItemsPerUser` | int | `25` | Cap per user (1–1000) |
| `mentionMonitorMode` | bool | `false` | Activate mention monitor mode |
| `mentionTerms` | string\[] | — | Keywords to track |
| `sentimentClassification` | bool | `false` | Tag posts with positive/negative/neutral/mixed |
| `proxyConfiguration` | object | residential | Apify Proxy config — residential recommended |

***

### Dataset views in Apify Console

Open the dataset tab in the Apify Console and switch between these column presets without rerunning:

- **Overview** — type, subreddit, title, author, score, created, permalink (mixed)
- **Posts** — title, author, score, upvote%, comments, flair, NSFW, created, permalink
- **Comments** — postTitle, depth, author, body, score, replies, OP flag
- **Ideas** — title, intent tier, matched phrase, demand score, comments, link
- **Mentions** — matched term, sentiment, subreddit, title, score, permalink
- **Subreddit info** — displayName, subscribers, active, NSFW, language, description
- **Subreddit rules** — subreddit, rulesCount, rules array, site-wide rules

Views filter **columns**, not rows. Sort by the `type` column to group post / comment / rules / subreddit rows in any view.
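
The same grouping can be done in code after exporting the dataset. This is a minimal sketch, assuming every row carries the `type` field shown in the output examples; `split_by_type` is a hypothetical helper:

```python
from collections import defaultdict


def split_by_type(rows):
    """Group mixed dataset rows by their `type` discriminator."""
    groups = defaultdict(list)
    for row in rows:
        groups[row.get("type", "unknown")].append(row)
    return dict(groups)


mixed_rows = [
    {"type": "post", "id": "1abc234"},
    {"type": "comment", "commentId": "j12abcd"},
    {"type": "rules", "subreddit": "MachineLearning"},
    {"type": "post", "id": "1abc235"},
]
groups = split_by_type(mixed_rows)
print({kind: len(items) for kind, items in groups.items()})
```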

***

### Use cases

#### Find your next SaaS idea

Use `ideaMiningMode` with `painPhrasePack: "saas"` on r/SaaS, r/Entrepreneur, r/SideProject, r/IndieHackers. Sort the dataset by `demandScore` descending. The top 10 rows are validated pain points with real engagement — each one is a potential product.

#### Monitor brand and competitor mentions

Schedule a daily run with `mentionMonitorMode` + your brand + 3 competitor names + `sentimentClassification`. Pipe results to Slack via Apify integrations. You'll know within 24h when a competitor takes a reputation hit.

#### Lead generation

Use `painPhrasePack: "leads"` (or custom phrases like "looking for an agency", "willing to pay") to find buyer-intent posts in your category. Filter to `intentTier: "buying"` for the warmest leads.
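
Both the idea-mining and lead-generation workflows above come down to a filter and a sort over the exported rows. This is a minimal sketch, assuming the `demandScore` and `intentTier` fields shown in the output examples; `top_ideas` is a hypothetical helper:

```python
def top_ideas(rows, limit=10, intent_tier=None):
    """Return the highest-demand post rows, optionally restricted to one intent tier."""
    ideas = [
        row for row in rows
        if row.get("type") == "post"
        and isinstance(row.get("demandScore"), (int, float))
        and (intent_tier is None or row.get("intentTier") == intent_tier)
    ]
    return sorted(ideas, key=lambda row: row["demandScore"], reverse=True)[:limit]


idea_rows = [
    {"type": "post", "title": "A", "demandScore": 64.3, "intentTier": "request"},
    {"type": "post", "title": "B", "demandScore": 80.1, "intentTier": "buying"},
    {"type": "post", "title": "C", "demandScore": 12, "intentTier": "request"},
    {"type": "comment", "body": "not a post"},
]
print([row["title"] for row in top_ideas(idea_rows, limit=2)])
print([row["title"] for row in top_ideas(idea_rows, intent_tier="buying")])
```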

#### AI / RAG training data

Combine `searchQuery` with `includeComments: true` and high `commentDepth` to assemble topic-specific datasets. Each row already has structured metadata (subreddit, score, depth, parent) ready for vector embeddings.
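
Before embedding, post and comment rows usually need to be flattened into text plus metadata. This is a minimal sketch, assuming the field names from the output examples; the document shape (`id`, `text`, `metadata`) is an arbitrary choice for illustration, not a format the actor emits:

```python
def to_documents(rows):
    """Convert post and comment rows into text + metadata records for embedding."""
    docs = []
    for row in rows:
        if row.get("type") == "post":
            parts = (row.get("title"), row.get("selftext"))
            text = "\n\n".join(part for part in parts if part)
            doc_id = f"post:{row.get('id')}"
        elif row.get("type") == "comment":
            text = row.get("body") or ""
            doc_id = f"comment:{row.get('commentId')}"
        else:
            continue  # rules and subreddit rows carry no discussion text
        if not text.strip():
            continue
        docs.append({
            "id": doc_id,
            "text": text,
            "metadata": {
                "type": row["type"],
                "subreddit": row.get("subreddit"),
                "score": row.get("score"),
                "depth": row.get("depth"),
                "parentId": row.get("parentId"),
            },
        })
    return docs


rag_rows = [
    {"type": "post", "id": "1abc234", "subreddit": "MachineLearning",
     "title": "Title", "selftext": "Body", "score": 10},
    {"type": "comment", "commentId": "j12abcd", "subreddit": "MachineLearning",
     "body": "A reply", "score": 3, "depth": 1, "parentId": "t3_1abc234"},
    {"type": "rules", "subreddit": "MachineLearning"},
]
print([doc["id"] for doc in to_documents(rag_rows)])
```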

#### Compliance & community moderation audits

Use `includeRules: true` and `includeSubredditInfo: true` to export the full posting guidelines + submission types for every subreddit in a list. Useful before automated outreach campaigns or content syndication.

#### Influencer & user research

Use `users: ["username1", "username2"]` to pull complete histories. Combine with sentiment classification for brand advocacy mapping.

***

### Tips for scaling and Reddit's limits

#### Reddit's 1,000-item limit

Reddit's public JSON caps **any single listing** (subreddit posts, search results, user history) at ~1,000 items. To get more:

- **Combine sort modes** — Hot + New + Top often returns different posts
- **Use time-range slicing** — `sort=top` + `timeRange=hour` then `day` then `week` covers different windows
- **Search with multiple keywords** — each query has its own 1,000 ceiling
- **Schedule incremental runs** — scrape recent posts daily and deduplicate by post `id` in your downstream pipeline

The 1,000 cap **does not** apply to comments — you can pull complete comment trees from any single post.
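
The slicing and deduplication workarounds above can be sketched as follows. The input field names come from the parameter table; the helper names `sliced_inputs` and `dedupe_posts`, and the particular sort / time-range combinations, are assumptions for illustration:

```python
def sliced_inputs(subreddits, max_posts=1000):
    """Build one actor input per sort / time-range combination."""
    combos = [("hot", None), ("new", None), ("top", "day"), ("top", "week"), ("top", "all")]
    inputs = []
    for sort, time_range in combos:
        run_input = {"subreddits": list(subreddits), "sort": sort, "maxPosts": max_posts}
        if time_range:
            run_input["timeRange"] = time_range  # only meaningful for sort=top
        inputs.append(run_input)
    return inputs


def dedupe_posts(*batches):
    """Merge post rows from several runs, keeping the latest copy of each post id."""
    seen = {}
    for batch in batches:
        for row in batch:
            if row.get("type") == "post" and row.get("id") is not None:
                seen[row["id"]] = row  # later batches overwrite earlier ones
    return list(seen.values())


batch_one = [{"type": "post", "id": "a", "score": 1}, {"type": "post", "id": "b", "score": 5}]
batch_two = [{"type": "post", "id": "a", "score": 9}, {"type": "post", "id": "c", "score": 2}]
print(len(sliced_inputs(["python"])), [row["id"] for row in dedupe_posts(batch_one, batch_two)])
```

Keeping the later copy means scores and comment counts reflect the most recent scrape.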

#### Cost optimization

- Set `excludeRemoved: true` to skip dead rows (saves per-event charges)
- Set `minCommentScore: 1` to skip downvoted noise
- Lower `commentDepth` to 1–2 if you only care about top-level discussion
- Use `nsfwFilter: "exclude"` for B2B / safe-content use cases

#### Integration

- **API** — run from any code, with `runActor` and webhook callbacks
- **Schedule** — Apify's native scheduler runs this on cron
- **Webhooks** — push completed runs to your stack instantly
- **Integrations** — Apify connects to Google Sheets, Airtable, Slack, Discord, Make, Zapier, and more

***

### FAQ

#### Is scraping Reddit legal?

Scraping public Reddit data is generally permissible — Reddit's content is publicly accessible. This actor only fetches public JSON endpoints that anyone with a browser can view. We do not bypass private subreddits, modmail, or any auth-gated content. You are responsible for complying with Reddit's Terms of Service and your local data-protection laws (e.g., handling personal data under GDPR).

#### Do I need a Reddit API key or developer account?

No. This actor uses Reddit's public JSON endpoints (the same data the website renders), so no OAuth, no app registration, no rate-limit token. Especially useful after Reddit's 2023 API pricing changes pushed the official API out of reach for most use cases.

#### How does this compare to other Reddit scrapers on the marketplace?

- **Per-content-type pricing** — comments cost $0.0005 here vs flat per-row pricing elsewhere; if your use case is comment-heavy, you'll pay roughly 6× less
- **Nested comment trees** with depth and score filters, not just top-level
- **Two purpose-built modes** (idea-mining, mention monitor) you won't find on generic scrapers
- **Removed-content detection** so you can filter or audit moderator removals
- **Seven dataset views** instead of one flat output table

#### Can it scrape private or quarantined subreddits?

No. Private subreddits require Reddit login + invite. Quarantined subreddits require a logged-in opt-in click. This actor only accesses publicly visible data.

#### How fresh is the data?

Real-time at scrape time. The actor hits Reddit's live JSON; there's no cache or staging.

#### Can I scrape more than 1,000 posts from a subreddit?

Not in a single listing — that's Reddit's platform limit. See the "Tips for scaling" section above for workarounds (sort combinations, time slicing, incremental runs).

#### What about rate limits?

The actor uses Apify's residential proxy rotation by default and randomizes user agents. With default settings you can scrape thousands of posts per run reliably. If you hit transient errors, lower `maxConcurrency` in the code or your input.

#### Can I get notified when a new mention shows up?

Yes — schedule the actor with `mentionMonitorMode` to run hourly or daily, then add a **webhook** in the actor's integrations tab. Send the run-finished event to Slack, Zapier, or your own endpoint and react instantly.

#### Does it work for sentiment analysis at scale?

The built-in `sentimentClassification` is a fast keyword-heuristic suitable for filtering and dashboards. For high-stakes sentiment grading, pipe the raw `body` / `selftext` to an LLM downstream — the actor outputs already include structured metadata (subreddit, score, depth, intent tier) that improves LLM accuracy.
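
To make "keyword heuristic" concrete, here is an illustrative sketch of that general technique. It is **not** the actor's implementation: the word lists and the `classify_sentiment` function are assumptions invented for this example.

```python
import re

# Hypothetical keyword lists; the actor's real lists are not documented here.
POSITIVE = {"love", "great", "excellent", "recommend", "happy"}
NEGATIVE = {"hate", "terrible", "broken", "awful", "disappointed"}


def classify_sentiment(text):
    """Label text as positive, negative, mixed, or neutral by keyword presence."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    has_positive = bool(words & POSITIVE)
    has_negative = bool(words & NEGATIVE)
    if has_positive and has_negative:
        return "mixed"
    if has_positive:
        return "positive"
    if has_negative:
        return "negative"
    return "neutral"


print(classify_sentiment("Love the editor but sync is broken"))  # mixed
```

A heuristic like this misses negation and sarcasm, which is why an LLM pass is the better fit for high-stakes grading.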

#### How do I export to CSV / Excel?

After a run, open the dataset tab in the Apify Console and click **Export**. CSV, Excel, JSON, JSONL, XML, RSS, and HTML are all supported natively by Apify.

#### What output schema should I expect?

Every row has a `type` discriminator: `post`, `comment`, `rules`, or `subreddit`. The full field schema is documented in the actor's **output schema** tab in the Apify Console — or see the JSON examples above.

#### Can I run this as an API?

Yes. Apify exposes every actor as an API endpoint: call `POST /v2/acts/{actorId}/runs` with your input as JSON, then poll the run or register a webhook to find out when its dataset is ready. No SDK required.
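
A minimal sketch of that call using only the Python standard library. The `dizilus~reddit-pulse` path form matches the OpenAPI specification below; `build_run_request` is a hypothetical helper, and the request is only constructed here, not sent:

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, run_input, token):
    """Build (but do not send) the POST request that starts an actor run."""
    # In API paths the "username/actor-name" form is written with a tilde.
    url = f"{API_BASE}/acts/{actor_id.replace('/', '~')}/runs"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_run_request(
    "dizilus/reddit-pulse",
    {"subreddits": ["python"], "maxPosts": 25},
    "<YOUR_API_TOKEN>",
)
print(request.get_method(), request.full_url)
# To actually start the run: urllib.request.urlopen(request)
```

The token can alternatively be passed as the `token` query parameter, as the OpenAPI specification below does.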

***

### Changelog

- **v0.10** — Added subreddit metadata (`includeSubredditInfo`), mention monitor mode (`mentionMonitorMode`, `mentionTerms`, `sentimentClassification`)
- **v0.8** — User profile scraping (`users`, `userContentType`, `maxItemsPerUser`)
- **v0.7** — SaaS idea-mining mode (`ideaMiningMode`, `painPhrasePack`, demand scoring, intent tiers)
- **v0.6** — NSFW filter (`nsfwFilter`)
- **v0.5** — Removed / deleted post + comment detection (`isRemoved`, `isDeleted`, `removedBy`, `excludeRemoved`)
- **v0.4** — Nested comment tree with depth + min-score filters (`commentDepth`, `minCommentScore`)
- **v0.3** — Reddit-wide search (`searchQuery`, `searchSort`, `searchRestrictToSubreddits`)
- **v0.2** — Subreddit rules export (`includeRules`)
- **v0.1** — Initial release: posts, top-level comments, bulk subreddit scraping

***

### Support

- Open an issue on this actor's **Issues** tab in the Apify Console
- Issue response target: under 24 hours
- For custom modifications or higher-volume contracts, contact via Apify Console messaging

# Actor input Schema

## `subreddits` (type: `array`):

List of subreddit names (without r/). Example: \['python', 'webscraping']. Optional if you provide 'searchQuery'.

## `searchQuery` (type: `string`):

If set, the actor searches Reddit for this query instead of browsing subreddit sort modes. Combine with 'searchRestrictToSubreddits' to limit the search to your subreddits list.

## `searchSort` (type: `string`):

How to sort search results (only used if 'searchQuery' is set).

## `searchRestrictToSubreddits` (type: `boolean`):

If true and 'subreddits' is provided, search runs only inside those subreddits instead of all of Reddit.

## `sort` (type: `string`):

How to sort posts

## `timeRange` (type: `string`):

Time window for the 'top' sort mode. Ignored for other sort modes.

## `maxPosts` (type: `integer`):

Maximum number of posts to scrape from each subreddit

## `includeComments` (type: `boolean`):

Scrape comments for each post. Nesting depth is controlled by 'commentDepth'.

## `includeRules` (type: `boolean`):

Also scrape the rules of each subreddit (one row per subreddit, with the full rules list).

## `includeSubredditInfo` (type: `boolean`):

Also scrape each subreddit's about page: subscriber count, active users, description, icon, creation date, NSFW flag, submission rules, etc. One row per subreddit.

## `maxCommentsPerPost` (type: `integer`):

Maximum number of comments (across all depth levels) per post. Only used if 'Include comments' is true.

## `commentDepth` (type: `integer`):

How deep to traverse reply threads. 1 = top-level only (legacy behaviour), 2 = top-level + their direct replies, etc. Default 3 covers the bulk of meaningful discussion.

## `minCommentScore` (type: `integer`):

Skip comments with score below this threshold. Use a small positive number (e.g. 1) to filter out downvoted and removed comments.

## `excludeRemoved` (type: `boolean`):

If true, skip posts and comments that were removed by moderators or deleted by their author. Saves money on per-event charges.

## `nsfwFilter` (type: `string`):

How to handle NSFW (over-18) posts. 'include' = scrape everything (default), 'exclude' = skip NSFW posts and their comments, 'only' = scrape NSFW posts only.

## `ideaMiningMode` (type: `boolean`):

Turn the actor into a structured pain-point finder. Filters posts to those matching pain phrases (e.g. 'I wish there was', 'looking for a tool'), computes a demand score, and tags intent tier. Uses a curated subreddit list if you don't supply one.

## `painPhrasePack` (type: `string`):

Pre-built keyword pack for idea-mining mode. 'saas' = founder/builder pain phrases, 'leads' = buying-intent phrases for lead-gen, 'feature-request' = product feedback phrases, 'custom' = use your own list from 'customPainPhrases'.

## `customPainPhrases` (type: `array`):

Used when painPhrasePack='custom'. Case-insensitive substring match against post title + selftext.

## `users` (type: `array`):

List of Reddit usernames (without u/) whose posts and/or comments should be scraped. Independent of the 'subreddits' field.

## `userContentType` (type: `string`):

Which type of content to fetch for each user. Used only when 'users' is set.

## `maxItemsPerUser` (type: `integer`):

Maximum posts or comments to scrape per user (each type is capped separately if userContentType='both').

## `mentionMonitorMode` (type: `boolean`):

Track one or more keywords across all of Reddit. Useful for brand/competitor mention monitoring. Requires 'mentionTerms' to be set.

## `mentionTerms` (type: `array`):

Keywords or brand names to track. The actor runs one search per term and tags each matching post with the term that produced it.

## `sentimentClassification` (type: `boolean`):

Tag each post with a basic sentiment label (positive / negative / neutral / mixed) based on keyword heuristics. Lightweight, no LLM cost.

## `proxyConfiguration` (type: `object`):

Recommended: use Apify Proxy with residential groups

## Actor input object example

```json
{
  "subreddits": [
    "python",
    "webscraping"
  ],
  "searchSort": "relevance",
  "searchRestrictToSubreddits": false,
  "sort": "hot",
  "timeRange": "day",
  "maxPosts": 25,
  "includeComments": false,
  "includeRules": false,
  "includeSubredditInfo": false,
  "maxCommentsPerPost": 50,
  "commentDepth": 3,
  "minCommentScore": -1000,
  "excludeRemoved": false,
  "nsfwFilter": "include",
  "ideaMiningMode": false,
  "painPhrasePack": "saas",
  "customPainPhrases": [
    "I wish there was",
    "looking for a tool",
    "tired of doing"
  ],
  "userContentType": "both",
  "maxItemsPerUser": 25,
  "mentionMonitorMode": false,
  "sentimentClassification": false,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}
```

# Actor output Schema

## `dataset` (type: `string`):

Full dataset of scraped posts and, optionally, comments and subreddit rules.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "subreddits": [
        "python",
        "webscraping"
    ],
    "customPainPhrases": [
        "I wish there was",
        "looking for a tool",
        "tired of doing"
    ],
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ]
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("dizilus/reddit-pulse").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "subreddits": [
        "python",
        "webscraping",
    ],
    "customPainPhrases": [
        "I wish there was",
        "looking for a tool",
        "tired of doing",
    ],
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}

# Run the Actor and wait for it to finish
run = client.actor("dizilus/reddit-pulse").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "subreddits": [
    "python",
    "webscraping"
  ],
  "customPainPhrases": [
    "I wish there was",
    "looking for a tool",
    "tired of doing"
  ],
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}' |
apify call dizilus/reddit-pulse --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=dizilus/reddit-pulse",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Reddit Pulse",
        "description": "Extract Reddit posts, comments, users, and subreddit metadata at scale — and turn them into structured signals. Find your next SaaS idea, monitor brand mentions, mine pain points, or feed an AI / RAG pipeline. No Reddit API key, no login, no developer registration.",
        "version": "0.10",
        "x-build-id": "Iy42FC5ocMuGnMNxs"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/dizilus~reddit-pulse/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-dizilus-reddit-pulse",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/dizilus~reddit-pulse/runs": {
            "post": {
                "operationId": "runs-sync-dizilus-reddit-pulse",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/dizilus~reddit-pulse/run-sync": {
            "post": {
                "operationId": "run-sync-dizilus-reddit-pulse",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "subreddits": {
                        "title": "Subreddits to scrape",
                        "type": "array",
                        "description": "List of subreddit names (without r/). Example: ['python', 'webscraping']. Optional if you provide 'searchQuery'.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "searchQuery": {
                        "title": "Search query",
                        "type": "string",
                        "description": "If set, the Actor searches Reddit for this query instead of browsing subreddit sort modes. Combine with 'searchRestrictToSubreddits' to limit the search to your subreddits list."
                    },
                    "searchSort": {
                        "title": "Search sort",
                        "enum": [
                            "relevance",
                            "new",
                            "top",
                            "comments",
                            "hot"
                        ],
                        "type": "string",
                        "description": "How to sort search results (only used if 'searchQuery' is set).",
                        "default": "relevance"
                    },
                    "searchRestrictToSubreddits": {
                        "title": "Restrict search to listed subreddits",
                        "type": "boolean",
                        "description": "If true and 'subreddits' is provided, search runs only inside those subreddits instead of all of Reddit.",
                        "default": false
                    },
                    "sort": {
                        "title": "Sort by",
                        "enum": [
                            "hot",
                            "new",
                            "top",
                            "rising"
                        ],
                        "type": "string",
                        "description": "How to sort posts",
                        "default": "hot"
                    },
                    "timeRange": {
                        "title": "Time range (only for 'top')",
                        "enum": [
                            "hour",
                            "day",
                            "week",
                            "month",
                            "year",
                            "all"
                        ],
                        "type": "string",
                        "description": "Time window for the 'top' sort mode. Ignored for other sort modes.",
                        "default": "day"
                    },
                    "maxPosts": {
                        "title": "Max posts per subreddit",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of posts to scrape from each subreddit",
                        "default": 25
                    },
                    "includeComments": {
                        "title": "Include comments",
                        "type": "boolean",
                        "description": "Scrape comments for each post. Depth and volume are controlled by 'commentDepth' and 'maxCommentsPerPost'.",
                        "default": false
                    },
                    "includeRules": {
                        "title": "Include subreddit rules",
                        "type": "boolean",
                        "description": "Also scrape the rules of each subreddit (one row per subreddit, with the full rules list).",
                        "default": false
                    },
                    "includeSubredditInfo": {
                        "title": "Include subreddit metadata",
                        "type": "boolean",
                        "description": "Also scrape each subreddit's about page: subscriber count, active users, description, icon, creation date, NSFW flag, submission rules, etc. One row per subreddit.",
                        "default": false
                    },
                    "maxCommentsPerPost": {
                        "title": "Max comments per post",
                        "minimum": 1,
                        "maximum": 500,
                        "type": "integer",
                        "description": "Maximum number of comments (across all depth levels) per post. Only used if 'includeComments' is true.",
                        "default": 50
                    },
                    "commentDepth": {
                        "title": "Comment nesting depth",
                        "minimum": 1,
                        "maximum": 10,
                        "type": "integer",
                        "description": "How deep to traverse reply threads. 1 = top-level only (legacy behaviour), 2 = top-level + their direct replies, etc. Default 3 covers the bulk of meaningful discussion.",
                        "default": 3
                    },
                    "minCommentScore": {
                        "title": "Minimum comment score",
                        "type": "integer",
                        "description": "Skip comments with score below this threshold. Use a small positive number (e.g. 1) to filter out downvoted and removed comments.",
                        "default": -1000
                    },
                    "excludeRemoved": {
                        "title": "Exclude removed / deleted content",
                        "type": "boolean",
                        "description": "If true, skip posts and comments that were removed by moderators or deleted by their author. Saves money on per-event charges.",
                        "default": false
                    },
                    "nsfwFilter": {
                        "title": "NSFW filter",
                        "enum": [
                            "include",
                            "exclude",
                            "only"
                        ],
                        "type": "string",
                        "description": "How to handle NSFW (over-18) posts. 'include' = scrape everything (default), 'exclude' = skip NSFW posts and their comments, 'only' = scrape NSFW posts only.",
                        "default": "include"
                    },
                    "ideaMiningMode": {
                        "title": "SaaS idea-mining mode",
                        "type": "boolean",
                        "description": "Turn the Actor into a structured pain-point finder. Filters posts to those matching pain phrases (e.g. 'I wish there was', 'looking for a tool'), computes a demand score, and tags intent tier. Uses a curated subreddit list if you don't supply one.",
                        "default": false
                    },
                    "painPhrasePack": {
                        "title": "Pain phrase pack",
                        "enum": [
                            "saas",
                            "leads",
                            "feature-request",
                            "custom"
                        ],
                        "type": "string",
                        "description": "Pre-built keyword pack for idea-mining mode. 'saas' = founder/builder pain phrases, 'leads' = buying-intent phrases for lead-gen, 'feature-request' = product feedback phrases, 'custom' = use your own list from 'customPainPhrases'.",
                        "default": "saas"
                    },
                    "customPainPhrases": {
                        "title": "Custom pain phrases",
                        "type": "array",
                        "description": "Used when painPhrasePack='custom'. Case-insensitive substring match against post title + selftext.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "users": {
                        "title": "Users to scrape",
                        "type": "array",
                        "description": "List of Reddit usernames (without u/) whose posts and/or comments should be scraped. Independent of the 'subreddits' field.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "userContentType": {
                        "title": "User content to scrape",
                        "enum": [
                            "posts",
                            "comments",
                            "both"
                        ],
                        "type": "string",
                        "description": "Which type of content to fetch for each user. Used only when 'users' is set.",
                        "default": "both"
                    },
                    "maxItemsPerUser": {
                        "title": "Max items per user",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum posts or comments to scrape per user (each type is capped separately if userContentType='both').",
                        "default": 25
                    },
                    "mentionMonitorMode": {
                        "title": "Mention / brand monitor mode",
                        "type": "boolean",
                        "description": "Track one or more keywords across all of Reddit. Useful for brand/competitor mention monitoring. Requires 'mentionTerms' to be set.",
                        "default": false
                    },
                    "mentionTerms": {
                        "title": "Mention terms",
                        "type": "array",
                        "description": "Keywords or brand names to track. The Actor runs one search per term and tags each matching post with the term that produced it.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "sentimentClassification": {
                        "title": "Add sentiment classification",
                        "type": "boolean",
                        "description": "Tag each post with a basic sentiment label (positive / negative / neutral) based on keyword heuristics. Lightweight, no LLM cost.",
                        "default": false
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Recommended: use Apify Proxy with residential groups"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
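
The specification above can also be exercised without a client library. The sketch below relies only on the server URL, the `run-sync-get-dataset-items` path, the `token` query parameter, and a subset of the bounds and enums listed in `inputSchema`. It checks a few input fields locally and builds the POST request with Python's standard library. `check_input`, `build_request`, and `fetch_items` are hypothetical helper names, the checks are not a full schema validation, and treating the response body as a JSON array of dataset items is an assumption that the specification itself does not state.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/dizilus~reddit-pulse/run-sync-get-dataset-items"

# Bounds and enums copied from inputSchema (subset only).
INT_BOUNDS = {
    "maxPosts": (1, 1000),
    "maxCommentsPerPost": (1, 500),
    "commentDepth": (1, 10),
    "maxItemsPerUser": (1, 1000),
}
ENUMS = {
    "sort": {"hot", "new", "top", "rising"},
    "timeRange": {"hour", "day", "week", "month", "year", "all"},
    "nsfwFilter": {"include", "exclude", "only"},
}


def check_input(run_input):
    """Return a list of problems found in run_input (empty if none)."""
    problems = []
    for key, (low, high) in INT_BOUNDS.items():
        value = run_input.get(key)
        if value is None:
            continue
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not (is_int and low <= value <= high):
            problems.append(f"{key} must be an integer in [{low}, {high}]")
    for key, allowed in ENUMS.items():
        value = run_input.get(key)
        if value is not None and value not in allowed:
            problems.append(f"{key} must be one of {sorted(allowed)}")
    return problems


def build_request(run_input, token):
    """Build (but do not send) the POST request described in the spec."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{BASE_URL}{ACTOR_PATH}?{query}"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_items(request, timeout=300):
    """Send the request; assumes the response body is a JSON array of items."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


run_input = {"subreddits": ["python"], "sort": "top", "timeRange": "week", "maxPosts": 50}
problems = check_input(run_input)
request = build_request(run_input, "<YOUR_API_TOKEN>")
print(problems, request.get_method(), request.full_url)
```

With a real token in place of the placeholder, `fetch_items(request)` sends the call and blocks until the run finishes or the timeout expires. Long runs may be better served by the asynchronous `/runs` endpoint listed in the same specification.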
