# Reddit Scraper Pro - Posts, Comments, Sentiment (`nexcrawl/reddit-scraper-pro`) Actor

Scrape Reddit posts, comments, communities and users without login. Adds sentiment analysis, trend velocity scoring, and keyword-based search alerts not found in other Reddit scrapers.

- **URL**: https://apify.com/nexcrawl/reddit-scraper-pro.md
- **Developed by:** [Next Crawl](https://apify.com/nexcrawl) (community)
- **Categories:** Lead generation
- **Stats:** 2 total users, 1 monthly users, 0.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor is free to use, and you only pay for the Apify platform usage, which gets cheaper the higher subscription plan you have.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Reddit Scraper Pro

Scrapes Reddit posts, comments, communities and user profiles through Reddit's official API, authenticating with free OAuth credentials rather than requesting the anonymous `.json` endpoints, which Reddit blocks aggressively from shared or datacenter IPs (including Apify's free-tier infrastructure).

### Important: get free Reddit API credentials (2 minutes, fixes 403 blocks)

Without credentials, this Actor falls back to anonymous requests to `www.reddit.com/*.json`, which **will get blocked with 403 errors** on Apify's free plan: that plan has no proxy access, and Reddit blocks shared datacenter IPs on sight. This isn't a bug in the Actor; it's how Reddit treats anonymous scraper traffic from common cloud IP ranges.

The fix is free and takes about 2 minutes:

1. Go to **https://www.reddit.com/prefs/apps**
2. Click **"create app"** (or "create another app") at the bottom
3. Fill in:
   - **name**: anything, e.g. "my-scraper"
   - **type**: select **script**
   - **redirect uri**: `http://localhost:8080` (required field, not actually used)
4. Click **create app**
5. You'll see two values:
   - **Client ID** — the string under the app name (looks like `Ax3kf9_LplmnO`)
   - **Client Secret** — labeled "secret"
6. Paste both into this Actor's input fields: `redditClientId` and `redditClientSecret`

Once these are set, the Actor authenticates with Reddit's official API (`oauth.reddit.com`) instead of scraping the public site, which avoids the 403 blocks described above. This is also free for normal scraping volumes (Reddit's API rate limit is generous for a single script app).
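
To keep the Reddit credentials out of source code, one option is to read them from environment variables and add them to the Actor input only when both are present. This is a minimal sketch: the variable names `REDDIT_CLIENT_ID` and `REDDIT_CLIENT_SECRET` and the helper `build_run_input` are hypothetical, while `redditClientId` and `redditClientSecret` are the input fields named in step 6.

```python
import os


def build_run_input(start_urls, env=os.environ):
    """Build Actor input, adding Reddit credentials only when both are set."""
    run_input = {"startUrls": list(start_urls)}
    client_id = env.get("REDDIT_CLIENT_ID")          # hypothetical variable name
    client_secret = env.get("REDDIT_CLIENT_SECRET")  # hypothetical variable name
    if client_id and client_secret:
        run_input["redditClientId"] = client_id
        run_input["redditClientSecret"] = client_secret
    return run_input


if __name__ == "__main__":
    # The secret is printed here only for illustration; avoid logging it in real runs.
    print(build_run_input(["https://www.reddit.com/r/shopify/"]))
```

If either variable is missing, the returned input carries no credentials and the run proceeds in anonymous mode.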

### What's new compared to standard Reddit scrapers

#### 1. Keyword search / brand monitoring mode
Instead of only scraping URLs you already have, give it keywords and it searches Reddit (or a specific subreddit) for matching posts. Useful for:
- Brand or product mention monitoring
- Market research ("what are people saying about X")
- Lead generation (find people asking for recommendations in your niche)

#### 2. Built-in sentiment analysis
Every post and comment gets scored as `positive`, `negative`, or `neutral` using lexicon-based sentiment scoring. It needs no external API, adds no cost, and runs entirely inside the Actor. Useful for quickly spotting community sentiment shifts without reading every comment.
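
For readers unfamiliar with lexicon-based scoring, the general idea can be sketched as follows. The word lists, the tokenizer, and the function `label_sentiment` are illustrative assumptions; the Actor's actual lexicon and thresholds are not documented here.

```python
import re

# Tiny illustrative lexicon; the Actor's real word list is not documented.
POSITIVE = {"great", "love", "good", "excellent", "helpful"}
NEGATIVE = {"bad", "hate", "terrible", "broken", "awful"}


def label_sentiment(text):
    """Count positive and negative words and map the balance to a label."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(label_sentiment("I love this app, great support"))
```

A scorer like this ignores negation and sarcasm, which is the usual trade-off of the lexicon approach.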

#### 3. Trend velocity score
Calculates **upvotes per hour** for each post. This surfaces content that is rapidly gaining traction right now, not just old posts that accumulated a high score over time, which makes it a better signal for "what's happening right now" than raw score alone.
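
An upvotes-per-hour figure can be computed from a post's score and creation time. The sketch below assumes a Unix creation timestamp in seconds (as in Reddit's `created_utc` field) and floors the age at one minute to avoid division by zero; both the function name `upvotes_per_hour` and the floor are assumptions, not the Actor's documented formula.

```python
from datetime import datetime, timezone


def upvotes_per_hour(score, created_utc, now=None):
    """Return score divided by post age in hours (age floored at one minute)."""
    now = now or datetime.now(timezone.utc)
    created = datetime.fromtimestamp(created_utc, tz=timezone.utc)
    age_hours = max((now - created).total_seconds() / 3600, 1 / 60)
    return score / age_hours


if __name__ == "__main__":
    now = datetime(2025, 1, 8, 12, 0, tzinfo=timezone.utc)
    fresh = datetime(2025, 1, 8, 8, 0, tzinfo=timezone.utc).timestamp()
    old = datetime(2024, 12, 29, 12, 0, tzinfo=timezone.utc).timestamp()
    print(upvotes_per_hour(200, fresh, now))   # 200 upvotes in 4 hours
    print(upvotes_per_hour(2400, old, now))    # 2400 upvotes in 10 days
```

In this example the four-hour-old post with 200 upvotes ranks above the ten-day-old post with 2400, even though its raw score is lower.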

### What it scrapes

| Record type | Fields |
|---|---|
| `post` | title, author, subreddit, selftext, score, upvote_ratio, num_comments, sentiment, trend score, media |
| `comment` | body, author, score, depth, sentiment |
| `user` | karma breakdown, account age, mod/gold status |
| `community` | subscriber count, active users, description |

### Input example

```json
{
  "startUrls": ["https://www.reddit.com/r/shopify/"],
  "searchKeywords": [
    { "keyword": "best shopify app for", "subreddit": "shopify" }
  ],
  "sortBy": "top",
  "timeframe": "week",
  "maxPostsPerSource": 50,
  "enableSentiment": true,
  "enableTrendScore": true
}
````

### Notes

- Anonymous mode needs no login, cookies, or API key and uses Reddit's public JSON endpoints; for reliable runs, supply the free OAuth credentials described above
- Respect Reddit's rate limits; the Actor uses moderate concurrency by default
- Sentiment analysis is lexicon-based (fast, free, no API calls). It is not as nuanced as an LLM, but useful for quick directional signal at scale

# Actor input Schema

## `startUrls` (type: `array`):

Reddit URLs to scrape: post links, subreddit links (r/...), or user profile links (u/...). Leave empty if using Search keywords instead.

## `searchKeywords` (type: `array`):

Search Reddit (or a specific subreddit) for posts matching these keywords. Great for market research, brand monitoring, or lead generation. Leave empty if using Start URLs instead.

## `redditClientId` (type: `string`):

Free Reddit API credential that removes 403 blocks entirely. Get one in 2 minutes: go to reddit.com/prefs/apps, click 'create app', choose type 'script', then copy the ID shown under the app name. Leave blank to run anonymously (much slower, often blocked on shared IPs).

## `redditClientSecret` (type: `string`):

The 'secret' value shown on the same Reddit app page as your Client ID.

## `sortBy` (type: `string`):

How to sort posts when scraping a community or running a keyword search.

## `timeframe` (type: `string`):

Time window to use when Sort by is set to Top.

## `maxPostsPerSource` (type: `integer`):

Maximum number of posts to scrape per community, user, or keyword search.

## `maxCommentsPerPost` (type: `integer`):

Maximum number of comments to scrape per individual post.

## `skipComments` (type: `boolean`):

If enabled, comments will not be scraped from posts — only post metadata.

## `skipUserPosts` (type: `boolean`):

If enabled, only user profile info is scraped, not their submitted posts.

## `skipCommunityInfo` (type: `boolean`):

If enabled, subreddit metadata (subscribers, description) will not be scraped.

## `includeMediaAndScores` (type: `boolean`):

If enabled, includes score, upvote ratio, and media URLs in the output.

## `enableSentiment` (type: `boolean`):

Scores each post and comment as positive, negative, or neutral using lexicon-based sentiment analysis. Useful for brand monitoring and community mood tracking.

## `enableTrendScore` (type: `boolean`):

Calculates upvotes-per-hour for each post, helping you spot rapidly rising content rather than just high-score content.

## `proxyConfiguration` (type: `object`):

Optional. Leave empty if you're on the Apify free plan with no proxy access — the actor will run direct with slow, careful pacing instead. If you do have Apify Proxy or your own proxy, add it here for faster, more reliable scraping.

## Actor input object example

```json
{
  "startUrls": [],
  "searchKeywords": [],
  "sortBy": "new",
  "timeframe": "week",
  "maxPostsPerSource": 50,
  "maxCommentsPerPost": 50,
  "skipComments": false,
  "skipUserPosts": false,
  "skipCommunityInfo": false,
  "includeMediaAndScores": true,
  "enableSentiment": true,
  "enableTrendScore": true
}
```
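
Before starting a run, it can help to check the input locally against the limits listed in the OpenAPI specification later in this document (the `sortBy` and `timeframe` enums and the 1 to 500 and 0 to 500 ranges). The helper `validate_input` below is a hypothetical sketch, and the rule that at least one of `startUrls` or `searchKeywords` must be non-empty is an assumption drawn from the field descriptions, not a documented requirement.

```python
SORT_BY = {"new", "hot", "top", "rising"}
TIMEFRAME = {"hour", "day", "week", "month", "year", "all"}
# (minimum, maximum) bounds copied from the OpenAPI specification
INT_RANGES = {"maxPostsPerSource": (1, 500), "maxCommentsPerPost": (0, 500)}


def validate_input(run_input):
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    # Assumption: a run needs at least one source to return any data.
    if not run_input.get("startUrls") and not run_input.get("searchKeywords"):
        problems.append("set startUrls or searchKeywords")
    if run_input.get("sortBy", "new") not in SORT_BY:
        problems.append("sortBy must be one of " + ", ".join(sorted(SORT_BY)))
    if run_input.get("timeframe", "week") not in TIMEFRAME:
        problems.append("timeframe must be one of " + ", ".join(sorted(TIMEFRAME)))
    for key, (low, high) in INT_RANGES.items():
        if key not in run_input:
            continue  # the schema default (50) applies
        value = run_input[key]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int) or not low <= value <= high:
            problems.append(f"{key} must be an integer from {low} to {high}")
    return problems


if __name__ == "__main__":
    print(validate_input({"startUrls": [], "sortBy": "best", "maxPostsPerSource": 0}))
```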

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [],
    "searchKeywords": []
};

// Run the Actor and wait for it to finish
const run = await client.actor("nexcrawl/reddit-scraper-pro").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [],
    "searchKeywords": [],
}

# Run the Actor and wait for it to finish
run = client.actor("nexcrawl/reddit-scraper-pro").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [],
  "searchKeywords": []
}' |
apify call nexcrawl/reddit-scraper-pro --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=nexcrawl/reddit-scraper-pro",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Reddit Scraper Pro - Posts, Comments, Sentiment",
        "description": "Scrape Reddit posts, comments, communities and users without login. Adds sentiment analysis, trend velocity scoring, and keyword-based search alerts not found in other Reddit scrapers.",
        "version": "0.1",
        "x-build-id": "kuOKnsR0pzQI5Wfwb"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/nexcrawl~reddit-scraper-pro/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-nexcrawl-reddit-scraper-pro",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/nexcrawl~reddit-scraper-pro/runs": {
            "post": {
                "operationId": "runs-sync-nexcrawl-reddit-scraper-pro",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/nexcrawl~reddit-scraper-pro/run-sync": {
            "post": {
                "operationId": "run-sync-nexcrawl-reddit-scraper-pro",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "Reddit URLs to scrape: post links, subreddit links (r/...), or user profile links (u/...). Leave empty if using Search keywords instead.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "searchKeywords": {
                        "title": "Search keywords (new)",
                        "type": "array",
                        "description": "Search Reddit (or a specific subreddit) for posts matching these keywords. Great for market research, brand monitoring, or lead generation. Leave empty if using Start URLs instead.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "redditClientId": {
                        "title": "Reddit API Client ID (recommended)",
                        "type": "string",
                        "description": "Free Reddit API credential that removes 403 blocks entirely. Get one in 2 minutes: go to reddit.com/prefs/apps, click 'create app', choose type 'script', then copy the ID shown under the app name. Leave blank to run anonymously (much slower, often blocked on shared IPs)."
                    },
                    "redditClientSecret": {
                        "title": "Reddit API Client Secret (recommended)",
                        "type": "string",
                        "description": "The 'secret' value shown on the same Reddit app page as your Client ID."
                    },
                    "sortBy": {
                        "title": "Sort by",
                        "enum": [
                            "new",
                            "hot",
                            "top",
                            "rising"
                        ],
                        "type": "string",
                        "description": "How to sort posts when scraping a community or running a keyword search.",
                        "default": "new"
                    },
                    "timeframe": {
                        "title": "Timeframe (for Top sort)",
                        "enum": [
                            "hour",
                            "day",
                            "week",
                            "month",
                            "year",
                            "all"
                        ],
                        "type": "string",
                        "description": "Time window to use when Sort by is set to Top.",
                        "default": "week"
                    },
                    "maxPostsPerSource": {
                        "title": "Max posts per source",
                        "minimum": 1,
                        "maximum": 500,
                        "type": "integer",
                        "description": "Maximum number of posts to scrape per community, user, or keyword search.",
                        "default": 50
                    },
                    "maxCommentsPerPost": {
                        "title": "Max comments per post",
                        "minimum": 0,
                        "maximum": 500,
                        "type": "integer",
                        "description": "Maximum number of comments to scrape per individual post.",
                        "default": 50
                    },
                    "skipComments": {
                        "title": "Skip comments",
                        "type": "boolean",
                        "description": "If enabled, comments will not be scraped from posts — only post metadata.",
                        "default": false
                    },
                    "skipUserPosts": {
                        "title": "Skip user posts",
                        "type": "boolean",
                        "description": "If enabled, only user profile info is scraped, not their submitted posts.",
                        "default": false
                    },
                    "skipCommunityInfo": {
                        "title": "Skip community info",
                        "type": "boolean",
                        "description": "If enabled, subreddit metadata (subscribers, description) will not be scraped.",
                        "default": false
                    },
                    "includeMediaAndScores": {
                        "title": "Include media links, upvotes and comment count",
                        "type": "boolean",
                        "description": "If enabled, includes score, upvote ratio, and media URLs in the output.",
                        "default": true
                    },
                    "enableSentiment": {
                        "title": "Enable sentiment analysis (new)",
                        "type": "boolean",
                        "description": "Scores each post and comment as positive, negative, or neutral using lexicon-based sentiment analysis. Useful for brand monitoring and community mood tracking.",
                        "default": true
                    },
                    "enableTrendScore": {
                        "title": "Enable trend velocity score (new)",
                        "type": "boolean",
                        "description": "Calculates upvotes-per-hour for each post, helping you spot rapidly rising content rather than just high-score content.",
                        "default": true
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration (optional)",
                        "type": "object",
                        "description": "Optional. Leave empty if you're on the Apify free plan with no proxy access — the actor will run direct with slow, careful pacing instead. If you do have Apify Proxy or your own proxy, add it here for faster, more reliable scraping."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
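
For projects that cannot use the client libraries, the `run-sync-get-dataset-items` path from the specification above can be called with the Python standard library alone. This is a sketch under stated assumptions: the helper names `build_sync_request` and `run_sync` are hypothetical, the 300-second timeout is an arbitrary choice, and the response body is assumed to be JSON.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "nexcrawl~reddit-scraper-pro"


def build_sync_request(run_input, token):
    """Build the POST request for the run-sync-get-dataset-items endpoint."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(run_input, token, timeout=300):
    """Send the request and parse the response body as JSON (assumed format)."""
    request = build_sync_request(run_input, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only builds the request; call run_sync(...) with a real token to execute it.
    req = build_sync_request({"startUrls": ["https://www.reddit.com/r/shopify/"]}, "<YOUR_API_TOKEN>")
    print(req.get_method(), req.full_url)
```

Because the call blocks until the run finishes, long scrapes may be better served by the `/runs` endpoint, which returns immediately with run information.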
