# Reddit Keywords Pro (`crawlerbros/reddit-keywords-pro`) Actor

Search Reddit by keywords with advanced filters like subredditFilter, subredditBlocklist, dateFrom/dateTo, minScore, maxAgeDays, excludeNsfw, authorBlocklist, keywordRequireAll.

- **URL**: https://apify.com/crawlerbros/reddit-keywords-pro.md
- **Developed by:** [Crawler Bros](https://apify.com/crawlerbros) (community)
- **Categories:** Social media, Lead generation, Developer tools
- **Stats:** 1 total users, 0 monthly users, 100.0% runs succeeded, 13 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $1.00 / 1,000 results

This Actor is paid per event and usage. You are charged a fixed price for specific events plus Apify platform usage.
Because this Actor supports Apify Store discounts, the price decreases with higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Reddit Keywords Scraper

An **Apify Actor** built with **Python + Playwright** to automatically **search and scrape Reddit posts by specific keywords**.  
Perfect for **market research**, **trend analysis**, and **content discovery** — all without relying on the Reddit API.

### Features

- 🔍 **Search by Keywords:** Find Reddit posts matching specific words or phrases
- 🧩 **Multiple Keywords:** Search multiple keywords in a single run
- 📊 **Detailed Post Data:** Extract post titles, authors, scores, timestamps, flairs, and more
- ⚙️ **Sorting Options:** Sort by `relevance`, `hot`, `top`, `new`, or `comments`
- 📈 **Result Limit Control:** Fetch up to 1000 posts per keyword
- 🌐 **Browser-Based Crawling:** Avoid API limits using Playwright automation
- 💾 **Structured JSON Output:** Get clean, ready-to-use JSON data
- 🔄 **Pagination Support:** Automatically loads additional search results

### 🧠 Use Cases

Use the **Reddit Keywords Scraper** to:

- 📢 **Perform Market Research** — Track discussions around your product, brand, or niche
- 💬 **Conduct Sentiment Analysis** — Collect Reddit data for NLP or public opinion analysis
- 📰 **Discover Trending Content** — Identify viral or high-performing posts
- 📈 **Monitor Trends** — Follow emerging topics across subreddits
- 🕵️ **Gather Competitive Intelligence** — Find mentions of competitors or products
- 🎓 **Support Academic Research** — Build Reddit-based datasets for studies

### Input Parameters

The Actor accepts the following core input parameters. The advanced filters (`subredditFilter`, `subredditBlocklist`, `minScore`, `maxAgeDays`, `excludeNsfw`, `authorBlocklist`, `keywordRequireAll`, `dateFrom`, `dateTo`) are described in the Actor input Schema section.

| Parameter     | Type    | Required | Default     | Description                                               |
| ------------- | ------- | -------- | ----------- | --------------------------------------------------------- |
| `keywords`    | array   | Yes      | -           | List of keywords to search for on Reddit                  |
| `resultLimit` | integer | No       | `500`       | Maximum number of results to return per keyword (1-1000)  |
| `sort`        | string  | No       | `relevance` | Sort method: `relevance`, `hot`, `top`, `new`, `comments` |

#### Example Input

```json
{
  "keywords": ["python programming", "machine learning", "web scraping"],
  "resultLimit": 50,
  "sort": "top"
}
````

#### Sorting Options

- **`relevance`** - Most relevant to the search query (default)
- **`hot`** - Currently trending/hot posts
- **`top`** - Top-scoring posts of all time
- **`new`** - Newest posts first
- **`comments`** - Posts with the most comments

### 📤 Output Data Fields

Each Reddit post result includes:

| Field             | Description                                            |
| ----------------- | ------------------------------------------------------ |
| `keyword`         | The searched keyword that matched this post            |
| `post_id`         | Unique Reddit post ID (e.g., `1ntdzya`)                |
| `post_name`       | Full Reddit post identifier (e.g., `t3_1ntdzya`)       |
| `title`           | Post title                                             |
| `author`          | Username of the author (`[deleted]` if removed)        |
| `subreddit`       | Subreddit name                                         |
| `content`         | Text content for self/text posts (null for link posts) |
| `score`           | Post score/upvotes                                     |
| `num_comments`    | Number of comments                                     |
| `url`             | Direct link to the post                                |
| `old_reddit_url`  | Link to the old Reddit version of the post             |
| `thumbnail_image` | Thumbnail image URL (if available)                     |
| `link_flair`      | Post flair text                                        |
| `created_utc`     | Unix timestamp when the post was created               |
| `created_at`      | ISO 8601 formatted creation date/time                  |
| `is_stickied`     | Boolean indicating if the post is pinned               |
| `is_nsfw`         | Boolean indicating if the post is marked NSFW          |

#### 🧾 Example Output

```json
{
  "keyword": "putin",
  "post_id": "1ntdzya",
  "post_name": "t3_1ntdzya",
  "title": "All hopes of reasoning with Putin are gone - war is coming",
  "author": "theipaper",
  "subreddit": "r/geopolitics",
  "content": null,
  "score": 743,
  "num_comments": 354,
  "url": "https://reddit.com/r/geopolitics/comments/1ntdzya/all_hopes_of_reasoning_with_putin_are_gone_war_is/",
  "old_reddit_url": "https://old.reddit.com/r/geopolitics/comments/1ntdzya/all_hopes_of_reasoning_with_putin_are_gone_war_is/",
  "thumbnail_image": "https://external-preview.redd.it/gBVLJovs7dZbfvD14Q2tZiSlEMt_26iIluaegGWD-mM.jpeg",
  "link_flair": null,
  "created_utc": 1759140117,
  "created_at": "2025-09-29T10:01:57+00:00",
  "is_stickied": false,
  "is_nsfw": false
}
```
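As a rough illustration of consuming these fields, the sketch below works on a trimmed copy of the example item. The helper names (`created_datetime`, `is_text_post`, `posts_per_subreddit`) are hypothetical and not part of the Actor; only the field names come from the table above.

```python
from collections import Counter
from datetime import datetime, timezone

# Trimmed copy of the example output item shown above.
item = {
    "keyword": "putin",
    "post_id": "1ntdzya",
    "post_name": "t3_1ntdzya",
    "author": "theipaper",
    "subreddit": "r/geopolitics",
    "content": None,
    "score": 743,
    "num_comments": 354,
    "created_utc": 1759140117,
    "created_at": "2025-09-29T10:01:57+00:00",
    "is_nsfw": False,
}


def created_datetime(post: dict) -> datetime:
    """Convert the Unix timestamp in `created_utc` to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(post["created_utc"], tz=timezone.utc)


def is_text_post(post: dict) -> bool:
    """Self/text posts carry `content`; link posts have it set to null."""
    return post.get("content") is not None


def posts_per_subreddit(posts: list[dict]) -> Counter:
    """Count posts per subreddit (hypothetical helper for quick summaries)."""
    return Counter(post["subreddit"] for post in posts)


print(created_datetime(item).isoformat())  # same instant as item["created_at"]
print(is_text_post(item))
print(posts_per_subreddit([item]))
```

`created_utc` and `created_at` describe the same instant, so either can be used for date arithmetic; the Unix timestamp avoids string parsing.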

### Usage

#### Local Development

1. **Install dependencies**:

   ```bash
   pip install -r requirements.txt
   playwright install chromium
   ```

2. **Set up input**:

   Create a `.actor/INPUT.json` file or use environment variables:

   ```json
   {
     "keywords": ["python programming", "machine learning"],
     "resultLimit": 10,
     "sort": "relevance"
   }
   ```

3. **Run locally**:

   ```bash
   apify run
   ```

#### Running on Apify Platform

1. **Create an Actor** on the [Apify Platform](https://console.apify.com/)
2. **Upload the code** or connect your Git repository
3. **Configure the input** in the Actor's input tab
4. **Run the Actor** and view results in the dataset

#### Example: Searching Multiple Topics

```json
{
  "keywords": [
    "artificial intelligence",
    "climate change",
    "space exploration",
    "cryptocurrency"
  ],
  "resultLimit": 100,
  "sort": "hot"
}
```

This will search for 100 hot posts about each topic and return a total of up to 400 posts.
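The upper bound follows from one search per keyword, each capped at `resultLimit`. A minimal sketch of that arithmetic (the function name `max_total_posts` is hypothetical):

```python
def max_total_posts(keywords: list[str], result_limit: int) -> int:
    """Upper bound on returned posts: one search per keyword, each capped at result_limit."""
    return len(keywords) * result_limit


keywords = [
    "artificial intelligence",
    "climate change",
    "space exploration",
    "cryptocurrency",
]
print(max_total_posts(keywords, 100))  # 400
```

The actual count can be lower when a search yields fewer posts than the limit or when filters drop results.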

### Notes

- The Actor uses old.reddit.com for better reliability and performance
- Results are limited to publicly accessible posts
- Rate limiting is handled with delays between searches
- Some very old posts may have incomplete data
- The Actor respects Reddit's structure and uses browser automation

### Technical Details

- **Runtime**: Python 3.12
- **Browser**: Chromium (via Playwright)
- **HTML Parser**: BeautifulSoup4
- **Framework**: Apify SDK

### Limitations

- Maximum result limit per keyword: 1000 posts
- Only returns posts (not comments) from search results
- Requires stable internet connection
- Processing time increases with result limit and number of keywords

### Troubleshooting

#### No Results Found

- Verify the keywords are spelled correctly
- Try more general or popular keywords
- Check if Reddit is accessible in your region
- Try a different sort method

#### Incomplete Data

- Some older posts may have missing fields
- Deleted posts will show `[deleted]` as author
- Private/removed posts won't appear in results

### Related Actors

- **Reddit Comment Scraper** - Scrape comments from specific Reddit posts
- **Reddit Profile Scraper** - Scrape posts from specific user profiles
- **Reddit Scraper** - General Reddit data scraper

### Support

For issues, questions, or contributions, please open an issue in the repository.

### License

This Actor is provided as-is for use on the Apify platform.


# Actor input Schema

## `keywords` (type: `array`):

List of keywords to search. Each runs as a separate Reddit search.

## `resultLimit` (type: `integer`):

Max posts per keyword (1-1000).

## `sort` (type: `string`):

How to sort the search results.

## `subredditFilter` (type: `array`):

Only emit posts from these subreddits (without `r/` prefix). Empty = all.

## `subredditBlocklist` (type: `array`):

Drop posts from these subreddits.

## `minScore` (type: `integer`):

Drop posts with score below this number.

## `maxAgeDays` (type: `integer`):

Drop posts older than N days.

## `excludeNsfw` (type: `boolean`):

Drop NSFW posts.

## `authorBlocklist` (type: `array`):

Drop posts by these usernames (case-insensitive).

## `keywordRequireAll` (type: `boolean`):

When multiple keywords are given, deduplicate results and emit only posts whose title or content contains ALL of them. Default: each keyword runs independently (OR).

## `dateFrom` (type: `string`):

Drop posts whose `created_at` is before this date. The bound is inclusive, so posts from the date itself are kept. Format: YYYY-MM-DD.

## `dateTo` (type: `string`):

Drop posts whose `created_at` is after this date. The bound is inclusive through end-of-day, so posts from the date itself are kept. Format: YYYY-MM-DD.
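Taken together, the filter descriptions in this section suggest semantics roughly like the sketch below. This is an illustration of one plausible reading, not the Actor's actual code: the helper names (`keep_post`, `contains_all`, `apply_filters`) are hypothetical, and exact subreddit matching, case-insensitive keyword matching, and deduplication by `post_id` are all assumptions.

```python
from datetime import date, datetime, timedelta, timezone


def keep_post(post: dict, cfg: dict, now: datetime) -> bool:
    """Return True if `post` passes every per-post filter set in `cfg` (illustrative)."""
    # Output subreddits carry an `r/` prefix; filter entries do not.
    # Assumption: exact match, since case handling is not documented.
    sub = post["subreddit"].removeprefix("r/")
    allow = cfg.get("subredditFilter", [])
    if allow and sub not in allow:
        return False
    if sub in cfg.get("subredditBlocklist", []):
        return False
    if "minScore" in cfg and post["score"] < cfg["minScore"]:
        return False
    if cfg.get("excludeNsfw") and post["is_nsfw"]:
        return False
    # Author matching is documented as case-insensitive.
    blocked = {name.lower() for name in cfg.get("authorBlocklist", [])}
    if post["author"].lower() in blocked:
        return False
    created = datetime.fromisoformat(post["created_at"])
    if "maxAgeDays" in cfg and created < now - timedelta(days=cfg["maxAgeDays"]):
        return False
    # Both date bounds are inclusive; dates are compared in the timestamp's own offset.
    if "dateFrom" in cfg and created.date() < date.fromisoformat(cfg["dateFrom"]):
        return False
    if "dateTo" in cfg and created.date() > date.fromisoformat(cfg["dateTo"]):
        return False
    return True


def contains_all(post: dict, keywords: list[str]) -> bool:
    """Check that title + content mention every keyword (assumed case-insensitive)."""
    text = f"{post['title']} {post.get('content') or ''}".lower()
    return all(keyword.lower() in text for keyword in keywords)


def apply_filters(posts: list[dict], cfg: dict, now: datetime) -> list[dict]:
    """Apply per-post filters, then the optional AND + dedupe step."""
    kept = [post for post in posts if keep_post(post, cfg, now)]
    if not cfg.get("keywordRequireAll"):
        return kept
    seen: set[str] = set()
    unique = []
    for post in kept:
        # Assumption: duplicates are identified by `post_id`.
        if post["post_id"] not in seen and contains_all(post, cfg["keywords"]):
            seen.add(post["post_id"])
            unique.append(post)
    return unique


post = {
    "post_id": "1ntdzya",
    "title": "All hopes of reasoning with Putin are gone - war is coming",
    "author": "theipaper",
    "subreddit": "r/geopolitics",
    "content": None,
    "score": 743,
    "created_at": "2025-09-29T10:01:57+00:00",
    "is_nsfw": False,
}
cfg = {
    "keywords": ["putin", "war"],
    "subredditFilter": ["geopolitics"],
    "minScore": 100,
    "maxAgeDays": 7,
    "dateFrom": "2025-09-29",
    "dateTo": "2025-09-29",
    "keywordRequireAll": True,
}
now = datetime(2025, 10, 1, tzinfo=timezone.utc)
print(len(apply_filters([post, post], cfg, now)))  # 1: the duplicate is dropped
```

The same functions can be used to re-filter a dataset locally without running the Actor again, provided these assumptions are acceptable for your use case.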

## Actor input object example

```json
{
  "keywords": [
    "python programming"
  ],
  "resultLimit": 500,
  "sort": "relevance",
  "subredditFilter": [],
  "subredditBlocklist": [],
  "excludeNsfw": false,
  "authorBlocklist": [],
  "keywordRequireAll": false
}
```
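To catch invalid inputs before starting a paid run, an input object like the one above could be assembled with a small helper. The sketch below is hypothetical (`build_input` and `ALLOWED_SORTS` are not part of the Actor or the Apify client); the bounds and allowed values are taken from the schema in this document.

```python
from datetime import datetime

# Sort values listed in the input schema.
ALLOWED_SORTS = {"relevance", "hot", "top", "new", "comments"}


def build_input(keywords, result_limit=500, sort="relevance", **filters):
    """Assemble and sanity-check an Actor input object (hypothetical helper)."""
    if not keywords:
        raise ValueError("keywords must contain at least one entry")
    if not 1 <= result_limit <= 1000:
        raise ValueError("resultLimit must be between 1 and 1000")
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORTS)}")
    for key in ("dateFrom", "dateTo"):
        if key in filters:
            # Raises ValueError if the value does not parse as a year-month-day date.
            datetime.strptime(filters[key], "%Y-%m-%d")
    return {
        "keywords": list(keywords),
        "resultLimit": result_limit,
        "sort": sort,
        **filters,
    }


actor_input = build_input(
    ["python programming"],
    result_limit=50,
    sort="top",
    minScore=10,
    dateFrom="2025-01-01",
)
print(actor_input)
```

The resulting dictionary has the same shape as the input object example and can be passed wherever an Actor input is expected.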

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "keywords": [
        "python programming"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("crawlerbros/reddit-keywords-pro").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "keywords": ["python programming"] }

# Run the Actor and wait for it to finish
run = client.actor("crawlerbros/reddit-keywords-pro").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "keywords": [
    "python programming"
  ]
}' |
apify call crawlerbros/reddit-keywords-pro --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=crawlerbros/reddit-keywords-pro",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Reddit Keywords Pro",
        "description": "Search Reddit by keywords with advanced filters like subredditFilter, subredditBlocklist, dateFrom/dateTo, minScore, maxAgeDays, excludeNsfw, authorBlocklist, keywordRequireAll.",
        "version": "1.0",
        "x-build-id": "hWRjhzhvTKffFIGN9"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/crawlerbros~reddit-keywords-pro/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-crawlerbros-reddit-keywords-pro",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/crawlerbros~reddit-keywords-pro/runs": {
            "post": {
                "operationId": "runs-sync-crawlerbros-reddit-keywords-pro",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/crawlerbros~reddit-keywords-pro/run-sync": {
            "post": {
                "operationId": "run-sync-crawlerbros-reddit-keywords-pro",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "keywords"
                ],
                "properties": {
                    "keywords": {
                        "title": "Keywords",
                        "minItems": 1,
                        "type": "array",
                        "description": "List of keywords to search. Each runs as a separate Reddit search.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "resultLimit": {
                        "title": "Result limit per keyword",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Max posts per keyword (1-1000).",
                        "default": 500
                    },
                    "sort": {
                        "title": "Sort by",
                        "enum": [
                            "relevance",
                            "hot",
                            "top",
                            "new",
                            "comments"
                        ],
                        "type": "string",
                        "description": "How to sort the search results.",
                        "default": "relevance"
                    },
                    "subredditFilter": {
                        "title": "Subreddit filter (allowlist)",
                        "type": "array",
                        "description": "Only emit posts from these subreddits (without `r/` prefix). Empty = all.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "subredditBlocklist": {
                        "title": "Subreddit blocklist",
                        "type": "array",
                        "description": "Drop posts from these subreddits.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "minScore": {
                        "title": "Min score (filter)",
                        "minimum": -10000,
                        "maximum": 10000000,
                        "type": "integer",
                        "description": "Drop posts with score below this number."
                    },
                    "maxAgeDays": {
                        "title": "Max post age in days (filter)",
                        "minimum": 1,
                        "maximum": 36500,
                        "type": "integer",
                        "description": "Drop posts older than N days."
                    },
                    "excludeNsfw": {
                        "title": "Exclude NSFW",
                        "type": "boolean",
                        "description": "Drop NSFW posts.",
                        "default": false
                    },
                    "authorBlocklist": {
                        "title": "Author blocklist",
                        "type": "array",
                        "description": "Drop posts by these usernames (case-insensitive).",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "keywordRequireAll": {
                        "title": "Require all keywords (AND)",
                        "type": "boolean",
                        "description": "When multiple keywords given, also dedupe & only emit posts whose title/content contains ALL of them. Default: each keyword runs independently (OR).",
                        "default": false
                    },
                    "dateFrom": {
                        "title": "Date from (YYYY-MM-DD)",
                        "type": "string",
                        "description": "Drop posts with `created_at` before this date (inclusive). Format: YYYY-MM-DD."
                    },
                    "dateTo": {
                        "title": "Date to (YYYY-MM-DD)",
                        "type": "string",
                        "description": "Drop posts with `created_at` after this date (inclusive end-of-day). Format: YYYY-MM-DD."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
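For projects that prefer not to install a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The sketch below uses only the Python standard library; the helper names (`build_request`, `run_and_fetch`), the 300-second timeout, and the assumption that the response body is JSON are illustrative choices, not guarantees from the specification.

```python
import json
import urllib.parse
import urllib.request

# Server URL and path as given in the OpenAPI specification above.
API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/crawlerbros~reddit-keywords-pro/run-sync-get-dataset-items"


def build_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build the POST request: token in the query string, input as the JSON body."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch(token: str, actor_input: dict, timeout: float = 300.0):
    """Send the request and decode the body (network call; assumes a JSON response)."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the request does not contact the API; call run_and_fetch() to execute it.
request = build_request("<YOUR_API_TOKEN>", {"keywords": ["python programming"]})
print(request.get_method(), request.full_url)
```

Because the call waits for the run to finish, long searches may exceed the chosen timeout; the `/runs` path in the same specification starts a run without waiting.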
