# Discourse Community Scraper (`crawlerbros/discourse-community-scraper`) Actor

Scrape any public Discourse forum with latest topics, trending discussions, category browsing, tag filtering, full-text search, user profiles, and complete post threads. Works with meta.discourse.org, community forums, and any self-hosted Discourse.

- **URL**: https://apify.com/crawlerbros/discourse-community-scraper.md
- **Developed by:** [Crawler Bros](https://apify.com/crawlerbros) (community)
- **Categories:** Developer tools, Automation, Other
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 7 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $3.00 / 1,000 results

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.
This Actor supports Apify Store discounts, so the price decreases on higher subscription plans.
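
As a rough illustration of the per-result part of the bill, the sketch below multiplies a result count by the listed starting price of $3.00 per 1,000 results. It assumes each dataset item is billed as one result, and it ignores platform usage and plan discounts, both of which affect the final charge. `estimate_result_cost` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal

# Listed starting price on this page: $3.00 per 1,000 results.
PRICE_PER_1000_RESULTS = Decimal("3.00")


def estimate_result_cost(result_count: int) -> Decimal:
    """Estimate only the per-result charge in USD (hypothetical helper).

    Assumption: one dataset item equals one billed result. Platform usage
    and subscription discounts are not included.
    """
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    cost = PRICE_PER_1000_RESULTS * result_count / 1000
    return cost.quantize(Decimal("0.01"))


print(estimate_result_cost(50))    # default maxItems
print(estimate_result_cost(5000))  # documented maxItems upper bound
```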

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Discourse Community Scraper

Extract topics, discussions, and posts from any public [Discourse](https://www.discourse.org/) forum. Supports latest topics, trending discussions, category browsing, tag filtering, full-text search, user activity, and complete post threads — all via the standard Discourse REST API.

Works with [meta.discourse.org](https://meta.discourse.org), [community.cloudflare.com](https://community.cloudflare.com), and any self-hosted Discourse instance.

### What It Does

- **Latest topics** — fetch the most recently active discussions
- **Top/trending topics** — get the most popular topics by time period (daily/weekly/monthly/quarterly/yearly/all)
- **Browse by category** — explore topics within a specific forum category
- **Filter by tag** — find all topics with a given tag
- **Full-text search** — search across all forum topics by keyword
- **User topics** — get all topics created by a specific user
- **Get posts** — fetch all posts/replies within a specific topic thread

### Input Schema

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| `forumUrl` | text (required) | Base URL of the Discourse forum | — |
| `mode` | select | Scraping mode (see modes below) | `latestTopics` |
| `period` | select | Time period for top topics: `daily`, `weekly`, `monthly`, `quarterly`, `yearly`, `all` | `weekly` |
| `categorySlug` | text | Category slug or path (e.g., `support`, `meta/7`) | — |
| `tag` | text | Tag name to filter by (e.g., `bug`, `feature-request`) | — |
| `searchQuery` | text | Search query for finding topics | `getting started` |
| `username` | text | Forum username for byUser mode | — |
| `topicId` | integer | Numeric topic ID for getPosts mode | — |
| `apiKey` | text (secret) | API key for private forums (optional) | — |
| `apiUsername` | text | Username associated with the API key | — |
| `maxItems` | integer | Maximum records to return (1–5000) | `50` |

#### Modes

| Mode | Description | Required Fields |
|------|-------------|-----------------|
| `latestTopics` | Most recently active topics | `forumUrl` |
| `topTopics` | Top topics by engagement | `forumUrl`, `period` |
| `byCategory` | Topics within a category | `forumUrl`, `categorySlug` |
| `byTag` | Topics with a specific tag | `forumUrl`, `tag` |
| `searchTopics` | Full-text search results | `forumUrl`, `searchQuery` |
| `byUser` | Topics created by a user | `forumUrl`, `username` |
| `getPosts` | All posts in a topic thread | `forumUrl`, `topicId` |
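
The required fields per mode can be checked before starting a run. The following is a minimal sketch, not the Actor's own validation: `REQUIRED_FIELDS` is copied from the table above, `missing_fields` is a hypothetical helper, and only the defaults that both the input table and the OpenAPI schema list (`mode`, `period`, `maxItems`) are applied.

```python
# Required fields per mode, copied from the Modes table.
REQUIRED_FIELDS = {
    "latestTopics": ["forumUrl"],
    "topTopics": ["forumUrl", "period"],
    "byCategory": ["forumUrl", "categorySlug"],
    "byTag": ["forumUrl", "tag"],
    "searchTopics": ["forumUrl", "searchQuery"],
    "byUser": ["forumUrl", "username"],
    "getPosts": ["forumUrl", "topicId"],
}

# Documented defaults; applied before checking so they never count as missing.
DEFAULTS = {"mode": "latestTopics", "period": "weekly", "maxItems": 50}


def missing_fields(actor_input: dict) -> list[str]:
    """Return the required fields that are absent or empty (hypothetical helper)."""
    merged = {**DEFAULTS, **actor_input}
    mode = merged["mode"]
    if mode not in REQUIRED_FIELDS:
        raise ValueError(f"Unknown mode: {mode!r}")
    return [name for name in REQUIRED_FIELDS[mode] if merged.get(name) in (None, "")]


print(missing_fields({"forumUrl": "https://meta.discourse.org", "mode": "getPosts"}))
```

An empty list means the input satisfies the table for the selected mode.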

### Output Schema

#### Topic Record (`recordType: "discourseTopic"`)

| Field | Type | Description |
|-------|------|-------------|
| `topicId` | integer | Unique topic ID |
| `title` | string | Topic title |
| `slug` | string | URL-friendly topic slug |
| `url` | string | Full URL to the topic |
| `postsCount` | integer | Total number of posts in the topic |
| `views` | integer | Number of views |
| `likeCount` | integer | Total likes received |
| `replyCount` | integer | Number of replies |
| `createdAt` | string | ISO 8601 creation timestamp |
| `lastPostedAt` | string | ISO 8601 timestamp of most recent post |
| `categoryId` | integer | Category ID |
| `categoryName` | string | Human-readable category name |
| `tags` | array | List of tag names |
| `authorUsername` | string | Username of the original poster |
| `authorName` | string | Display name of the original poster |
| `excerpt` | string | Short excerpt from the first post |
| `solved` | boolean | Whether the topic has an accepted answer |
| `pinned` | boolean | Whether the topic is pinned |
| `highestPostNumber` | integer | Highest post number in the thread |
| `wordCount` | integer | Total word count of the topic |
| `forumUrl` | string | Base URL of the source forum |
| `recordType` | string | Always `discourseTopic` |
| `scrapedAt` | string | ISO 8601 scrape timestamp |

#### Post Record (`recordType: "discoursePost"`)

| Field | Type | Description |
|-------|------|-------------|
| `postId` | integer | Unique post ID |
| `topicId` | integer | Parent topic ID |
| `postNumber` | integer | Position in the thread |
| `url` | string | Direct URL to the post |
| `authorUsername` | string | Post author's username |
| `authorName` | string | Post author's display name |
| `createdAt` | string | ISO 8601 creation timestamp |
| `updatedAt` | string | ISO 8601 last update timestamp |
| `cooked` | string | Rendered HTML content |
| `raw` | string | Raw markdown content |
| `likeCount` | integer | Number of likes on this post |
| `replyCount` | integer | Number of replies to this post |
| `isAcceptedAnswer` | boolean | Whether this post is the accepted solution |
| `recordType` | string | Always `discoursePost` |
| `scrapedAt` | string | ISO 8601 scrape timestamp |
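
Because `cooked` holds rendered HTML, downstream text processing usually needs a plain-text version. The sketch below is one possible approach using only the Python standard library. `cooked_to_text` and the `BLOCK_TAGS` set are illustrative choices, not something the Actor provides.

```python
from html.parser import HTMLParser

# Tags treated as block boundaries so adjacent paragraphs do not run together.
BLOCK_TAGS = {"p", "div", "li", "br", "blockquote", "pre", "tr",
              "h1", "h2", "h3", "h4", "h5", "h6"}


class _TextExtractor(HTMLParser):
    """Collect text nodes, inserting a space at block-level tag boundaries."""

    def __init__(self):
        super().__init__()  # character references are decoded by default
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def cooked_to_text(cooked: str) -> str:
    """Strip tags from a post's `cooked` HTML and collapse whitespace."""
    parser = _TextExtractor()
    parser.feed(cooked)
    parser.close()
    return " ".join("".join(parser.parts).split())


print(cooked_to_text("<p>Go to Admin → Settings → Email and configure your SMTP server.</p>"))
```

If the Markdown source is needed rather than rendered text, the `raw` field already carries it.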

### Example Output

#### Topic

```json
{
  "topicId": 12345,
  "title": "How to set up email notifications?",
  "slug": "how-to-set-up-email-notifications",
  "url": "https://meta.discourse.org/t/how-to-set-up-email-notifications/12345",
  "postsCount": 8,
  "views": 1250,
  "likeCount": 23,
  "replyCount": 7,
  "createdAt": "2024-03-15T09:30:00.000Z",
  "categoryName": "Support",
  "tags": ["email", "notifications", "configuration"],
  "authorUsername": "john_doe",
  "solved": true,
  "forumUrl": "https://meta.discourse.org",
  "recordType": "discourseTopic",
  "scrapedAt": "2026-05-15T10:00:00+00:00"
}
````

#### Post

```json
{
  "postId": 56789,
  "topicId": 12345,
  "postNumber": 3,
  "url": "https://meta.discourse.org/t/how-to-set-up-email-notifications/12345/3",
  "authorUsername": "admin_user",
  "cooked": "<p>Go to Admin → Settings → Email and configure your SMTP server.</p>",
  "likeCount": 15,
  "isAcceptedAnswer": true,
  "recordType": "discoursePost",
  "scrapedAt": "2026-05-15T10:00:00+00:00"
}
```
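
The two examples above spell UTC timestamps in two ways: `createdAt` ends in `Z` with milliseconds, while `scrapedAt` uses a `+00:00` offset. A small normalizer, sketched below under the assumption that all timestamps follow one of these two forms, makes them comparable. `parse_timestamp` is a hypothetical helper.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp into a timezone-aware UTC datetime."""
    # datetime.fromisoformat accepts a trailing "Z" only from Python 3.11,
    # so rewrite it to an explicit UTC offset for older versions.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value).astimezone(timezone.utc)


created = parse_timestamp("2024-03-15T09:30:00.000Z")
scraped = parse_timestamp("2026-05-15T10:00:00+00:00")
print(created.isoformat(), scraped.isoformat())
print("Topic age at scrape time:", scraped - created)
```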

### FAQ

**Does it work with any Discourse forum?**
Yes. The scraper uses the standard Discourse REST API (all `.json` endpoints) that is available on every Discourse instance — both hosted and self-hosted. Simply provide the forum's base URL.
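
Which endpoints the Actor requests internally is not documented here. To illustrate what "`.json` endpoints" means on a Discourse site, the sketch below builds a few public URLs (`/latest.json`, `/top.json`, `/search.json`, `/t/<id>.json`). `discourse_json_url` is a hypothetical helper, and the mapping to Actor modes is an assumption.

```python
from urllib.parse import urlencode


def discourse_json_url(forum_url: str, path: str, **params) -> str:
    """Build a Discourse `.json` URL from a base URL, a path, and query params."""
    base = forum_url.rstrip("/")
    query = f"?{urlencode(params)}" if params else ""
    return f"{base}/{path.strip('/')}.json{query}"


forum = "https://meta.discourse.org"
print(discourse_json_url(forum, "latest"))                      # latest topics
print(discourse_json_url(forum, "top", period="weekly"))        # top topics
print(discourse_json_url(forum, "search", q="getting started")) # full-text search
print(discourse_json_url(forum, "t/12345"))                     # one topic thread
```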

**Do I need an API key?**
No API key is needed for public forums. The `apiKey` and `apiUsername` fields are only required if the forum restricts access to logged-in users.

**How do I find the category slug?**
Look at the forum URL when browsing a category. For example, `https://meta.discourse.org/c/support/6` uses the slug `support` or the path `support/6`.
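
The same lookup can be done in code. This is a minimal sketch that assumes the forum is served from the domain root and that category URLs follow the `/c/<slug>/<id>` shape shown above. `category_path_from_url` is a hypothetical helper.

```python
from urllib.parse import urlparse


def category_path_from_url(category_url: str) -> str:
    """Return the part after `/c/`, e.g. `support/6` (hypothetical helper)."""
    segments = [s for s in urlparse(category_url).path.split("/") if s]
    if len(segments) < 2 or segments[0] != "c":
        raise ValueError(f"Not a category URL: {category_url!r}")
    return "/".join(segments[1:])


path = category_path_from_url("https://meta.discourse.org/c/support/6")
slug = path.split("/")[0]
print(path, slug)
```

Either value can then be passed as `categorySlug`.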

**How many posts can I get from a topic?**
Using `mode=getPosts`, the scraper fetches all posts in a topic, paginating through the full thread. Set `maxItems` to control the upper limit.

**Can I scrape a private Discourse forum?**
Yes, if you have valid API credentials. Provide your API key in the `apiKey` field and your username in `apiUsername`. The forum administrator generates API keys in the admin panel.
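
To keep credentials out of source code, the two fields can be read from the environment when building the input. This sketch is an illustration only: the variable names `DISCOURSE_API_KEY` and `DISCOURSE_API_USERNAME` and the helper `build_private_input` are assumptions, not names the Actor defines.

```python
import os


def build_private_input(forum_url: str, env=os.environ) -> dict:
    """Build Actor input for a private forum from environment variables.

    The environment variable names are hypothetical; pick any you like.
    """
    api_key = env.get("DISCOURSE_API_KEY")
    api_username = env.get("DISCOURSE_API_USERNAME")
    if not api_key or not api_username:
        raise RuntimeError("Set DISCOURSE_API_KEY and DISCOURSE_API_USERNAME first")
    return {
        "forumUrl": forum_url,
        "apiKey": api_key,            # secret input field
        "apiUsername": api_username,  # user the key belongs to
    }
```

The returned dictionary can be merged with `mode`, `maxItems`, and the other fields before the run is started.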

**What's the difference between `postsCount` and `replyCount`?**
`postsCount` is the total number of posts in the thread (including the original post). `replyCount` counts only replies made to other posts within the thread.

**How does the solved/accepted answer work?**
Discourse has an optional "Solved" plugin that lets users mark a reply as the accepted answer. If a forum uses this plugin, `solved=true` appears on topics that have an accepted answer, and `isAcceptedAnswer=true` appears on the specific accepted post.
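
When records from several runs are combined, the accepted post for each topic can be picked out with these flags. The sketch below assumes the flags may be missing on forums without the plugin, so it reads them with `.get`. `index_accepted_answers` is a hypothetical helper.

```python
def index_accepted_answers(items):
    """Map topicId -> accepted post, given a mix of topic and post records."""
    accepted = {}
    for item in items:
        is_post = item.get("recordType") == "discoursePost"
        if is_post and item.get("isAcceptedAnswer") is True:
            accepted[item["topicId"]] = item
    return accepted


records = [
    {"recordType": "discourseTopic", "topicId": 12345, "solved": True},
    {"recordType": "discoursePost", "topicId": 12345, "postNumber": 2},
    {"recordType": "discoursePost", "topicId": 12345, "postNumber": 3,
     "isAcceptedAnswer": True},
]
answers = index_accepted_answers(records)
print(answers[12345]["postNumber"])
```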

**Can I search across multiple forums in one run?**
Each run targets one forum URL. To scrape multiple forums, run the Actor multiple times with different `forumUrl` values.
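
One way to script this is a loop that starts one run per forum and tags each item with its source. The sketch below uses the same client calls as the Python example in the API section (`actor(...).call(run_input=...)` and `dataset(...).iterate_items()`). `scrape_forums` is a hypothetical helper and expects an already constructed `ApifyClient`.

```python
ACTOR_ID = "crawlerbros/discourse-community-scraper"


def scrape_forums(client, forum_urls, base_input):
    """Run the Actor once per forum and yield (forum_url, item) pairs.

    `client` is expected to be an `apify_client.ApifyClient` instance.
    Runs execute one after another, not in parallel.
    """
    for forum_url in forum_urls:
        run_input = {**base_input, "forumUrl": forum_url}
        run = client.actor(ACTOR_ID).call(run_input=run_input)
        if run is None:  # the run did not produce a result object
            continue
        for item in client.dataset(run["defaultDatasetId"]).iterate_items():
            yield forum_url, item


# Usage (requires `pip install apify-client` and a valid token):
# from apify_client import ApifyClient
# client = ApifyClient("<YOUR_API_TOKEN>")
# forums = ["https://meta.discourse.org", "https://community.cloudflare.com"]
# for forum, item in scrape_forums(client, forums, {"mode": "latestTopics", "maxItems": 50}):
#     print(forum, item.get("title"))
```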

# Actor input Schema

## `forumUrl` (type: `string`):

Base URL of the Discourse forum to scrape (e.g. https://meta.discourse.org or https://community.myapp.com).

## `mode` (type: `string`):

What data to scrape from the forum.

## `period` (type: `string`):

Time period for top topics ranking.

## `categorySlug` (type: `string`):

Category slug or path (e.g. 'support', 'meta/7'). Find it in the forum URL when browsing a category.

## `tag` (type: `string`):

Tag name to filter topics by (e.g. 'bug', 'feature-request'). Used in byTag mode.

## `searchQuery` (type: `string`):

Full-text search query for finding topics.

## `username` (type: `string`):

Forum username to get topics created by this user.

## `topicId` (type: `integer`):

Numeric topic ID to fetch all posts from (visible in the topic URL). Used in getPosts mode.
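
As an illustration of reading the ID from a URL, the sketch below handles the `/t/<slug>/<id>` and `/t/<slug>/<id>/<postNumber>` shapes used in this document's example output. Other URL shapes are not covered, and `topic_id_from_url` is a hypothetical helper.

```python
from urllib.parse import urlparse


def topic_id_from_url(topic_url: str) -> int:
    """Extract the numeric topic ID from a `/t/<slug>/<id>[/<post>]` URL."""
    segments = [s for s in urlparse(topic_url).path.split("/") if s]
    # Expected layout: ["t", "<slug>", "<id>", optional "<postNumber>"]
    if len(segments) >= 3 and segments[0] == "t" and segments[2].isdigit():
        return int(segments[2])
    raise ValueError(f"Not a recognised topic URL: {topic_url!r}")


print(topic_id_from_url(
    "https://meta.discourse.org/t/how-to-set-up-email-notifications/12345"
))
```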

## `apiKey` (type: `string`):

Discourse API key for accessing private forums. Leave empty for public forums.

## `apiUsername` (type: `string`):

Discourse username associated with the API key.

## `maxItems` (type: `integer`):

Maximum number of records to return.

## Actor input object example

```json
{
  "forumUrl": "https://meta.discourse.org",
  "mode": "latestTopics",
  "period": "weekly",
  "searchQuery": "getting started",
  "maxItems": 50
}
```

# Actor output Schema

## `topics` (type: `string`):

Dataset containing all scraped Discourse topics or posts.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "forumUrl": "https://meta.discourse.org",
    "mode": "latestTopics",
    "period": "weekly",
    "searchQuery": "getting started",
    "maxItems": 50
};

// Run the Actor and wait for it to finish
const run = await client.actor("crawlerbros/discourse-community-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "forumUrl": "https://meta.discourse.org",
    "mode": "latestTopics",
    "period": "weekly",
    "searchQuery": "getting started",
    "maxItems": 50,
}

# Run the Actor and wait for it to finish
run = client.actor("crawlerbros/discourse-community-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "forumUrl": "https://meta.discourse.org",
  "mode": "latestTopics",
  "period": "weekly",
  "searchQuery": "getting started",
  "maxItems": 50
}' |
apify call crawlerbros/discourse-community-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=crawlerbros/discourse-community-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Discourse Community Scraper",
        "description": "Scrape any public Discourse forum with latest topics, trending discussions, category browsing, tag filtering, full-text search, user profiles, and complete post threads. Works with meta.discourse.org, community forums, and any self-hosted Discourse.",
        "version": "1.0",
        "x-build-id": "ScYRdnOa1ab9svv6c"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/crawlerbros~discourse-community-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-crawlerbros-discourse-community-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/crawlerbros~discourse-community-scraper/runs": {
            "post": {
                "operationId": "runs-sync-crawlerbros-discourse-community-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/crawlerbros~discourse-community-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-crawlerbros-discourse-community-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "forumUrl"
                ],
                "properties": {
                    "forumUrl": {
                        "title": "Forum URL",
                        "type": "string",
                        "description": "Base URL of the Discourse forum to scrape (e.g. https://meta.discourse.org or https://community.myapp.com)."
                    },
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "latestTopics",
                            "topTopics",
                            "byCategory",
                            "byTag",
                            "searchTopics",
                            "byUser",
                            "getPosts"
                        ],
                        "type": "string",
                        "description": "What data to scrape from the forum.",
                        "default": "latestTopics"
                    },
                    "period": {
                        "title": "Period (topTopics only)",
                        "enum": [
                            "daily",
                            "weekly",
                            "monthly",
                            "quarterly",
                            "yearly",
                            "all"
                        ],
                        "type": "string",
                        "description": "Time period for top topics ranking.",
                        "default": "weekly"
                    },
                    "categorySlug": {
                        "title": "Category slug (byCategory)",
                        "type": "string",
                        "description": "Category slug or path (e.g. 'support', 'meta/7'). Find it in the forum URL when browsing a category."
                    },
                    "tag": {
                        "title": "Tag (byTag)",
                        "type": "string",
                        "description": "Tag name to filter topics by (e.g. 'bug', 'feature-request'). Used in byTag mode."
                    },
                    "searchQuery": {
                        "title": "Search query (searchTopics)",
                        "type": "string",
                        "description": "Full-text search query for finding topics."
                    },
                    "username": {
                        "title": "Username (byUser)",
                        "type": "string",
                        "description": "Forum username to get topics created by this user."
                    },
                    "topicId": {
                        "title": "Topic ID (getPosts)",
                        "minimum": 1,
                        "maximum": 9999999,
                        "type": "integer",
                        "description": "Numeric topic ID to fetch all posts from (visible in the topic URL). Used in getPosts mode."
                    },
                    "apiKey": {
                        "title": "API key (optional)",
                        "type": "string",
                        "description": "Discourse API key for accessing private forums. Leave empty for public forums."
                    },
                    "apiUsername": {
                        "title": "API username (required if API key provided)",
                        "type": "string",
                        "description": "Discourse username associated with the API key."
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "maximum": 5000,
                        "type": "integer",
                        "description": "Maximum number of records to return.",
                        "default": 50
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
