# LightBurn Forum Scrapper (`zhenyu_towne/lightburn-forum-scrapper`) Actor

LightBurn Forum Crawler extracts LightBurn forum topics, posts, and replies into clean, flat CSV/JSON records for semantic analysis, with one row per post or comment including type, original IDs, author, cleaned text, URLs, timestamps, likes, source, and matched keyword when applicable.

- **URL**: https://apify.com/zhenyu_towne/lightburn-forum-scrapper.md
- **Developed by:** [Zhenyu Towne](https://apify.com/zhenyu_towne) (community)
- **Categories:** Social media, E-commerce, Automation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $0.05 / 1,000 results

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
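As a rough illustration of the per-result part of the bill, the sketch below multiplies a result count by the listed starting rate. It assumes that each dataset row counts as one billable result and that the "from" rate applies; platform usage is charged separately and is not modeled. The function name is hypothetical.

```python
from decimal import Decimal

# Assumption: the listed starting rate of $0.05 per 1,000 results applies.
PRICE_PER_1000_RESULTS = Decimal("0.05")


def estimate_result_cost(result_count: int) -> Decimal:
    """Lower-bound estimate of per-result charges in USD.

    Excludes Apify platform usage, which is billed on top of this amount.
    """
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    return PRICE_PER_1000_RESULTS * result_count / 1000


# Example: 100 topics with up to 20 exported posts each.
print(estimate_result_cost(100 * 20))
```

Treat the figure as a floor rather than a quote, since the actual event prices may differ from the starting rate.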

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## LightBurn Forum Semantic Crawler

Extract clean, semantic-analysis-ready posts and comments from the LightBurn Software forum.

This Actor crawls LightBurn forum topics through the public Discourse JSON API and exports one flat dataset row per original post or reply. The output is designed for NLP, LLM, embedding, semantic search, clustering, topic modeling, support trend analysis, and spreadsheet workflows.

### What It Does

- Crawls the latest LightBurn forum topics or searches by keyword.
- Fetches topic posts and replies.
- Converts forum HTML into clean plain text.
- Exports one row per `post` or `comment`.
- Preserves original Discourse post and topic IDs.
- Includes author, URL, timestamp, likes, source mode, and matched keyword.
- Removes image data from the main text field so exports are easier to analyze.
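The HTML-to-text step can be pictured with a small standard-library sketch. This is not the Actor's actual implementation; it only illustrates one way to turn a post's rendered HTML into plain text while letting images drop out. The names `html_to_text`, `BLOCK_TAGS`, and `SKIPPED_TAGS` are hypothetical.

```python
import re
from html.parser import HTMLParser

# Tags that should introduce a line break in the plain-text output.
BLOCK_TAGS = {"p", "br", "div", "li", "blockquote", "pre", "h1", "h2", "h3", "tr"}
# Tags whose text content should be discarded entirely.
SKIPPED_TAGS = {"script", "style"}


class _TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")
        # <img> carries no text node, so images never reach the output.

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(post_html: str) -> str:
    """Return plain text with one line per block element and no blank lines."""
    parser = _TextExtractor()
    parser.feed(post_html)
    parser.close()
    lines = "".join(parser.parts).split("\n")
    cleaned = (re.sub(r"[ \t\r\f\v]+", " ", line).strip() for line in lines)
    return "\n".join(line for line in cleaned if line)


print(html_to_text('<p>Check your <b>Units</b> settings.</p><p><img src="x.png"></p>'))
```

A production cleaner would likely also handle quotes, code blocks, and link text in a forum-specific way; the sketch keeps only the core idea.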

### Use Cases

- Build a semantic search index from LightBurn forum discussions.
- Analyze common user issues and support patterns.
- Cluster posts by topic or intent.
- Prepare forum text for embeddings or LLM classification.
- Export clean CSV or JSON data for spreadsheets and BI tools.

### Input Options

| Field | Description |
| --- | --- |
| `baseUrl` | Forum base URL. Defaults to `https://forum.lightburnsoftware.com`. |
| `keywords` | Optional keyword or comma-separated keywords. If empty, the Actor crawls latest topics. |
| `startDate` | Optional start date filter, for example `2026-01-01`. |
| `endDate` | Optional end date filter, for example `2026-01-31`. |
| `timeField` | Date field used for filtering: `created_at`, `last_posted_at`, or `bumped_at`. |
| `maxTopics` | Maximum number of topics to process. |
| `maxPages` | Maximum number of listing or search pages to scan. |
| `includeReplies` | Set to `false` to export only the original topic post. |
| `maxPostsPerTopic` | Maximum number of posts/comments exported from each topic. |
| `categoryIds` | Optional list of Discourse category IDs to include. |
| `requestDelayMillis` | Delay between forum API requests. |
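Before starting a run, it can help to validate the input locally. The sketch below builds an input object from the fields listed in the Actor input schema, using the defaults and minimums published in the OpenAPI specification further down. The helper name `build_actor_input` and its Python parameter names are hypothetical, and the `YYYY-MM-DD` date format is an assumption based on the examples in this document.

```python
from datetime import date

# Allowed values for timeField, as listed in the input schema.
ALLOWED_TIME_FIELDS = {"created_at", "last_posted_at", "bumped_at"}


def build_actor_input(keywords=None, start_date=None, end_date=None,
                      time_field="created_at", max_topics=100, max_pages=10,
                      include_replies=True, max_posts_per_topic=20) -> dict:
    """Build a validated input dict; optional fields are omitted when unset."""
    if time_field not in ALLOWED_TIME_FIELDS:
        raise ValueError(f"timeField must be one of {sorted(ALLOWED_TIME_FIELDS)}")
    limits = {"maxTopics": max_topics, "maxPages": max_pages,
              "maxPostsPerTopic": max_posts_per_topic}
    for name, value in limits.items():
        if value < 1:
            raise ValueError(f"{name} must be at least 1")

    # Parsing catches typos such as "2026-13-01" before the run starts.
    start = date.fromisoformat(start_date) if start_date else None
    end = date.fromisoformat(end_date) if end_date else None
    if start and end and start > end:
        raise ValueError("startDate must not be after endDate")

    actor_input = {"timeField": time_field, "includeReplies": include_replies, **limits}
    if keywords:
        actor_input["keywords"] = list(keywords)
    if start:
        actor_input["startDate"] = start.isoformat()
    if end:
        actor_input["endDate"] = end.isoformat()
    return actor_input


print(build_actor_input(keywords=["camera", "mac"], start_date="2026-01-01",
                        end_date="2026-01-31"))
```

The returned dict can be passed as the run input to any of the clients shown in the API section.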

### Output

Each dataset item is a single flat record.

| Field | Description |
| --- | --- |
| `recordType` | `post` for the first post in a topic, `comment` for replies. |
| `originalPostId` | Original Discourse post ID. |
| `originalTopicId` | Original Discourse topic ID. |
| `topicTitle` | Forum topic title. |
| `topicUrl` | URL of the forum topic. |
| `postUrl` | Direct URL to the post or comment. |
| `postNumber` | Post number within the topic. |
| `replyToPostNumber` | Referenced post number when the comment is a reply. |
| `authorUsername` | Forum username. |
| `authorName` | Display name when available. |
| `originalText` | Cleaned plain text extracted from the post body. |
| `createdAt` | Post creation timestamp. |
| `updatedAt` | Post update timestamp. |
| `likeCount` | Number of likes on the post. |
| `source` | `latest` or `search`. |
| `matchedKeyword` | Keyword that matched the topic in search mode. |
| `crawledAt` | Timestamp when the row was exported. |
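Because every row carries `originalTopicId` and `postNumber`, threads can be rebuilt from the flat dataset when an analysis needs whole conversations. The following is a minimal sketch under that assumption; `group_by_topic` and `thread_text` are hypothetical helper names.

```python
from collections import defaultdict


def group_by_topic(records):
    """Map each originalTopicId to its records, ordered by postNumber."""
    threads = defaultdict(list)
    for record in records:
        threads[record["originalTopicId"]].append(record)
    for posts in threads.values():
        posts.sort(key=lambda r: r["postNumber"])
    return dict(threads)


def thread_text(posts):
    """Join one thread into a single text block, one paragraph per post."""
    return "\n\n".join(f'{p["authorUsername"]}: {p["originalText"]}' for p in posts)
```

Joining a thread this way produces one document per topic, which can suit clustering or topic modeling better than single replies.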

### Example Output

```json
{
  "recordType": "comment",
  "originalPostId": 605754,
  "originalTopicId": 190079,
  "topicTitle": "Downloaded 2.1.01 and my laser will not come to full power",
  "topicUrl": "https://forum.lightburnsoftware.com/t/example-topic/190079",
  "postUrl": "https://forum.lightburnsoftware.com/t/example-topic/190079/2",
  "postNumber": 2,
  "replyToPostNumber": null,
  "authorUsername": "MikeyH",
  "authorName": "Mike Hembrey",
  "originalText": "Check your Units settings. The upgrade might have flipped the switch.",
  "createdAt": "2026-05-25T22:02:31.813Z",
  "updatedAt": "2026-05-25T22:02:31.813Z",
  "likeCount": 0,
  "source": "latest",
  "matchedKeyword": null,
  "crawledAt": "2026-05-26T03:07:28.862Z"
}
````

### Notes

This Actor is built for structured text extraction. It does not download images or include image URLs in the main semantic text field. The resulting dataset is intentionally flat so CSV and JSON exports remain easy to analyze.
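Since each record is flat, it maps directly onto a CSV row. The sketch below writes records with a fixed column order using Python's standard `csv` module. The column list mirrors the Output table above; the function name `records_to_csv` is hypothetical, and the Apify platform's own CSV export may order or format columns differently.

```python
import csv
import io

# Column order taken from the Output table in this README.
COLUMNS = [
    "recordType", "originalPostId", "originalTopicId", "topicTitle", "topicUrl",
    "postUrl", "postNumber", "replyToPostNumber", "authorUsername", "authorName",
    "originalText", "createdAt", "updatedAt", "likeCount", "source",
    "matchedKeyword", "crawledAt",
]


def records_to_csv(records) -> str:
    """Serialize flat records to CSV text with a stable header."""
    buffer = io.StringIO(newline="")
    # Unknown keys are ignored; missing keys and None values become empty cells.
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Multi-line post text is quoted by the writer, so it survives a round trip through `csv.DictReader`.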

# Actor input Schema

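## `keywords` (type: `array`):

Keywords to search for. Leave empty to crawl the latest posts.

## `startDate` (type: `string`):

Start date, for example 2026-05-01

## `endDate` (type: `string`):

End date, for example 2026-05-26

## `timeField` (type: `string`):

Which time field to filter by.

## `maxTopics` (type: `integer`):

Maximum number of topics to crawl.

## `maxPages` (type: `integer`):

Maximum number of listing/search result pages to scan.

## `includeReplies` (type: `boolean`):

Whether to crawl the replies to each topic.

## `maxPostsPerTopic` (type: `integer`):

Maximum number of replies to save per topic.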

## Actor input object example

```json
{
  "keywords": [
    "camera",
    "mac"
  ],
  "timeField": "created_at",
  "maxTopics": 100,
  "maxPages": 10,
  "includeReplies": true,
  "maxPostsPerTopic": 20
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "keywords": [
        "camera",
        "mac"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("zhenyu_towne/lightburn-forum-scrapper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "keywords": [
        "camera",
        "mac",
    ] }

# Run the Actor and wait for it to finish
run = client.actor("zhenyu_towne/lightburn-forum-scrapper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "keywords": [
    "camera",
    "mac"
  ]
}' |
apify call zhenyu_towne/lightburn-forum-scrapper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=zhenyu_towne/lightburn-forum-scrapper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "LightBurn Forum Scrapper",
        "description": "LightBurn Forum Crawler extracts LightBurn forum topics, posts, and replies into clean, flat CSV/JSON records for semantic analysis, with one row per post or comment including type, original IDs, author, cleaned text, URLs, timestamps, likes, source, and matched keyword when applicable.",
        "version": "0.1",
        "x-build-id": "e2TLQB0kqwZIzZIJG"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/zhenyu_towne~lightburn-forum-scrapper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-zhenyu_towne-lightburn-forum-scrapper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/zhenyu_towne~lightburn-forum-scrapper/runs": {
            "post": {
                "operationId": "runs-sync-zhenyu_towne-lightburn-forum-scrapper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/zhenyu_towne~lightburn-forum-scrapper/run-sync": {
            "post": {
                "operationId": "run-sync-zhenyu_towne-lightburn-forum-scrapper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "keywords": {
                        "title": "Keywords",
                        "type": "array",
                        "description": "Keywords to search for. Leave empty to crawl the latest posts.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "startDate": {
                        "title": "Start date",
                        "type": "string",
                        "description": "Start date, for example 2026-05-01"
                    },
                    "endDate": {
                        "title": "End date",
                        "type": "string",
                        "description": "End date, for example 2026-05-26"
                    },
                    "timeField": {
                        "title": "Time field",
                        "enum": [
                            "created_at",
                            "last_posted_at",
                            "bumped_at"
                        ],
                        "type": "string",
                        "description": "Which time field to filter by.",
                        "default": "created_at"
                    },
                    "maxTopics": {
                        "title": "Max topics",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of topics to crawl.",
                        "default": 100
                    },
                    "maxPages": {
                        "title": "Max pages",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of listing/search result pages to scan.",
                        "default": 10
                    },
                    "includeReplies": {
                        "title": "Include replies",
                        "type": "boolean",
                        "description": "Whether to crawl the replies to each topic.",
                        "default": true
                    },
                    "maxPostsPerTopic": {
                        "title": "Max posts per topic",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of replies to save per topic.",
                        "default": 20
                        "default": 20
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
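For projects that cannot use the official clients, the `run-sync-get-dataset-items` endpoint from the specification above can be called with the Python standard library alone. This is a minimal sketch: the helper names `build_run_request` and `run_actor` are hypothetical, the 300-second timeout is an arbitrary choice, and the response is assumed to be a JSON array of dataset items.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/zhenyu_towne~lightburn-forum-scrapper/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Prepare the POST request; the token goes in the query string per the spec."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token: str, actor_input: dict, timeout: float = 300):
    """Send the request and decode the response body (performs a network call)."""
    request = build_run_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (not executed here, since it needs a real token and network access):
# items = run_actor("<YOUR_API_TOKEN>", {"keywords": ["camera", "mac"]})
```

Long crawls may exceed what a synchronous call can wait for; in that case the `/runs` endpoint, which returns run information instead of results, is likely the better fit.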
