LightBurn Forum Scrapper

Pricing

from $0.05 / 1,000 results

LightBurn Forum Crawler extracts LightBurn forum topics, posts, and replies into clean, flat CSV/JSON records for semantic analysis, with one row per post or comment including type, original IDs, author, cleaned text, URLs, timestamps, likes, source, and matched keyword when applicable.

Rating: 0.0 (0)

Developer: Zhenyu Towne (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 4 days ago

LightBurn Forum Semantic Crawler

Extract clean, semantic-analysis-ready posts and comments from the LightBurn Software forum.

This Actor crawls LightBurn forum topics through the public Discourse JSON API and exports one flat dataset row per original post or reply. The output is designed for NLP, LLM, embedding, semantic search, clustering, topic modeling, support trend analysis, and spreadsheet workflows.
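The crawl described above can be sketched with the public Discourse JSON endpoints. This is a minimal illustration, not the Actor's actual code: it assumes the standard Discourse routes `/latest.json`, `/search.json`, and `/t/{id}.json`, and the helper names (`latest_topics_url`, `search_url`, `topic_json_url`, `topic_ids_from_listing`, `fetch_json`) are hypothetical.

```python
import json
import time
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://forum.lightburnsoftware.com"


def latest_topics_url(base_url: str, page: int = 0) -> str:
    """URL of one page of the latest-topics listing."""
    return f"{base_url.rstrip('/')}/latest.json?{urlencode({'page': page})}"


def search_url(base_url: str, keyword: str, page: int = 1) -> str:
    """URL of one page of keyword search results."""
    query = urlencode({"q": keyword, "page": page})
    return f"{base_url.rstrip('/')}/search.json?{query}"


def topic_json_url(base_url: str, topic_id: int) -> str:
    """URL of the JSON document holding a topic and its post stream."""
    return f"{base_url.rstrip('/')}/t/{topic_id}.json"


def topic_ids_from_listing(payload: dict) -> list:
    """Pull topic IDs out of a `/latest.json` response body."""
    topics = payload.get("topic_list", {}).get("topics", [])
    return [topic["id"] for topic in topics]


def fetch_json(url: str, delay_seconds: float = 1.0) -> dict:
    """Fetch one JSON document, pausing first to stay polite to the forum."""
    time.sleep(delay_seconds)
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A crawl would call `fetch_json(latest_topics_url(BASE_URL, page))` for each listing page, then `fetch_json(topic_json_url(BASE_URL, topic_id))` for each topic ID found.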

What It Does

  • Crawls the latest LightBurn forum topics or searches by keyword.
  • Fetches topic posts and replies.
  • Converts forum HTML into clean plain text.
  • Exports one row per post or comment.
  • Preserves original Discourse post and topic IDs.
  • Includes author, URL, timestamp, likes, source mode, and matched keyword.
  • Removes image data from the main text field so exports are easier to analyze.
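The HTML-to-text step in the list above could look roughly like the following. It is a sketch using only Python's standard `html.parser`; the Actor's real cleaning rules are not documented here, so the set of block-level tags and the whitespace handling are assumptions, and `html_to_text` is a hypothetical name.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text; images contribute nothing because attributes are ignored."""

    BLOCK_TAGS = {"p", "br", "div", "li", "blockquote", "pre", "h1", "h2", "h3"}
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")  # block boundary becomes a line break

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Convert a post's HTML body into plain text, one line per block."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = [re.sub(r"\s+", " ", line).strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```

Because `<img>` tags carry their content in attributes, which the parser never reads, neither image URLs nor alt text reach the output.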

Use Cases

  • Build a semantic search index from LightBurn forum discussions.
  • Analyze common user issues and support patterns.
  • Cluster posts by topic or intent.
  • Prepare forum text for embeddings or LLM classification.
  • Export clean CSV or JSON data for spreadsheets and BI tools.

Input Options

| Field | Description |
| --- | --- |
| baseUrl | Forum base URL. Defaults to https://forum.lightburnsoftware.com. |
| keywords | Optional keyword or comma-separated keywords. If empty, the Actor crawls latest topics. |
| startDate | Optional start date filter, for example 2026-01-01. |
| endDate | Optional end date filter, for example 2026-01-31. |
| timeField | Date field used for filtering: created_at, last_posted_at, or bumped_at. |
| maxTopics | Maximum number of topics to process. |
| maxPages | Maximum number of listing or search pages to scan. |
| includeReplies | Set to false to export only the original topic post. |
| maxPostsPerTopic | Maximum number of posts/comments exported from each topic. |
| categoryIds | Optional list of Discourse category IDs to include. |
| requestDelayMillis | Delay between forum API requests. |
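As an illustration of how these options might fit together, the sketch below shows an example input and two small helpers for keyword parsing and date filtering. All values are illustrative rather than the Actor's defaults. The helper names are hypothetical, and treating `endDate` as inclusive and both dates as UTC is an assumption, not documented behavior.

```python
from datetime import datetime, timedelta, timezone

# Illustrative input; the numeric values are examples, not the Actor's defaults.
example_input = {
    "baseUrl": "https://forum.lightburnsoftware.com",
    "keywords": "air assist, rotary",
    "startDate": "2026-01-01",
    "endDate": "2026-01-31",
    "timeField": "created_at",
    "maxTopics": 100,
    "maxPages": 5,
    "includeReplies": True,
    "maxPostsPerTopic": 50,
    "categoryIds": [],
    "requestDelayMillis": 1000,
}


def parse_keywords(raw) -> list:
    """Split a comma-separated keyword string; an empty result means 'latest' mode."""
    return [part.strip() for part in (raw or "").split(",") if part.strip()]


def parse_timestamp(value: str) -> datetime:
    """Parse a Discourse-style timestamp such as 2026-05-25T22:02:31.813Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def in_date_range(topic: dict, time_field: str, start_date=None, end_date=None) -> bool:
    """Check the chosen topic timestamp against the optional date window."""
    raw = topic.get(time_field)
    if raw is None:
        return False
    timestamp = parse_timestamp(raw)
    if start_date:
        start = datetime.fromisoformat(start_date).replace(tzinfo=timezone.utc)
        if timestamp < start:
            return False
    if end_date:
        # Assumption: endDate is inclusive, so the window closes at the next midnight UTC.
        end = datetime.fromisoformat(end_date).replace(tzinfo=timezone.utc) + timedelta(days=1)
        if timestamp >= end:
            return False
    return True
```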

Output

Each dataset item is a single flat record.

| Field | Description |
| --- | --- |
| recordType | "post" for the first post in a topic, "comment" for replies. |
| originalPostId | Original Discourse post ID. |
| originalTopicId | Original Discourse topic ID. |
| topicTitle | Forum topic title. |
| topicUrl | URL of the forum topic. |
| postUrl | Direct URL to the post or comment. |
| postNumber | Post number within the topic. |
| replyToPostNumber | Referenced post number when the comment is a reply. |
| authorUsername | Forum username. |
| authorName | Display name when available. |
| originalText | Cleaned plain text extracted from the post body. |
| createdAt | Post creation timestamp. |
| updatedAt | Post update timestamp. |
| likeCount | Number of likes on the post. |
| source | "latest" or "search". |
| matchedKeyword | Keyword that matched the topic in search mode. |
| crawledAt | Timestamp when the row was exported. |
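To make the field mapping concrete, here is a rough sketch of how one Discourse post object could be flattened into such a record. It assumes the usual Discourse post fields (`id`, `topic_id`, `topic_slug`, `post_number`, `reply_to_post_number`, `username`, `name`, `created_at`, `updated_at`, `actions_summary`) and that likes are the action with ID 2. The function names are hypothetical, and the Actor's own mapping may differ.

```python
from datetime import datetime, timezone

LIKE_ACTION_ID = 2  # Assumption: Discourse's post action type ID for "like".


def like_count(post: dict) -> int:
    """Read the like count from a post's actions_summary, defaulting to 0."""
    for action in post.get("actions_summary") or []:
        if action.get("id") == LIKE_ACTION_ID:
            return action.get("count", 0)
    return 0


def to_record(post: dict, topic_title: str, text: str, *, base_url: str,
              source: str, matched_keyword=None) -> dict:
    """Flatten one Discourse post into a single output row.

    `text` is the already-cleaned plain text of the post body.
    """
    topic_url = f"{base_url.rstrip('/')}/t/{post['topic_slug']}/{post['topic_id']}"
    now = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return {
        "recordType": "post" if post["post_number"] == 1 else "comment",
        "originalPostId": post["id"],
        "originalTopicId": post["topic_id"],
        "topicTitle": topic_title,
        "topicUrl": topic_url,
        "postUrl": f"{topic_url}/{post['post_number']}",
        "postNumber": post["post_number"],
        "replyToPostNumber": post.get("reply_to_post_number"),
        "authorUsername": post.get("username"),
        "authorName": post.get("name") or None,
        "originalText": text,
        "createdAt": post.get("created_at"),
        "updatedAt": post.get("updated_at"),
        "likeCount": like_count(post),
        "source": source,
        "matchedKeyword": matched_keyword,
        "crawledAt": now.replace("+00:00", "Z"),
    }
```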

Example Output

{
  "recordType": "comment",
  "originalPostId": 605754,
  "originalTopicId": 190079,
  "topicTitle": "Downloaded 2.1.01 and my laser will not come to full power",
  "topicUrl": "https://forum.lightburnsoftware.com/t/example-topic/190079",
  "postUrl": "https://forum.lightburnsoftware.com/t/example-topic/190079/2",
  "postNumber": 2,
  "replyToPostNumber": null,
  "authorUsername": "MikeyH",
  "authorName": "Mike Hembrey",
  "originalText": "Check your Units settings. The upgrade might have flipped the switch.",
  "createdAt": "2026-05-25T22:02:31.813Z",
  "updatedAt": "2026-05-25T22:02:31.813Z",
  "likeCount": 0,
  "source": "latest",
  "matchedKeyword": null,
  "crawledAt": "2026-05-26T03:07:28.862Z"
}
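Because every row carries its topic ID and post number, flat records like the one above can be regrouped into whole threads before embedding or classification. The sketch below shows one possible way to do that on the consumer side. `build_thread_documents` is a hypothetical helper, and the "username: text" line format is an arbitrary choice.

```python
from collections import defaultdict


def build_thread_documents(records: list) -> dict:
    """Group flat rows by topic and join each thread into one text document."""
    by_topic = defaultdict(list)
    for record in records:
        by_topic[record["originalTopicId"]].append(record)

    documents = {}
    for topic_id, rows in by_topic.items():
        # Restore reading order inside the thread.
        ordered = sorted(rows, key=lambda row: row["postNumber"])
        lines = [ordered[0]["topicTitle"]]
        lines += [f"{row['authorUsername']}: {row['originalText']}" for row in ordered]
        documents[topic_id] = "\n".join(lines)
    return documents
```

Each value in the returned dictionary is a single string per topic, which can be passed to an embedding model or written out as one row per thread.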

Notes

This Actor is built for structured text extraction. It does not download images or include image URLs in the main semantic text field. The resulting dataset is intentionally flat so CSV and JSON exports remain easy to analyze.