Lemmy Scraper

Pricing

Pay per usage

Scrape posts, comments, communities and search results from any Lemmy instance via the official API. Clean structured data (JSON/CSV), no login required.

Rating: 0.0 (0)

Developer

Joao Paulo

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 2 days ago

Scrape posts, comments, communities, and search results from any Lemmy instance: fast, structured, and auth-free. Built on the official public Lemmy HTTP API (v3), so it works on lemmy.world or any other server in the ActivityPub-based fediverse.

What it does

Lemmy Scraper turns the public Lemmy REST API into clean, flat dataset rows you can export to JSON, CSV, or Excel. Point it at any instance, pick a mode, and collect data from the fediverse without logging in.

Four modes:

  • Community posts — every post in a community, sorted by Hot / Active / New / Top.
  • Search — keyword search across the instance.
  • Community info — metadata for a single community (subscribers, post count, description).
  • Post comments — all comments under a specific post.
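As a rough illustration of how the four modes could map onto the Lemmy API v3, the sketch below builds the request URL for one page of each mode. The endpoint paths and query parameter names (`community_name`, `post_id`, `q`, `page`, `limit`) come from the public Lemmy v3 API; the mapping itself and the `build_url` helper are assumptions made for illustration, not the actor's confirmed internals.

```python
from urllib.parse import urlencode

# Assumed mapping from actor mode to Lemmy API v3 path (illustrative only).
ENDPOINTS = {
    "community_posts": "/api/v3/post/list",
    "search": "/api/v3/search",
    "community_info": "/api/v3/community",
    "post_comments": "/api/v3/comment/list",
}


def build_url(mode, instance="lemmy.world", **params):
    """Return the GET URL for one request in the given mode (hypothetical helper)."""
    if mode not in ENDPOINTS:
        raise ValueError(f"unknown mode: {mode!r}")
    # Drop unset parameters so they do not show up as 'None' in the query string.
    query = {key: value for key, value in params.items() if value is not None}
    url = f"https://{instance}{ENDPOINTS[mode]}"
    return f"{url}?{urlencode(query)}" if query else url


print(build_url("community_posts", community_name="technology", sort="Hot", page=1, limit=50))
print(build_url("post_comments", post_id=12345678, page=1, limit=50))
```

Because every mode is a plain GET request against a public endpoint, switching instances only changes the host part of the URL.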

Features

  • Works on any Lemmy instance — just change the instance input.
  • No authentication, no cookies, no tokens — pure public API reads.
  • Flattened output — clean columns, not raw nested JSON blobs.
  • Automatic pagination with a maxItems cap.
  • Polite request pacing plus automatic retries on transient HTTP errors.
  • Friendly to pay-per-result pricing (billed through PPE item-scraped events).
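The pagination cap and retry behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the actor's code: `paginate` and `with_retries` are hypothetical helpers, `ConnectionError` stands in for whatever transient HTTP errors the actor actually retries, and the page fetcher is injected so the logic can run without network access.

```python
import time


def with_retries(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Wrap fetch(page) so transient failures are retried with exponential backoff."""
    def wrapped(page):
        for attempt in range(attempts):
            try:
                return fetch(page)
            except ConnectionError:  # stand-in for a transient HTTP error
                if attempt == attempts - 1:
                    raise
                sleep(base_delay * 2 ** attempt)
    return wrapped


def paginate(fetch_page, max_items=1000, delay=0.0, sleep=time.sleep):
    """Yield items page by page until a page comes back empty or max_items is hit."""
    collected = 0
    page = 1
    while collected < max_items:
        items = fetch_page(page)
        if not items:
            return
        for item in items:
            yield item
            collected += 1
            if collected >= max_items:
                return
        page += 1
        sleep(delay)  # pacing between consecutive requests


# Demo with an in-memory fake instead of a real instance.
fake_pages = {1: ["a", "b", "c"], 2: ["d", "e", "f"]}
rows = list(paginate(with_retries(lambda p: fake_pages.get(p, [])), max_items=5))
print(rows)
```

Keeping the fetcher as a parameter means the same loop would serve posts, comments, and search results, since only the request differs between modes.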

Input

| Field | Type | Description |
| --- | --- | --- |
| mode | enum | community_posts, search, community_info, or post_comments. |
| instance | string | Lemmy host, e.g. lemmy.world (default). |
| communityName | string | Community to scrape: name or name@instance.tld. Required for community modes. |
| query | string | Search keywords. Required for search mode. |
| postId | integer | Post ID. Required for post_comments mode. |
| sort | enum | Hot, Active, New, TopDay, TopWeek (community posts). |
| maxItems | integer | Max rows to collect (default 1000). |
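The per-mode requirements in the table can be checked before a run with a small validator. The sketch below is an assumption about how such a check might look: `validate_input` is a hypothetical helper, the defaults mirror the table, and the rule that maxItems must be at least 1 is a guess rather than documented behaviour.

```python
MODES = {"community_posts", "search", "community_info", "post_comments"}
SORTS = {"Hot", "Active", "New", "TopDay", "TopWeek"}


def validate_input(raw):
    """Return the input with defaults applied, or raise ValueError (hypothetical helper)."""
    # Defaults taken from the input table: instance and maxItems.
    cfg = {"instance": "lemmy.world", "maxItems": 1000, **raw}
    mode = cfg.get("mode")
    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")
    if mode in ("community_posts", "community_info") and not cfg.get("communityName"):
        raise ValueError("communityName is required for community modes")
    if mode == "search" and not cfg.get("query"):
        raise ValueError("query is required for search mode")
    if mode == "post_comments" and not isinstance(cfg.get("postId"), int):
        raise ValueError("postId (integer) is required for post_comments mode")
    if "sort" in cfg and cfg["sort"] not in SORTS:
        raise ValueError(f"sort must be one of {sorted(SORTS)}")
    # Assumption: a cap below 1 is treated as invalid.
    if not isinstance(cfg["maxItems"], int) or cfg["maxItems"] < 1:
        raise ValueError("maxItems must be a positive integer")
    return cfg


print(validate_input({"mode": "search", "query": "open source"}))
```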

Example input

{
  "mode": "community_posts",
  "instance": "lemmy.world",
  "communityName": "technology",
  "sort": "Hot",
  "maxItems": 500
}

Output example

Each post becomes one flat row:

{
  "id": 12345678,
  "title": "Open-source project hits 1.0",
  "body": "Release notes inside...",
  "url": "https://example.com/release",
  "creatorName": "dev_user",
  "creatorActorId": "https://lemmy.world/u/dev_user",
  "communityName": "technology",
  "score": 842,
  "upvotes": 870,
  "downvotes": 28,
  "commentsCount": 134,
  "published": "2026-06-20T14:03:11.000Z",
  "postUrl": "https://lemmy.world/post/12345678"
}

Comments and community-info modes produce their own flat schemas (content, creator, score, subscribers, etc.).
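To show what flattening means in practice, the sketch below turns one nested post object, shaped like the Lemmy v3 API's post view (with `post`, `creator`, `community`, and `counts` sub-objects), into the flat row shown above. The `flatten_post` helper and the exact field mapping are assumptions for illustration; the actor's real mapping may differ.

```python
def flatten_post(view, instance="lemmy.world"):
    """Map one nested Lemmy post view to a flat row (hypothetical helper)."""
    post = view.get("post", {})
    creator = view.get("creator", {})
    community = view.get("community", {})
    counts = view.get("counts", {})
    return {
        "id": post.get("id"),
        "title": post.get("name"),  # Lemmy stores the post title under "name"
        "body": post.get("body"),
        "url": post.get("url"),
        "creatorName": creator.get("name"),
        "creatorActorId": creator.get("actor_id"),
        "communityName": community.get("name"),
        "score": counts.get("score"),
        "upvotes": counts.get("upvotes"),
        "downvotes": counts.get("downvotes"),
        "commentsCount": counts.get("comments"),
        "published": post.get("published"),
        # Assumed URL pattern for the post on the scraped instance.
        "postUrl": f"https://{instance}/post/{post.get('id')}",
    }


sample = {
    "post": {"id": 12345678, "name": "Open-source project hits 1.0",
             "url": "https://example.com/release", "published": "2026-06-20T14:03:11.000Z"},
    "creator": {"name": "dev_user", "actor_id": "https://lemmy.world/u/dev_user"},
    "community": {"name": "technology"},
    "counts": {"score": 842, "upvotes": 870, "downvotes": 28, "comments": 134},
}
print(flatten_post(sample))
```

Missing fields come through as None instead of raising, which keeps the column set identical across rows when exporting to CSV.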

Use cases

  • OSINT & research — monitor communities and discussions across the fediverse.
  • Journalism — track emerging stories and public sentiment on decentralized platforms.
  • Brand monitoring — find mentions of your product or company via search mode.
  • AI / ML training data — collect open social text and threaded discussions at scale.

Why this actor

Lemmy exposes a stable, official public REST API alongside its ActivityPub federation. This scraper talks to that API directly instead of parsing fragile HTML, so it keeps working through UI changes and runs against any Lemmy server in the fediverse. There is no login to manage and there are no brittle selectors to maintain; request pacing and retries are built in, and the output is structured decentralized social data.


Keywords: Lemmy scraper, Lemmy API, scrape Lemmy, ActivityPub, fediverse data, decentralized social, Lemmy posts, Lemmy comments, federated Reddit alternative.