Lemmy Scraper - Federated Reddit Alternative

Pricing

from $0.30 / 1,000 results

Scrape posts and comments from any Lemmy instance (the open, federated Reddit alternative). Filter by community, search keyword, or pull instance-wide feeds. No login required. Built for AI training datasets, fediverse research, and community monitoring.

Developer

NIJ KANANI

Maintained by Community

Last modified: 12 days ago

🐭 Lemmy Scraper

Scrape posts and comments from any Lemmy instance, the federated, open-source Reddit alternative. No login. No rate-limit nightmares. Works with lemmy.world, lemmy.ml, beehaw.org, sh.itjust.works, and any other instance.

🎯 Built for AI/LLM training datasets, fediverse research, brand monitoring on emerging platforms, and analysis of Reddit alternatives.


✨ What you can do

  • 🏘️ Community posts: pull posts from one or many communities
  • 🔍 Search: keyword search across an instance
  • 🌐 Instance feed: top/hot/new across the whole instance
  • 💬 Optional comment trees: flattened, with paths for tree reconstruction
  • 🔁 Sort options: Hot, Active, New, Top (multiple ranges), MostComments, NewComments
  • 🌍 Cross-instance federation aware: community@instance names such as asklemmy@lemmy.ml

🚀 Quick start

{
  "instance": "lemmy.world",
  "mode": "community",
  "communities": ["technology@lemmy.world", "asklemmy@lemmy.ml"],
  "sort": "Top",
  "topRange": "TopWeek",
  "maxItems": 200
}
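As a rough illustration of how the quick-start input might be submitted from Python, the sketch below builds the same input object and hands it to the `apify-client` package. The helper names `build_run_input` and `run_actor`, the `APIFY_TOKEN` environment variable, and the Actor ID shown in the comment are assumptions made for this example; copy the real Actor ID from the store page.

```python
import os
from typing import Any


def build_run_input(communities: list[str], max_items: int = 200) -> dict[str, Any]:
    """Assemble the quick-start input; field names are copied from the example above."""
    if not communities:
        raise ValueError("community mode needs at least one community")
    return {
        "instance": "lemmy.world",
        "mode": "community",
        "communities": list(communities),
        "sort": "Top",
        "topRange": "TopWeek",
        "maxItems": max_items,
    }


def run_actor(actor_id: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Start a run and collect its dataset items (needs `pip install apify-client`)."""
    # Imported lazily so build_run_input stays usable without the package.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed env variable name
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (the Actor ID below is a placeholder, not the real one):
#   posts = run_actor("your-username/lemmy-scraper",
#                     build_run_input(["technology@lemmy.world", "asklemmy@lemmy.ml"]))
```

Keeping the input builder separate from the network call makes it easy to check the input shape locally before spending a run.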

📥 Input

Field | Description
instance | Hostname (e.g. lemmy.world)
mode | community / search / instance
communities | Names like tech or tech@lemmy.world
searchQueries | Keywords
sort | Hot, Active, New, Top, MostComments, NewComments
topRange | When sort = Top: TopHour … TopAll
maxItems | Cap per target
includeComments | Fetch comment trees
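The communities field accepts both bare names (tech) and federated names (tech@lemmy.world). One plausible way to normalize the two forms is sketched below; `split_community` is a hypothetical helper, and falling back to the configured instance for bare names is an assumption about how the Actor resolves them.

```python
def split_community(ref: str, default_instance: str) -> tuple[str, str]:
    """Split 'name' or 'name@host' into (name, host).

    Bare names are assumed to live on default_instance.
    """
    name, sep, host = ref.strip().partition("@")
    if not name:
        raise ValueError(f"invalid community reference: {ref!r}")
    if sep and not host:
        raise ValueError(f"missing instance after '@' in {ref!r}")
    return name, (host if sep else default_instance)


# Usage:
#   split_community("tech", "lemmy.world")              -> ("tech", "lemmy.world")
#   split_community("asklemmy@lemmy.ml", "lemmy.world") -> ("asklemmy", "lemmy.ml")
```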

📤 Output (per post)

{
  "instance": "lemmy.world",
  "community": "technology",
  "title": "Some headline",
  "body": "Body text or empty",
  "creator": "username",
  "creatorActor": "https://lemmy.world/u/username",
  "score": 123,
  "upvotes": 130,
  "downvotes": 7,
  "comments": 42,
  "publishedAt": "2026-04-15T...",
  "url": "https://example.com/article",
  "thumbnailUrl": "https://...",
  "nsfw": false,
  "apId": "https://lemmy.world/post/123456",
  "postUrl": "https://lemmy.world/post/123456",
  "commentsList": [
    {
      "id": 9999,
      "creator": "commenter",
      "content": "Reply text",
      "score": 12,
      "publishedAt": "...",
      "path": "0.123.456"
    }
  ]
}
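Because commentsList is flat, the path field is what allows the reply tree to be rebuilt. The sketch below shows one way to do that. It assumes the path is a dot-separated chain of comment IDs starting at a "0" root marker, with the last segment identifying the comment itself and the segment before it identifying its parent; check this against real output before relying on it. `build_comment_tree` and the `replies` key are names invented for this example.

```python
from typing import Any


def build_comment_tree(comments: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Nest flat comments under their parents using the dotted `path` field."""
    # Index every comment by the last path segment (assumed to be its own ID).
    nodes: dict[str, dict[str, Any]] = {}
    for comment in comments:
        own_key = comment["path"].split(".")[-1]
        nodes[own_key] = {**comment, "replies": []}

    roots: list[dict[str, Any]] = []
    for comment in comments:
        parts = comment["path"].split(".")
        node = nodes[parts[-1]]
        parent = nodes.get(parts[-2]) if len(parts) >= 2 else None
        # A parent of "0", or one missing because the tree was capped,
        # makes this comment a top-level entry.
        if parent is None:
            roots.append(node)
        else:
            parent["replies"].append(node)
    return roots


# Usage with made-up data:
#   flat = [{"path": "0.1"}, {"path": "0.1.2"}, {"path": "0.4"}]
#   tree = build_comment_tree(flat)   # two roots; the first has one reply
```

Treating comments with a missing parent as roots keeps orphaned replies visible when a tree has been capped, rather than dropping them silently.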

🎯 Use cases

Who | Why
🤖 AI/LLM teams | Reddit-style training data without Reddit's API gate
📚 Researchers | Federation studies, online community migration patterns
📊 Marketers | Track brand mentions on emerging platforms
📰 Journalists | Source mining on Reddit-alternative communities

βš™οΈ Tech notes

  • Uses Lemmy's official /api/v3 REST endpoints β€” fully open, no key required
  • Federation-aware: community@instance syntax works for any cross-instance pull
  • Pagination via page parameter; auto-stops when no new posts returned
  • Comment trees fetched separately and capped per post for performance
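The pagination behaviour described in these notes can be sketched roughly as follows. This is not the Actor's source code. The `/api/v3/post/list` path and its `community_name`, `sort`, `page` and `limit` query parameters reflect Lemmy's v3 API as commonly documented; passing a value such as `TopWeek` directly as the sort is an assumption about how sort and topRange are combined. `post_list_url` and `paginate` are hypothetical helpers, and `fetch_page` is any callable returning one page of posts, so the loop can be exercised without network access.

```python
from typing import Any, Callable, Iterator
from urllib.parse import urlencode


def post_list_url(instance: str, community: str, sort: str, page: int, limit: int = 50) -> str:
    """Build a post-listing URL for one page of one community."""
    query = urlencode({"community_name": community, "sort": sort, "page": page, "limit": limit})
    return f"https://{instance}/api/v3/post/list?{query}"


def paginate(fetch_page: Callable[[int], list[dict[str, Any]]], max_items: int) -> Iterator[dict[str, Any]]:
    """Yield unseen posts page by page; stop at max_items or on a page with nothing new."""
    seen: set[str] = set()
    emitted = 0
    page = 1
    while emitted < max_items:
        new_on_page = 0
        for post in fetch_page(page):
            if post["apId"] in seen:  # apId is used as the de-duplication key
                continue
            seen.add(post["apId"])
            new_on_page += 1
            emitted += 1
            yield post
            if emitted >= max_items:
                return
        if new_on_page == 0:  # empty page or only repeats: stop automatically
            return
        page += 1
```

De-duplicating on apId matters because listings sorted by Hot or Active can shift between requests, so the same post may reappear on a later page.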