Lemmy Scraper
Pricing
$1.50 / 1,000 posts returned
Scrapes public Lemmy posts from any instance (default lemmy.world) by front-page feed, community, or keyword search. Returns title, link, body, author, community, score, comments, votes, NSFW flag and thumbnail as JSON. Best for brand and product mon
Developer
Dami's Studio
Maintained by Community
Scrape public posts from any Lemmy instance — the federated, Reddit-style link aggregator. Browse an instance's front-page feed, pull a single community, or search by keyword. No account, no API key, no login.
Uses the public Lemmy v3 REST API, so reads are fast and clean (structured JSON, not HTML scraping).
Modes
- Feed: the instance front page (`/api/v3/post/list`). Just set `instance` and `sort`.
- Community: one community's posts. Set `mode: "community"` and put the community name in `query` (e.g. `technology`, or cross-instance `technology@lemmy.world`).
- Search: search posts by keyword. Set `mode: "search"` and put the term in `query` (e.g. `linux`).
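As a rough illustration of how the three modes could map onto Lemmy v3 requests, here is a minimal sketch. Only `/api/v3/post/list` for feed mode is stated above; the use of `community_name` for community mode and of `/api/v3/search` with `type_=Posts` for search mode are assumptions about how such a scraper might call the public API, and `build_request_url` is a hypothetical helper, not part of the actor.

```python
from urllib.parse import urlencode


def build_request_url(instance: str, mode: str = "feed", query: str = "",
                      sort: str = "Hot", page: int = 1, limit: int = 50) -> str:
    """Return the Lemmy v3 URL that one page of the given mode might read."""
    params = {"sort": sort, "page": page, "limit": limit}
    if mode == "feed":
        path = "/api/v3/post/list"
    elif mode == "community":
        # Assumption: community mode filters the same listing by community name.
        path = "/api/v3/post/list"
        params["community_name"] = query
    elif mode == "search":
        # Assumption: search mode uses the search endpoint, restricted to posts.
        path = "/api/v3/search"
        params["q"] = query
        params["type_"] = "Posts"
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return f"https://{instance}{path}?{urlencode(params)}"
```

For example, `build_request_url("lemmy.world", "search", "linux")` yields a search URL on `lemmy.world`; no request is sent by this helper.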
What you get per post
`id`, `title`, `url` (the external link the post points to, if any), `body` (post text; markdown, with stray HTML stripped), `author`, `authorActor` (the creator's federated actor URL), `community`, `communityTitle`, `score`, `comments`, `upvotes`, `downvotes`, `nsfw`, `thumbnail`, `published` (ISO), and `postUrl` (the permalink on the instance, e.g. `https://lemmy.world/post/123`).
Fields that can be null
- `url` / `thumbnail`: many posts are pure text discussions with no external link or image.
- `body`: link posts often have no body text.
- Any field Lemmy omits for a given post comes back `null` rather than being dropped.
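The flat row and its null handling can be pictured with the sketch below. It assumes the nested shape that Lemmy v3 listings return for each post (`post`, `creator`, `community`, `counts`); the mapping from those keys to the output fields is an illustration, not the actor's actual code, and `flatten_post` is a hypothetical name. HTML stripping of `body` is left out.

```python
from typing import Any, Optional


def flatten_post(view: dict[str, Any], instance: str) -> dict[str, Optional[Any]]:
    """Flatten one Lemmy post view into a row; missing values become None."""
    post = view.get("post") or {}
    creator = view.get("creator") or {}
    community = view.get("community") or {}
    counts = view.get("counts") or {}
    post_id = post.get("id")
    return {
        "id": post_id,
        "title": post.get("name"),
        "url": post.get("url"),            # None for pure text posts
        "body": post.get("body"),          # None for most link posts
        "author": creator.get("name"),
        "authorActor": creator.get("actor_id"),
        "community": community.get("name"),
        "communityTitle": community.get("title"),
        "score": counts.get("score"),
        "comments": counts.get("comments"),
        "upvotes": counts.get("upvotes"),
        "downvotes": counts.get("downvotes"),
        "nsfw": post.get("nsfw"),
        "thumbnail": post.get("thumbnail_url"),
        "published": post.get("published"),
        # Permalink on the scraped instance, not on the origin instance.
        "postUrl": f"https://{instance}/post/{post_id}" if post_id is not None else None,
    }
```

Because every key is always present, downstream code can read `row["url"]` without checking for a missing key and only needs to handle `None`.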
Input
| Field | Notes |
|---|---|
| `instance` | Lemmy instance host (bare domain). Default `lemmy.world`. |
| `mode` | `feed`, `community`, or `search`. Default `feed`. |
| `query` | Community name (community mode) or search term (search mode). |
| `sort` | `Hot`, `Active`, `New`, `TopDay`, `TopWeek`, `TopMonth`, `TopAll`. Default `Hot`. |
| `maxItems` | Max posts to return (paginated 50 at a time). Default 100. |
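To make the defaults and the 50-per-page pagination concrete, here is a small sketch that fills in missing fields and computes how many pages a run would need. The function names and the exact validation rules (for instance, rejecting any `instance` that contains a slash) are assumptions for illustration; the actor's own checks may differ.

```python
import math

MODES = {"feed", "community", "search"}
PAGE_SIZE = 50  # posts fetched per page, per the table above


def normalize_input(raw: dict) -> dict:
    """Apply the documented defaults and reject obviously bad input."""
    cfg = {
        "instance": str(raw.get("instance") or "lemmy.world").strip().lower(),
        "mode": raw.get("mode") or "feed",
        "query": str(raw.get("query") or "").strip(),
        "sort": raw.get("sort") or "Hot",
        "maxItems": max(1, int(raw.get("maxItems") or 100)),
    }
    # Assumption: a bare domain has a dot and no scheme or path.
    if "/" in cfg["instance"] or "." not in cfg["instance"]:
        raise ValueError("BAD_INPUT: instance must be a bare domain")
    if cfg["mode"] not in MODES:
        raise ValueError("BAD_INPUT: mode must be feed, community, or search")
    if cfg["mode"] != "feed" and not cfg["query"]:
        raise ValueError("BAD_INPUT: query is required in this mode")
    return cfg


def pages_needed(max_items: int, page_size: int = PAGE_SIZE) -> int:
    """Number of pages to request to collect up to max_items posts."""
    return math.ceil(max_items / page_size)
```

With the default `maxItems` of 100, `pages_needed(100)` is 2; a request for 120 posts would need 3 pages, with the last one only partly used.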
Output
One dataset row per post, deduped by post id. Pricing is pay-per-result: you are only charged for genuine post rows (ok: true). Rows we couldn't deliver are never charged:
- invalid input: a single `ok: false` diagnostic row with `errorCode: "BAD_INPUT"` (bad instance, bad mode, or a missing community name / search term),
- no posts for this feed/community/search (`NO_RESULTS`),
- a missing community or non-Lemmy host (`NOT_FOUND`),
- rate limits or network errors (`RATE_LIMITED` / `NETWORK`).
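On the consuming side, the split between charged post rows and uncharged diagnostic rows might be handled as in the sketch below. It assumes rows are plain dicts carrying the `ok`, `id`, and `errorCode` fields described above; `split_rows` and `estimated_cost_usd` are hypothetical helpers, and the price constant simply restates the listed $1.50 per 1,000 posts.

```python
PRICE_PER_1000_USD = 1.50  # listed pay-per-result price


def split_rows(rows: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate genuine post rows (ok: true) from diagnostic rows."""
    posts, diagnostics, seen_ids = [], [], set()
    for row in rows:
        if row.get("ok") is True:
            # A single run is already deduped by post id; this guard only
            # matters when rows from several runs are merged.
            if row.get("id") not in seen_ids:
                seen_ids.add(row.get("id"))
                posts.append(row)
        else:
            diagnostics.append(row)
    return posts, diagnostics


def estimated_cost_usd(post_count: int) -> float:
    """Estimated charge for a number of delivered post rows."""
    return post_count * PRICE_PER_1000_USD / 1000
```

A run that returns only a `NO_RESULTS` diagnostic row therefore yields an empty `posts` list and an estimated cost of zero.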
Proxy
The Lemmy v3 REST API is public and has no anti-bot protection, so no proxy is required and the default configuration runs without one (saving proxy credits). Only enable Apify Proxy if an instance rate-limits your IP at very high volume.
Troubleshooting
- `NOT_FOUND` in community mode? Check the community name. If it lives on another instance, use the cross-instance form `name@otherinstance.tld`, or set `instance` to that instance directly.
- `NO_RESULTS`? The feed/community/search genuinely returned nothing on this instance. Try a different `sort`, a broader search term, or a larger instance.
- `BAD_INPUT`? `community` and `search` modes both require `query`. `instance` must be a bare Lemmy domain like `lemmy.world`.
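The `NOT_FOUND` advice about pointing `instance` at the community's home instance can be automated along these lines. This is a sketch under the assumption that the cross-instance form is always `name@host`; `retarget_input` is a hypothetical helper that rewrites an input object before a retry.

```python
def split_community(query: str, default_instance: str = "lemmy.world") -> tuple[str, str]:
    """Split 'name@host' into (name, host); plain names keep the default host."""
    name, sep, host = query.strip().partition("@")
    return name, (host if sep and host else default_instance)


def retarget_input(cfg: dict) -> dict:
    """Return a copy of a community-mode input aimed at the home instance."""
    name, host = split_community(cfg["query"], cfg.get("instance", "lemmy.world"))
    return {**cfg, "instance": host, "query": name}
```

For example, an input with `query` set to `technology@lemmy.ml` becomes one with `instance` set to `lemmy.ml` and `query` set to `technology`; inputs without an `@` are returned unchanged apart from whitespace trimming.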
Example
{ "instance": "lemmy.world", "mode": "community", "query": "technology", "sort": "Hot", "maxItems": 50 }
Notes
Lemmy is federated: a large instance like lemmy.world also relays content from communities hosted elsewhere. The postUrl permalink points to the instance you scraped; authorActor and the community's federated identity tell you where the content originates.
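One way to tell relayed content from local content is to compare the host in `authorActor` with the host in `postUrl`, as in this sketch. It only covers the author side, since the community's federated identity is not among the row fields listed above, and both function names are hypothetical.

```python
from typing import Optional
from urllib.parse import urlparse


def host_of(url: Optional[str]) -> Optional[str]:
    """Return the hostname of a URL, or None when the field is null or empty."""
    if not url:
        return None
    return urlparse(url).hostname


def author_is_remote(row: dict) -> bool:
    """True when the author's home instance differs from the scraped one."""
    origin = host_of(row.get("authorActor"))
    scraped = host_of(row.get("postUrl"))
    return origin is not None and scraped is not None and origin != scraped
```

A row scraped from `lemmy.world` whose `authorActor` is `https://lemmy.ml/u/alice` would be reported as remote, while rows with a null `authorActor` are treated as not remote.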