Mastodon Hashtag & Account Scraper
Pricing
Pay per usage
Developer
太郎 山田
Maintained by Community
14 days ago
Last modified
Scrape public Mastodon instances for hashtag and account timelines without authentication. Multi-instance fallback covers federated reach gaps, and every status row includes engagement counts, media attachments, and language tags.
Useful for brand monitoring, social listening, journalism, and OSINT, particularly from 2026 onward, now that Twitter/X has fully closed its free API.
Compliance & intended use
This actor uses only the Mastodon public API (/api/v1/timelines/tag, /api/v1/timelines/public, /api/v1/accounts/{id}/statuses, /api/v1/accounts/lookup). No authentication required, no private content accessed, no AGPL obligation imposed (AGPL applies to server software, not API consumers — see docs/source-compliance.md).
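As a rough illustration of the endpoints listed above, the sketch below builds request URLs for a hashtag timeline, an account lookup, and an account's statuses. The function names are hypothetical and are not taken from the actor's source; `limit` and `acct` are standard Mastodon query parameters.

```python
from urllib.parse import quote, urlencode


def tag_timeline_url(instance: str, hashtag: str, limit: int = 40) -> str:
    """Public hashtag timeline on one instance (a leading '#' is dropped)."""
    tag = quote(hashtag.lstrip("#"), safe="")
    query = urlencode({"limit": limit})
    return f"https://{instance}/api/v1/timelines/tag/{tag}?{query}"


def account_lookup_url(instance: str, handle: str) -> str:
    """Resolve a user@instance handle to an account record (and its ID)."""
    query = urlencode({"acct": handle.lstrip("@")})
    return f"https://{instance}/api/v1/accounts/lookup?{query}"


def account_statuses_url(instance: str, account_id: str, limit: int = 40) -> str:
    """Statuses of an account whose ID came from the lookup call."""
    query = urlencode({"limit": limit})
    return f"https://{instance}/api/v1/accounts/{account_id}/statuses?{query}"


if __name__ == "__main__":
    print(tag_timeline_url("fosstodon.org", "#rustlang"))
    print(account_lookup_url("mastodon.social", "Gargron@mastodon.social"))
```

Account timelines take two calls: the lookup resolves the handle to an instance-local ID, which the statuses endpoint then requires.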
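Rate-limit handling: the actor spaces requests to each instance at least 250 ms apart and targets the public ceiling of 300 requests per 5 minutes documented by Mastodon. That ceiling averages one request per second, so the 250 ms gap is a floor between requests and does not by itself keep a long run under the limit.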
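The arithmetic behind the pacing can be sketched as follows. This is a minimal illustration, not the actor's implementation: `safe_interval` and `InstancePacer` are hypothetical names, and the sketch assumes the 300 requests per 5 minutes ceiling applies to each instance separately.

```python
import time

WINDOW_SECONDS = 5 * 60   # rate-limit window documented by Mastodon
WINDOW_REQUESTS = 300     # requests allowed per window
MIN_GAP_SECONDS = 0.25    # the actor's stated minimum gap


def safe_interval(window_requests: int = WINDOW_REQUESTS,
                  window_seconds: float = WINDOW_SECONDS,
                  min_gap: float = MIN_GAP_SECONDS) -> float:
    """Gap between requests that respects both the floor and the window."""
    return max(min_gap, window_seconds / window_requests)


class InstancePacer:
    """Tracks, per instance, the earliest time the next request may start."""

    def __init__(self, interval: float, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = {}

    def wait(self, instance: str) -> float:
        """Block until the instance's slot is free; return seconds slept."""
        now = self.clock()
        ready_at = self.next_allowed.get(instance, now)
        delay = max(0.0, ready_at - now)
        if delay:
            self.sleep(delay)
        self.next_allowed[instance] = max(now, ready_at) + self.interval
        return delay


if __name__ == "__main__":
    print(safe_interval())  # 300 s / 300 requests, floored at 0.25 s
```

Keeping one schedule per instance means a slow or busy instance does not delay requests to the others.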
Per-instance ToS may vary. The actor surfaces a coverage_warning when an instance returns 401/403/404/5xx on a public endpoint (admin may have disabled it) instead of retrying or pretending to have data.
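One way to picture this behaviour is a small classifier that turns an HTTP status into a warning object instead of a retry. The object's keys (`instanceHost`, `endpoint`, `httpStatus`, `reason`) are assumptions made for illustration; the actual shape of coverage_warning is not specified here.

```python
from typing import Optional

# Hypothetical reason strings; only the status codes come from the text above.
_REASONS = {
    401: "authentication required for this endpoint",
    403: "access forbidden by the instance",
    404: "endpoint or resource not found",
}


def coverage_warning(instance: str, endpoint: str, status_code: int) -> Optional[dict]:
    """Return a warning for 401/403/404/5xx responses, otherwise None."""
    if status_code in _REASONS or 500 <= status_code <= 599:
        return {
            "instanceHost": instance,
            "endpoint": endpoint,
            "httpStatus": status_code,
            "reason": _REASONS.get(status_code, "instance server error"),
        }
    return None


if __name__ == "__main__":
    print(coverage_warning("fosstodon.org", "/api/v1/timelines/public", 403))
```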
Quickstart
Provide at least one of instances, accounts, or hashtags. Results are written to the Apify dataset or, with webhook delivery, POSTed to a webhook URL.
{
  "instances": ["fosstodon.org", "hachyderm.io"],
  "hashtags": ["fediverse", "rustlang"],
  "maxStatusesPerSource": 40
}
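Before starting a run, the input can be checked locally. The sketch below only enforces the rule stated above (at least one source list must be non-empty) and tidies leading `#` and `@` characters; the function name is hypothetical and the actor's own validation may differ.

```python
def _clean(values, prefix: str = "") -> list:
    """Trim whitespace, drop an optional leading prefix, discard empties."""
    cleaned = []
    for value in values:
        text = str(value).strip()
        if prefix:
            text = text.lstrip(prefix)
        if text:
            cleaned.append(text)
    return cleaned


def validate_input(actor_input: dict) -> dict:
    """Return a tidied copy of the input, or raise ValueError if no source is set."""
    result = dict(actor_input)
    result["instances"] = _clean(actor_input.get("instances", []))
    result["accounts"] = _clean(actor_input.get("accounts", []), "@")
    result["hashtags"] = _clean(actor_input.get("hashtags", []), "#")
    if not (result["instances"] or result["accounts"] or result["hashtags"]):
        raise ValueError("provide at least one of instances, accounts, or hashtags")
    return result


if __name__ == "__main__":
    print(validate_input({"instances": ["fosstodon.org"], "hashtags": ["#fediverse"]}))
```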
Input examples
Example 1 — hashtag scan across 3 instances (federated coverage)
{
  "instances": ["fosstodon.org", "hachyderm.io", "infosec.exchange"],
  "hashtags": ["security", "ciso"],
  "maxStatusesPerSource": 40,
  "delivery": "dataset"
}
Example 2 — track a specific account
{
  "accounts": ["Gargron@mastodon.social"],
  "maxStatusesPerSource": 50,
  "delivery": "dataset"
}
Example 3 — instance public timeline + webhook delivery
{
  "instances": ["fosstodon.org"],
  "maxStatusesPerSource": 40,
  "delivery": "webhook",
  "webhookUrl": "https://your-listener.example.com/mastodon"
}
Output
Each row is a normalized status:
| Field | Type | Description |
|---|---|---|
| statusId | string | Mastodon status ID on the source instance. |
| instanceHost | string | The instance that served this status. |
| accountHandle | string | user@instance form. |
| accountDisplayName | string | Display name. |
| content | string | HTML content. |
| contentText | string | Plain-text convenience extract. |
| createdAt | string | ISO timestamp. |
| language | string or null | BCP-47 tag declared by the author. |
| repliesCount / reblogsCount / favouritesCount | integer | Engagement counts (per-instance, not federated total). |
| mediaAttachments | array | Each attachment: { type, url, description }. |
| tags | array | Hashtag strings. |
| mentions | array | Mentioned account handles. |
| visibility | string | public / unlisted (private/direct never reached). |
| url | string | Canonical status URL. |
| coverage_warning | object or null | Set when an instance / endpoint was unreachable; explains why this row is degraded or absent. |
Sample output
{
  "meta": {
    "generatedAt": "2026-05-08T09:00:00.000Z",
    "instanceCount": 3,
    "statusCount": 87,
    "warnings": []
  },
  "statuses": [
    {
      "statusId": "111234567890123456",
      "instanceHost": "fosstodon.org",
      "accountHandle": "alice@fosstodon.org",
      "accountDisplayName": "Alice",
      "content": "<p>Just published a write-up on <a href=\"https://fosstodon.org/tags/security\">#security</a>...</p>",
      "createdAt": "2026-05-08T08:42:11.000Z",
      "language": "en",
      "repliesCount": 3,
      "reblogsCount": 12,
      "favouritesCount": 28,
      "mediaAttachments": [],
      "tags": ["security"],
      "mentions": [],
      "visibility": "public",
      "url": "https://fosstodon.org/@alice/111234567890123456"
    }
  ]
}
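To show how a row like this could be derived, the sketch below maps a raw Mastodon status object onto the documented field names and extracts contentText from the HTML. The input keys (`replies_count`, `media_attachments`, `account.acct`, and so on) follow the Mastodon Status entity; the function names and the exact plain-text rules are assumptions, not the actor's code.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes; <br> becomes a newline, </p> a blank line."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag == "p":
            self.parts.append("\n\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


def normalize_status(raw: dict, instance_host: str) -> dict:
    """Map one raw status to the normalized row described in the Output table."""
    account = raw.get("account", {})
    acct = account.get("acct", "")
    # Local accounts come back without a domain, so append the serving host.
    handle = acct if "@" in acct else f"{acct}@{instance_host}"
    return {
        "statusId": raw["id"],
        "instanceHost": instance_host,
        "accountHandle": handle,
        "accountDisplayName": account.get("display_name", ""),
        "content": raw.get("content", ""),
        "contentText": html_to_text(raw.get("content", "")),
        "createdAt": raw.get("created_at"),
        "language": raw.get("language"),
        "repliesCount": raw.get("replies_count", 0),
        "reblogsCount": raw.get("reblogs_count", 0),
        "favouritesCount": raw.get("favourites_count", 0),
        "mediaAttachments": [
            {"type": m.get("type"), "url": m.get("url"), "description": m.get("description")}
            for m in raw.get("media_attachments", [])
        ],
        "tags": [t["name"] for t in raw.get("tags", [])],
        "mentions": [m["acct"] for m in raw.get("mentions", [])],
        "visibility": raw.get("visibility"),
        "url": raw.get("url"),
        "coverage_warning": None,
    }
```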
Federated coverage caveats
- A hashtag query on instance A only returns statuses that A has federated. To approximate global coverage, query 3–5 active instances (Mastodon does not have a centralised hashtag index).
- Other Fediverse software (Pleroma, Akkoma, Misskey, GoToSocial) exposes a Mastodon-style v1 surface to varying degrees and with implementation differences. The actor targets the Mastodon-compatible subset; non-compliant responses surface as coverage_warning.
- Instance admins may disable public endpoints. The actor records this rather than guessing.
- Reblog/favourite counts are per-instance, not Fediverse-wide.
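When several instances are queried for the same hashtag, the same status can come back more than once. The sketch below merges rows by their canonical url and keeps the highest count seen for each engagement field, which is only a lower bound on the Fediverse-wide figure. Whether the actor deduplicates on its own is not stated here, so treat this as a hypothetical post-processing step on the dataset rows.

```python
COUNT_FIELDS = ("repliesCount", "reblogsCount", "favouritesCount")


def merge_statuses(batches) -> list:
    """Deduplicate rows from several instances, newest first."""
    merged = {}
    for batch in batches:
        for row in batch:
            # Fall back to instance + ID when a row carries no canonical URL.
            key = row.get("url") or f'{row["instanceHost"]}/{row["statusId"]}'
            kept = merged.get(key)
            if kept is None:
                merged[key] = dict(row)
                continue
            for field in COUNT_FIELDS:
                kept[field] = max(kept.get(field, 0), row.get(field, 0))
    # ISO-8601 UTC timestamps in one format sort correctly as strings.
    return sorted(merged.values(), key=lambda r: r.get("createdAt") or "", reverse=True)


if __name__ == "__main__":
    a = [{"url": "https://fosstodon.org/@alice/1", "statusId": "1",
          "instanceHost": "fosstodon.org", "createdAt": "2026-05-08T08:42:11.000Z",
          "repliesCount": 3, "reblogsCount": 12, "favouritesCount": 28}]
    b = [{"url": "https://fosstodon.org/@alice/1", "statusId": "987",
          "instanceHost": "hachyderm.io", "createdAt": "2026-05-08T08:42:11.000Z",
          "repliesCount": 1, "reblogsCount": 15, "favouritesCount": 9}]
    print(merge_statuses([a, b]))
```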
Tips
- For brand monitoring, query 3–5 instances most active in your audience's region (e.g. mstdn.jp for Japanese audiences, fosstodon.org for tech, infosec.exchange for security).
- Hashtag timelines are richer than account-only scans for trend detection.
- Use webhook delivery to push aggregates into research / analytics pipelines.
Related actors
Social listening cluster — adjacent Apify scrapers from this account:
- Subreddit & Comment Scraper — Public Reddit content with a similar query/result shape.
Compliance reference
See docs/source-compliance.md for full rate-limit, AGPL, and per-instance ToS handling.
⭐ Was this helpful?
If this actor saved you time, please leave a ★ rating on Apify Store.
Bug report or feature request? Open an issue on the Issues tab.