Mastodon Hashtag & Account Scraper

Pricing

Pay per usage

Developer

太郎 山田

Maintained by Community

Scrape public Mastodon instances for hashtag and account timelines without authentication. Multi-instance fallback covers federated reach gaps, and every status row includes engagement counts, media attachments, and language tags.

Useful for brand monitoring, social listening, journalism, and OSINT, particularly now that Twitter/X has closed its free API.

Compliance & intended use

This actor uses only the Mastodon public API (/api/v1/timelines/tag, /api/v1/timelines/public, /api/v1/accounts/{id}/statuses, /api/v1/accounts/lookup). It requires no authentication, accesses no private content, and incurs no AGPL obligation (AGPL applies to the server software, not to API consumers; see docs/source-compliance.md).
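As a rough illustration of the four public endpoints listed above, the sketch below builds the request URLs with the Python standard library. The helper names and the default `limit` value are assumptions made for this example; they are not taken from the actor's source.

```python
from urllib.parse import quote, urlencode


def tag_timeline_url(instance: str, hashtag: str, limit: int = 40) -> str:
    """Public hashtag timeline; `hashtag` is given without the leading '#'."""
    query = urlencode({"limit": limit})
    return f"https://{instance}/api/v1/timelines/tag/{quote(hashtag, safe='')}?{query}"


def public_timeline_url(instance: str, limit: int = 40) -> str:
    """Public timeline of a single instance."""
    return f"https://{instance}/api/v1/timelines/public?{urlencode({'limit': limit})}"


def account_lookup_url(instance: str, acct: str) -> str:
    """Resolve a handle such as 'user@host' to an account record (which carries the id)."""
    return f"https://{instance}/api/v1/accounts/lookup?{urlencode({'acct': acct})}"


def account_statuses_url(instance: str, account_id: str, limit: int = 40) -> str:
    """Statuses posted by an account, addressed by the id returned from the lookup."""
    query = urlencode({"limit": limit})
    return f"https://{instance}/api/v1/accounts/{quote(account_id, safe='')}/statuses?{query}"
```

An account scan therefore takes two calls: a lookup to turn the handle into an id, then a statuses request with that id.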

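Rate-limit handling: the actor spaces requests to each instance at least 250 ms apart to respect the public ceiling of 300 requests per 5 minutes documented by Mastodon. That ceiling averages one request per second, so a long run sustained at the 250 ms minimum could still exceed it.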
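The pacing idea can be sketched as a small per-instance timer. This is a minimal illustration, not the actor's implementation: the class name, the injectable `clock` and `sleep` hooks, and the helper that derives the sustained interval are all assumptions of this example.

```python
import time
from typing import Callable, Dict

WINDOW_SECONDS = 300   # 5 minutes
WINDOW_REQUESTS = 300  # public ceiling quoted above


def sustained_interval_seconds(requests: int = WINDOW_REQUESTS,
                               window: int = WINDOW_SECONDS) -> float:
    """Average spacing that keeps a long run inside the window."""
    return window / requests


class InstancePacer:
    """Enforces a minimum gap between consecutive requests to the same instance."""

    def __init__(self, min_interval: float = 0.25,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Dict[str, float] = {}

    def wait(self, instance: str) -> float:
        """Block until `instance` may be called again; return the seconds slept."""
        now = self._clock()
        last = self._last.get(instance)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        # Record the moment the request is allowed to go out.
        self._last[instance] = now + delay
        return delay
```

Passing `min_interval=sustained_interval_seconds()` (one second) would keep even an unbounded run under 300 requests per 5 minutes per instance, while 0.25 s relies on runs being short.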

Per-instance ToS may vary. The actor surfaces a coverage_warning when an instance returns 401/403/404/5xx on a public endpoint (admin may have disabled it) instead of retrying or pretending to have data.
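A sketch of how the status codes named above could map to a warning record. The shape of the record (`instance`, `endpoint`, `status`, `reason`) and the reason strings are hypothetical; this page does not specify the fields of `coverage_warning`.

```python
from typing import Optional, TypedDict


class CoverageWarning(TypedDict):
    # Hypothetical field names, chosen for this example only.
    instance: str
    endpoint: str
    status: int
    reason: str


def coverage_warning_for(instance: str, endpoint: str, status: int) -> Optional[CoverageWarning]:
    """Return a warning for 401/403/404/5xx, or None for any other status."""
    if status in (401, 403):
        reason = "public endpoint is restricted on this instance"
    elif status == 404:
        reason = "endpoint not found or disabled on this instance"
    elif 500 <= status <= 599:
        reason = "instance returned a server error"
    else:
        # Other codes are outside the scope of this sketch.
        return None
    return {"instance": instance, "endpoint": endpoint, "status": status, "reason": reason}
```

The function records the failure once and does not retry, in line with the behaviour described above.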

Quickstart

Provide one or more of instances, accounts, or hashtags. Results land in the Apify Dataset (or POST to a webhook).

{
  "instances": ["fosstodon.org", "hachyderm.io"],
  "hashtags": ["fediverse", "rustlang"],
  "maxStatusesPerSource": 40
}

Input examples

Example 1 — hashtag scan across 3 instances (federated coverage)

{
  "instances": ["fosstodon.org", "hachyderm.io", "infosec.exchange"],
  "hashtags": ["security", "ciso"],
  "maxStatusesPerSource": 40,
  "delivery": "dataset"
}

Example 2 — track a specific account

{
  "accounts": ["Gargron@mastodon.social"],
  "maxStatusesPerSource": 50,
  "delivery": "dataset"
}

Example 3 — instance public timeline + webhook delivery

{
  "instances": ["fosstodon.org"],
  "maxStatusesPerSource": 40,
  "delivery": "webhook",
  "webhookUrl": "https://your-listener.example.com/mastodon"
}
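Before starting a run, an input like the examples above can be sanity-checked locally. The sketch below is an assumption-laden illustration: the function name is invented, and two of its rules are guesses based on the examples, namely that `delivery` defaults to `"dataset"` and that webhook delivery needs a `webhookUrl`.

```python
from typing import Any, Dict, List


def validate_input(run_input: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the input looks usable."""
    problems: List[str] = []

    # At least one source list must be present and non-empty.
    sources = [run_input.get(key) or [] for key in ("instances", "accounts", "hashtags")]
    if not any(sources):
        problems.append("provide at least one of instances, accounts, or hashtags")

    limit = run_input.get("maxStatusesPerSource")
    if limit is not None and (not isinstance(limit, int) or limit < 1):
        problems.append("maxStatusesPerSource must be a positive integer")

    # Assumption: the default delivery is the dataset, as in the Quickstart.
    delivery = run_input.get("delivery", "dataset")
    if delivery not in ("dataset", "webhook"):
        problems.append("delivery must be 'dataset' or 'webhook'")
    if delivery == "webhook" and not run_input.get("webhookUrl"):
        problems.append("webhook delivery needs webhookUrl")

    return problems
```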

Output

Each row is a normalized status:

| Field | Type | Description |
| --- | --- | --- |
| statusId | string | Mastodon status ID on the source instance. |
| instanceHost | string | The instance that served this status. |
| accountHandle | string | user@instance form. |
| accountDisplayName | string | Display name. |
| content | string | HTML content. |
| contentText | string | Plain-text convenience extract. |
| createdAt | string | ISO timestamp. |
| language | string or null | BCP-47 tag declared by the author. |
| repliesCount / reblogsCount / favouritesCount | integer | Engagement counts (per-instance, not federated total). |
| mediaAttachments | array | Each attachment: { type, url, description }. |
| tags | array | Hashtag strings. |
| mentions | array | Mentioned account handles. |
| visibility | string | public / unlisted (private/direct never reached). |
| url | string | Canonical status URL. |
| coverage_warning | object or null | Set when an instance / endpoint was unreachable; explains why this row is degraded or absent. |
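To show how a plain-text field such as `contentText` relates to the HTML in `content`, here is one possible extraction using only the standard library. It is a sketch of the general technique; how the actor actually derives `contentText` is not documented on this page, and the paragraph and line-break handling is a choice made for this example.

```python
from html.parser import HTMLParser
from typing import List


class _TextExtractor(HTMLParser):
    """Collects text nodes, turning <br> and </p> into line breaks."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag == "p":
            self.parts.append("\n\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(content: str) -> str:
    """Strip tags from a status body and decode character references."""
    parser = _TextExtractor()
    parser.feed(content)
    parser.close()  # flush any buffered text
    return "".join(parser.parts).strip()
```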

Sample output

{
  "meta": {
    "generatedAt": "2026-05-08T09:00:00.000Z",
    "instanceCount": 3,
    "statusCount": 87,
    "warnings": []
  },
  "statuses": [
    {
      "statusId": "111234567890123456",
      "instanceHost": "fosstodon.org",
      "accountHandle": "alice@fosstodon.org",
      "accountDisplayName": "Alice",
      "content": "<p>Just published a write-up on <a href=\"https://fosstodon.org/tags/security\">#security</a>...</p>",
      "createdAt": "2026-05-08T08:42:11.000Z",
      "language": "en",
      "repliesCount": 3,
      "reblogsCount": 12,
      "favouritesCount": 28,
      "mediaAttachments": [],
      "tags": ["security"],
      "mentions": [],
      "visibility": "public",
      "url": "https://fosstodon.org/@alice/111234567890123456"
    }
  ]
}

Federated coverage caveats

  • A hashtag query on instance A only returns statuses that A has federated. To approximate global coverage, query 3–5 active instances (Mastodon does not have a centralised hashtag index).
  • Other Fediverse software (Pleroma, Akkoma, Misskey, GoToSocial) exposes a Mastodon-compatible v1 surface to varying degrees. The actor targets the Mastodon-compatible subset; non-compliant responses surface as coverage_warning.
  • Instance admins may disable public endpoints. The actor records this rather than guessing.
  • Reblog/favourite counts are per-instance, not Fediverse-wide.
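When several instances are queried for the same hashtag, the same status can come back more than once, each time with that instance's own counts. The sketch below merges such rows downstream. It assumes that `url` is identical for a status no matter which instance served it, and it keeps the largest count seen per field. Both are choices of this example, not documented actor behaviour, and the maximum is best read as a lower bound, since each instance only knows the interactions it has seen.

```python
from typing import Dict, Iterable, List

COUNT_FIELDS = ("repliesCount", "reblogsCount", "favouritesCount")


def merge_statuses(rows: Iterable[dict]) -> List[dict]:
    """Deduplicate rows by canonical URL and return them newest first."""
    merged: Dict[str, dict] = {}
    for row in rows:
        kept = merged.get(row["url"])
        if kept is None:
            # First sighting: copy so the caller's row is not mutated.
            merged[row["url"]] = dict(row)
            continue
        for field in COUNT_FIELDS:
            kept[field] = max(kept.get(field, 0), row.get(field, 0))
    # ISO timestamps in one format sort correctly as strings.
    return sorted(merged.values(), key=lambda r: r["createdAt"], reverse=True)
```

The first row seen for a URL supplies all non-count fields, including `instanceHost`.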

Tips

  • For brand monitoring, query 3–5 instances most active in your audience's region (e.g. mstdn.jp for Japanese audiences, fosstodon.org for tech, infosec.exchange for security).
  • Hashtag timelines are richer than account-only scans for trend detection.
  • Use webhook delivery to push aggregates into research / analytics pipelines.

Compliance reference

See docs/source-compliance.md for full rate-limit, AGPL, and per-instance ToS handling.

⭐ Was this helpful?

If this actor saved you time, please leave a ★ rating on Apify Store.

Bug report or feature request? Open an issue on the Issues tab.