# Mastodon Scraper (`dami_studio/mastodon-scraper`) Actor

Scrapes public Mastodon posts from any instance via its REST API. Returns clean text, author, engagement counts, media URLs, and tags for posts selected by hashtag, account (@handle), or public/federated timeline. No login or API key required.

- **URL**: https://apify.com/dami\_studio/mastodon-scraper.md
- **Developed by:** [Dami's Studio](https://apify.com/dami_studio) (community)
- **Categories:** Social media, Integrations, News
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

$1.50 / 1,000 posts returned

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
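
As a rough guide, the rate above can be turned into a per-run estimate. The following is a minimal sketch, assuming the listed price of $1.50 per 1,000 returned posts and that only returned posts are billed; `estimate_cost_usd` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal

# Assumed rate, copied from the pricing line above: $1.50 per 1,000 returned posts.
PRICE_PER_1000_POSTS_USD = Decimal("1.50")


def estimate_cost_usd(returned_posts: int) -> Decimal:
    """Estimate the pay-per-event charge for a given number of returned posts."""
    if returned_posts < 0:
        raise ValueError("returned_posts must be non-negative")
    return PRICE_PER_1000_POSTS_USD * returned_posts / 1000


if __name__ == "__main__":
    # Upper bound for a run that returns 100 posts.
    print(estimate_cost_usd(100))
```

`Decimal` is used instead of `float` so that small per-post amounts do not pick up rounding noise when summed across many runs.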

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Mastodon Scraper

Scrape public **Mastodon** posts from any instance via its open REST API, with no login, no API key, and no anti-bot measures to work around. Pull a **hashtag** timeline, an **account's** toots, or the instance **public/federated** timeline.

### What it does

Given an instance host and a mode, the Actor calls the public Mastodon API and returns clean, structured posts:

- **Hashtag timeline** — every recent post tagged with a hashtag.
- **Account toots** — resolves an `@handle` to its account, then pulls that account's statuses (reblogs included). For a reblog/boost row, the `author` and `authorName` fields show the **reblogger** (the account whose timeline it appeared on), while the `text`, `mediaUrls` and `tags` come from the original boosted post.
- **Public timeline** — the instance's public timeline; toggle `local` for the local-only vs. federated view.
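
The three modes line up with standard Mastodon REST endpoints. The sketch below shows one plausible mapping; which endpoints the Actor actually calls is an assumption, since its source is not shown here, and `build_timeline_url` is a hypothetical helper. Account mode is a two-step flow, so the function expects an already-resolved account ID.

```python
from typing import Optional
from urllib.parse import quote, urlencode

PAGE_SIZE = 40  # the README states 40 posts per request


def build_timeline_url(instance: str, mode: str, query: str = "",
                       local: bool = False,
                       account_id: Optional[str] = None) -> str:
    """Return the first-page URL for a mode (assumed endpoint mapping)."""
    base = f"https://{instance}/api/v1"
    params = {"limit": PAGE_SIZE}
    if mode == "hashtag":
        path = f"/timelines/tag/{quote(query.lstrip('#'), safe='')}"
    elif mode == "account":
        # Assumed flow: the handle is first resolved to an ID (for example
        # via /api/v1/accounts/lookup), then that account's statuses are listed.
        if account_id is None:
            raise ValueError("account mode needs a resolved account_id")
        path = f"/accounts/{quote(account_id, safe='')}/statuses"
    elif mode == "public":
        path = "/timelines/public"
        if local:
            params["local"] = "true"
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return f"{base}{path}?{urlencode(params)}"
```

For example, `build_timeline_url("fosstodon.org", "public", local=True)` yields the local-only public timeline URL for that instance.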

It pages automatically (40 posts per request, following the `Link` rel="next" header or `max_id`) until it reaches your `maxItems` or the timeline runs out. Post HTML is converted to plain text with a small dependency-free stripper.
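
To illustrate the paging step, here is a minimal sketch of reading the `rel="next"` target from a `Link` response header. It is a simplified parser that assumes each entry has the form `<url>; rel="name"`; the Actor's own implementation is not shown in this README, and `next_page_url` is a hypothetical name.

```python
import re
from typing import Optional

# Matches one entry of a Link header in the simple form: <url>; rel="name"
_LINK_ENTRY = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Return the rel="next" URL from a Link header, or None if there is none."""
    if not link_header:
        return None
    for url, rel in _LINK_ENTRY.findall(link_header):
        # A rel value may list several space-separated relation types.
        if "next" in rel.split():
            return url
    return None
```

A paging loop would request the first page, call `next_page_url` on the response's `Link` header, and stop when it returns `None` or when `maxItems` posts have been collected.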

### Input

| Field | Type | Default | Notes |
|-------|------|---------|-------|
| `instance` | string | `mastodon.social` | Instance host, no scheme (e.g. `fosstodon.org`). |
| `mode` | string | `hashtag` | `hashtag`, `account`, or `public`. |
| `query` | string | — | Hashtag (hashtag mode) or `@handle` (account mode). Ignored for public. |
| `local` | boolean | `false` | Public mode only: local timeline if on, federated if off. |
| `maxItems` | integer | `100` | Max posts to return. |
| `proxyConfiguration` | object | none | **Optional.** These endpoints have no anti-bot protection, so a proxy gives no benefit. Leave it off unless you hit instance-level IP rate limits. |

#### Examples

```json
{ "instance": "mastodon.social", "mode": "hashtag", "query": "opensource", "maxItems": 100 }
````

```json
{ "instance": "mastodon.social", "mode": "account", "query": "Mastodon", "maxItems": 50 }
```

```json
{ "instance": "mastodon.social", "mode": "public", "local": true, "maxItems": 80 }
```
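
Before starting a run, it can help to check an input object locally. The sketch below mirrors the rules this document states (allowed modes, a required `query` in hashtag and account modes, and the 1 to 2000 range for `maxItems` given in the OpenAPI specification). `check_input` is a hypothetical client-side helper, not the Actor's own validation.

```python
from typing import Any, Dict, List

VALID_MODES = {"hashtag", "account", "public"}


def check_input(actor_input: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    mode = actor_input.get("mode", "hashtag")  # documented default
    if mode not in VALID_MODES:
        problems.append(f"unknown mode: {mode!r}")
    elif mode != "public" and not str(actor_input.get("query") or "").strip():
        problems.append(f"{mode} mode requires a non-empty query")
    max_items = actor_input.get("maxItems", 100)  # documented default
    # bool is a subclass of int in Python, so reject it explicitly.
    if (isinstance(max_items, bool) or not isinstance(max_items, int)
            or not 1 <= max_items <= 2000):
        problems.append("maxItems must be an integer between 1 and 2000")
    return problems
```

Catching these cases locally avoids spending a run on an input that would only produce a `BAD_INPUT` diagnostic row.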

### Output

Each successful row:

```json
{
  "ok": true,
  "id": "string",
  "text": "HTML-stripped post text",
  "author": "user@instance",
  "authorName": "Display Name",
  "authorFollowers": 12345,
  "createdAt": "2026-06-11T12:34:56.000Z",
  "repliesCount": 3,
  "reblogsCount": 10,
  "favouritesCount": 42,
  "language": "en",
  "mediaUrls": ["https://..."],
  "tags": ["opensource"],
  "url": "https://mastodon.social/@user/123"
}
```

Posts are deduplicated by `id`. You are charged one **post** event per returned row. Diagnostic rows (`ok: false`) and empty/blocked runs are **never** charged.

**Nullable fields:** `author`, `authorName`, `authorFollowers`, `language`, and `url` can be `null` when the instance omits them for a given post (e.g. an account with a hidden display name, or a status without a canonical URL yet). `mediaUrls` and `tags` are always arrays (possibly empty); count fields default to `0`.
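
Because several fields may be `null`, downstream code should not assume they are strings. Here is a small consumer-side sketch that formats a row defensively; `summarize` and its fallback labels are illustrative choices, not part of the Actor.

```python
from typing import Any, Dict


def summarize(row: Dict[str, Any]) -> str:
    """Build a one-line summary of a row, tolerating the nullable fields."""
    author = row.get("author") or "(unknown author)"
    name = row.get("authorName") or author
    language = row.get("language") or "und"  # "und" is our own fallback label
    link = row.get("url") or "(no canonical URL)"
    text = (row.get("text") or "").replace("\n", " ")
    media = len(row.get("mediaUrls") or [])
    return f"[{language}] {name} <{author}>: {text[:80]} ({media} media) {link}"
```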

### Diagnostics

The Actor never returns an empty dataset silently. When something goes wrong, it pushes a single diagnostic row (`ok: false`) with an `errorCode` and does not charge for it:

- `BAD_INPUT` — the input was invalid (unknown `mode`, or `hashtag`/`account` mode with no `query`). The row's `error`/`hint` explain how to fix it; the run finishes cleanly without charging.
- `NOT_FOUND` — account/handle or instance not found.
- `NO_RESULTS` — the request succeeded but the timeline was empty.
- `RATE_LIMITED` / `SERVER_ERROR` / `BLOCKED` — transient target issues (retried with backoff first).
- `NETWORK` — could not reach the instance.
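
When reading a run's dataset, it is worth separating real posts from diagnostic rows before processing. The sketch below assumes only the `ok` and `errorCode` fields described above; `split_rows` and `should_retry` are hypothetical helpers, and treating the three codes listed as transient as retryable is a design choice of this example.

```python
from typing import Any, Dict, Iterable, List, Tuple

Row = Dict[str, Any]

# Codes the list above describes as transient target issues.
TRANSIENT_CODES = {"RATE_LIMITED", "SERVER_ERROR", "BLOCKED"}


def split_rows(items: Iterable[Row]) -> Tuple[List[Row], List[Row]]:
    """Split dataset items into (posts, diagnostics) using the `ok` flag."""
    posts: List[Row] = []
    diagnostics: List[Row] = []
    for item in items:
        (posts if item.get("ok") else diagnostics).append(item)
    return posts, diagnostics


def should_retry(diagnostics: Iterable[Row]) -> bool:
    """Return True if any diagnostic row carries a transient error code."""
    return any(d.get("errorCode") in TRANSIENT_CODES for d in diagnostics)
```

A caller might schedule another run when `should_retry` is true and surface the `error`/`hint` fields to the user otherwise.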

#### Troubleshooting

- **Got `BLOCKED` on `public` mode?** Some large instances (notably **mastodon.social**) require authentication for the public/federated timeline, so unauthenticated calls are rejected and the Actor surfaces a `BLOCKED` diagnostic. Workarounds: use `hashtag` or `account` mode instead, point `instance` at an instance that leaves its public timeline open (e.g. `fosstodon.org`, `mas.to`), or enable a residential proxy if the block is purely IP-based.
- **Got `BAD_INPUT`?** Check the `error`/`hint` fields in the diagnostic row — usually a missing `query` in hashtag/account mode.

### Notes

- Sends `User-Agent: dami-studios-actor`.
- No third-party HTML parser — `content` HTML is converted to text in-house.
- Works against any Mastodon-compatible instance that exposes the standard public timelines.
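
For readers curious what a dependency-free HTML-to-text step can look like, here is a minimal sketch built on Python's standard `html.parser`. It is a stand-in for illustration only: the Actor's in-house stripper is not shown in this README and may behave differently (for example in how it treats line breaks). `html_to_text` is a hypothetical name.

```python
from html.parser import HTMLParser
from typing import List


class _TextExtractor(HTMLParser):
    """Collects text nodes, turning <br> and paragraph ends into newlines."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag == "p":
            self.parts.append("\n\n")

    def handle_data(self, data):
        # Entities such as &amp; arrive already decoded (convert_charrefs=True).
        self.parts.append(data)


def html_to_text(content: str) -> str:
    """Convert a post's HTML `content` into plain text."""
    parser = _TextExtractor()
    parser.feed(content)
    parser.close()
    return "".join(parser.parts).strip()
```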

# Actor input Schema

## `instance` (type: `string`):

The Mastodon instance host to scrape, without the scheme (e.g. "mastodon.social", "fosstodon.org", "mas.to"). The actor calls that instance's public REST API.

## `mode` (type: `string`):

What to scrape: "hashtag" timeline (set query to the hashtag), "account" toots (set query to the @handle), or the "public" / federated timeline of the instance. Note: in account mode, boosted/reblogged rows show the reblogger as author (not the original poster). Some instances (e.g. mastodon.social) require auth for the public timeline and will return a BLOCKED diagnostic in public mode.
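
The boost behaviour described above can be pictured with the standard Mastodon status shape, where a boost is a wrapper status whose `reblog` key holds the original post. The sketch below is an assumed mapping for illustration: `to_row` is hypothetical, it covers only a few output fields, and it leaves `author` as the API's `acct` value rather than reproducing the Actor's `user@instance` formatting.

```python
from typing import Any, Dict


def to_row(status: Dict[str, Any]) -> Dict[str, Any]:
    """Map a Mastodon status to a partial output row (assumed mapping)."""
    # For a boost, the wrapper status belongs to the reblogger and the
    # original post sits under the "reblog" key.
    source = status.get("reblog") or status
    account = status.get("account") or {}
    return {
        "id": status["id"],
        "author": account.get("acct"),  # the reblogger for boosts
        "authorName": account.get("display_name") or None,
        "text": source.get("content", ""),  # still HTML at this stage
        "mediaUrls": [m["url"] for m in source.get("media_attachments", [])
                      if m.get("url")],
        "tags": [t["name"] for t in source.get("tags", [])],
    }
```

If you need the original poster of a boost, this design implies looking it up separately, since the row's author fields describe the account whose timeline was scraped.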

## `query` (type: `string`):

For hashtag mode: the hashtag to fetch, with or without the leading # (e.g. "opensource"). For account mode: the handle, with or without @ (e.g. "Mastodon" or "user@otherinstance.social"). Ignored in public mode.
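
Since the leading `#` or `@` is optional, client code that builds inputs from user text may want to normalize it first. This is a minimal sketch of that idea; `normalize_query` is a hypothetical helper and the Actor's own handling may differ in detail.

```python
def normalize_query(mode: str, query: str) -> str:
    """Strip one optional leading '#' (hashtag mode) or '@' (account mode)."""
    cleaned = query.strip()
    if mode == "hashtag":
        return cleaned[1:] if cleaned.startswith("#") else cleaned
    if mode == "account":
        # Only the first '@' is removed, so remote handles keep their domain.
        return cleaned[1:] if cleaned.startswith("@") else cleaned
    return ""  # the query is ignored in public mode
```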

## `local` (type: `boolean`):

Public mode only. If on, returns only posts originating on this instance (local timeline). If off, returns the federated timeline (posts from across the fediverse). Ignored in hashtag and account modes.

## `maxItems` (type: `integer`):

Maximum number of posts to return. The actor pages through the timeline (40 per request) until it reaches this limit or runs out of posts.

## `notionConnector` (type: `string`):

Optional. Write each post as a page into your Notion when the run finishes. Authorize a Notion connector once in Settings → API & Integrations → MCP connectors, then pick it here. Leave empty to skip (default) — results are always saved to the dataset regardless.

## `notionParentId` (type: `string`):

Optional. The Notion data source ID of the database to write into (only used if a Notion connector is set). Leave empty to create the pages privately in your workspace instead.

## `proxyConfiguration` (type: `object`):

OPTIONAL. The Mastodon public REST API has no anti-bot, so a proxy is not needed and Apify Proxy gives no benefit here. Leave this off (default) unless you hit IP-based rate limits on a specific instance, in which case enabling a proxy can help.

## Actor input object example

```json
{
  "instance": "mastodon.social",
  "mode": "hashtag",
  "query": "opensource",
  "local": false,
  "maxItems": 100,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```

# Actor output Schema

## `results` (type: `string`):

Scraped rows are stored in the default dataset (one row per result). Blocked/empty/error runs return a single uncharged diagnostic row instead.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "instance": "mastodon.social",
    "query": "opensource"
};

// Run the Actor and wait for it to finish
const run = await client.actor("dami_studio/mastodon-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "instance": "mastodon.social",
    "query": "opensource",
}

# Run the Actor and wait for it to finish
run = client.actor("dami_studio/mastodon-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "instance": "mastodon.social",
  "query": "opensource"
}' |
apify call dami_studio/mastodon-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=dami_studio/mastodon-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Mastodon Scraper",
        "description": "Scrapes public Mastodon posts from any instance via its REST API. Returns clean text, author, engagement counts, media URLs and tags by hashtag, account (@handle), or public/federated timeline. No login or API key.",
        "version": "0.1",
        "x-build-id": "PVW5eZye3FaKGei2L"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/dami_studio~mastodon-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-dami_studio-mastodon-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/dami_studio~mastodon-scraper/runs": {
            "post": {
                "operationId": "runs-sync-dami_studio-mastodon-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/dami_studio~mastodon-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-dami_studio-mastodon-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "instance": {
                        "title": "Instance",
                        "type": "string",
                        "description": "The Mastodon instance host to scrape, without the scheme (e.g. \"mastodon.social\", \"fosstodon.org\", \"mas.to\"). The actor calls that instance's public REST API.",
                        "default": "mastodon.social"
                    },
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "hashtag",
                            "account",
                            "public"
                        ],
                        "type": "string",
                        "description": "What to scrape: \"hashtag\" timeline (set query to the hashtag), \"account\" toots (set query to the @handle), or the \"public\" / federated timeline of the instance. Note: in account mode, boosted/reblogged rows show the reblogger as author (not the original poster). Some instances (e.g. mastodon.social) require auth for the public timeline and will return a BLOCKED diagnostic in public mode.",
                        "default": "hashtag"
                    },
                    "query": {
                        "title": "Query (hashtag or @handle)",
                        "type": "string",
                        "description": "For hashtag mode: the hashtag to fetch, with or without the leading # (e.g. \"opensource\"). For account mode: the handle, with or without @ (e.g. \"Mastodon\" or \"user@otherinstance.social\"). Ignored in public mode."
                    },
                    "local": {
                        "title": "Local timeline only (public mode)",
                        "type": "boolean",
                        "description": "Public mode only. If on, returns only posts originating on this instance (local timeline). If off, returns the federated timeline (posts from across the fediverse). Ignored in hashtag and account modes.",
                        "default": false
                    },
                    "maxItems": {
                        "title": "Max posts",
                        "minimum": 1,
                        "maximum": 2000,
                        "type": "integer",
                        "description": "Maximum number of posts to return. The actor pages through the timeline (40 per request) until it reaches this limit or runs out of posts.",
                        "default": 100
                    },
                    "notionConnector": {
                        "title": "Notion connector (optional)",
                        "type": "string",
                        "description": "Optional. Write each post as a page into your Notion when the run finishes. Authorize a Notion connector once in Settings → API & Integrations → MCP connectors, then pick it here. Leave empty to skip (default) — results are always saved to the dataset regardless."
                    },
                    "notionParentId": {
                        "title": "Notion target data source ID",
                        "type": "string",
                        "description": "Optional. The Notion data source ID of the database to write into (only used if a Notion connector is set). Leave empty to create the pages privately in your workspace instead."
                    },
                    "proxyConfiguration": {
                        "title": "Proxy",
                        "type": "object",
                        "description": "OPTIONAL. The Mastodon public REST API has no anti-bot, so a proxy is not needed and Apify Proxy gives no benefit here. Leave this off (default) unless you hit IP-based rate limits on a specific instance, in which case enabling a proxy can help.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
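
For projects that call the REST API directly, the `run-sync-get-dataset-items` path above can be exercised with the Python standard library alone. The sketch below only builds the request, so it can be inspected without a token or network access; `build_sync_request` is a hypothetical helper, and parsing the response as a JSON list of items is an assumption based on the endpoint's summary.

```python
import json
import urllib.request
from typing import Any, Dict
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/dami_studio~mastodon-scraper/run-sync-get-dataset-items"


def build_sync_request(token: str,
                       actor_input: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) the POST request described by the spec."""
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    request = build_sync_request(
        "<YOUR_API_TOKEN>",
        {"instance": "mastodon.social", "query": "opensource"},
    )
    # Sending requires a valid token and network access, for example:
    # with urllib.request.urlopen(request) as response:
    #     items = json.load(response)  # assumed: a JSON array of rows
    print(request.get_method(), request.full_url)
```

The spec declares `token` as a query parameter, which is why it is appended to the URL here.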
