# Bluesky Scraper (`opendata-labs/bluesky-scraper`) Actor

Scrape Bluesky via the official AT Protocol API. Export posts, profiles, user feeds and search results to JSON, CSV or
Excel. For brand monitoring, social media research and AI datasets.

- **URL**: https://apify.com/opendata-labs/bluesky-scraper.md
- **Developed by:** [Joao Paulo](https://apify.com/opendata-labs) (community)
- **Categories:** Social media, News, AI
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Bluesky Scraper

**Scrape Bluesky posts, profiles and users at scale.** Built directly on the official AT Protocol API for stable, reliable social media data. Most modes need no login — only post search uses a free Bluesky app password.

### What it does

Bluesky Scraper extracts public data from [Bluesky](https://bsky.app) and exports it as clean, structured rows (JSON, CSV, Excel). It covers four modes so you can pull exactly the data you need:

- **Search posts** — find every public post matching a keyword or phrase. *(Requires a free Bluesky app password — Bluesky requires login to search posts.)*
- **User posts** — collect the full author feed for one or more handles. *(No login.)*
- **Profiles** — fetch profile details (followers, bio, post counts) for a list of handles. *(No login.)*
- **Search users** — discover accounts matching a query. *(No login.)*

No browser automation, no scraping of HTML — every request goes through the official AT Protocol API, so results are fast and consistent.

### Features

- Four scrape modes (`search_posts`, `user_posts`, `profile`, `search_users`).
- Automatic pagination via cursors up to your `maxItems` limit.
- Flattened, ready-to-use output — no raw nested AT Protocol objects to untangle.
- Direct `bsky.app` post URLs derived for every post.
- Polite request pacing plus automatic retries on transient errors.
- Pay-per-result friendly, with export to JSON, CSV or Excel, or access via the API.
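
The "automatic retries on transient errors" feature above can be pictured as a retry loop with exponential backoff. The Actor's actual retry policy is not documented here, so the sketch below is only an illustration: `TransientError`, `call_with_retries`, and the attempt and delay values are all hypothetical.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (timeouts, HTTP 429/5xx)."""


def call_with_retries(
    request: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying transient failures with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # The delay doubles on each attempt; jitter spreads out retries.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.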

### Input

| Field         | Type             | Description                                                                              |
| ------------- | ---------------- | ---------------------------------------------------------------------------------------- |
| `mode`        | enum             | One of `search_posts`, `user_posts`, `profile`, `search_users`.                          |
| `query`       | string           | Search term. Required for `search_posts` and `search_users`.                             |
| `handles`     | array of strings | Handles (e.g. `bsky.app`). Required for `user_posts` and `profile`.                      |
| `maxItems`    | integer          | Max items to scrape across all inputs. Default `1000`.                                   |
| `identifier`  | string           | Your Bluesky handle/email. Required **only** for `search_posts`.                         |
| `appPassword` | string (secret)  | A Bluesky app password (Settings → App Passwords). Required **only** for `search_posts`. |

#### Authentication (only for `search_posts`)

Bluesky requires a logged-in session to search posts. Create a free **app password** at **Bluesky → Settings → App Passwords** (never use your main password) and pass it as `appPassword` along with your handle as `identifier`. All other modes work without any credentials.
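
To keep the app password out of source code, one option is to read the credentials from environment variables when building the input. This is a minimal sketch: the helper `build_search_input` and the variable names `BLUESKY_IDENTIFIER` and `BLUESKY_APP_PASSWORD` are assumptions, not something the Actor defines.

```python
import os
from typing import Mapping


def build_search_input(
    query: str,
    max_items: int = 500,
    env: Mapping[str, str] = os.environ,
) -> dict:
    """Build a `search_posts` input, reading credentials from the environment."""
    # BLUESKY_IDENTIFIER and BLUESKY_APP_PASSWORD are hypothetical names.
    identifier = env.get("BLUESKY_IDENTIFIER")
    app_password = env.get("BLUESKY_APP_PASSWORD")
    if not identifier or not app_password:
        raise ValueError(
            "search_posts needs BLUESKY_IDENTIFIER and BLUESKY_APP_PASSWORD"
        )
    return {
        "mode": "search_posts",
        "query": query,
        "identifier": identifier,
        "appPassword": app_password,
        "maxItems": max_items,
    }
```

The returned dictionary has the same shape as the example input in this section and can be passed to any of the clients in the API section.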

#### Example input

```json
{
  "mode": "search_posts",
  "query": "artificial intelligence",
  "identifier": "you.bsky.social",
  "appPassword": "xxxx-xxxx-xxxx-xxxx",
  "maxItems": 500
}
````

### Output example

Each post is flattened into a row like this:

```json
{
  "uri": "at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.feed.post/3kabc123xyz",
  "cid": "bafyreih...",
  "authorHandle": "bsky.app",
  "authorDisplayName": "Bluesky",
  "text": "Welcome to Bluesky!",
  "createdAt": "2025-01-10T12:34:56.000Z",
  "likeCount": 1280,
  "repostCount": 215,
  "replyCount": 64,
  "langs": ["en"],
  "url": "https://bsky.app/profile/bsky.app/post/3kabc123xyz"
}
```

Profiles are flattened to: `did`, `handle`, `displayName`, `description`, `followersCount`, `followsCount`, `postsCount`, `avatar`.
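
In the post row above, the `url` field combines the author handle with the last segment (the record key) of the `uri`. How the Actor builds it internally is not documented here; the sketch below simply reproduces the mapping visible in the example, and `post_url` is a hypothetical helper name.

```python
def post_url(uri: str, handle: str) -> str:
    """Derive a bsky.app post URL from an AT URI and the author's handle."""
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT URI: {uri!r}")
    # Expected shape: at://<did>/<collection>/<record key>
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or parts[1] != "app.bsky.feed.post":
        raise ValueError(f"not a post URI: {uri!r}")
    record_key = parts[2]
    return f"https://bsky.app/profile/{handle}/post/{record_key}"


example = post_url(
    "at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.feed.post/3kabc123xyz",
    "bsky.app",
)
print(example)  # https://bsky.app/profile/bsky.app/post/3kabc123xyz
```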

### Use cases

- **Brand monitoring** — track mentions of your product, company, or campaign in real time.
- **Market research** — measure share of voice, sentiment, and trending topics on Bluesky.
- **AI training datasets** — build clean, structured text corpora from public social posts.
- **OSINT & research** — map accounts, follower networks, and posting activity around a topic.

### Why this Actor

This Bluesky scraper is built on the **official public AT Protocol API**, not fragile HTML scraping or undocumented endpoints. That means:

- **Stable** — uses the same documented endpoints the Bluesky app relies on.
- **Minimal setup** — most modes need no login; only post search uses a free app password.
- **Fast & clean** — structured JSON output ready for analysis or import.

Whether you need to scrape Bluesky posts for social media data, monitor a brand, or build datasets from the AT Protocol network, this Bluesky API Actor delivers reliable results.

# Actor input Schema

## `mode` (type: `string`):

What to scrape. `search_posts` and `search_users` use the Query field. `user_posts` and `profile` use the Handles field.

## `query` (type: `string`):

Search term. Required for `search_posts` and `search_users` modes. Ignored for the other modes.

## `handles` (type: `array`):

List of Bluesky handles (e.g. `bsky.app` or `jay.bsky.team`). Required for `user_posts` and `profile` modes. In `user_posts` mode each handle's feed is scraped; in `profile` mode each handle's profile is fetched.

## `maxItems` (type: `integer`):

Maximum number of items (posts or profiles) to scrape across all inputs. Pagination stops once this is reached.
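
Cursor pagination capped by `maxItems` can be sketched as follows. This illustrates the general pattern rather than the Actor's own code: `paginate` and `fetch_page` are hypothetical names, and `fetch_page` stands in for any call that takes a cursor and returns one page of items plus the next cursor (or `None` when there are no more pages).

```python
from typing import Callable, Iterator, List, Optional, Tuple

Page = Tuple[List[dict], Optional[str]]


def paginate(
    fetch_page: Callable[[Optional[str]], Page],
    max_items: int,
) -> Iterator[dict]:
    """Yield items page by page until the cursor runs out or max_items is hit."""
    cursor: Optional[str] = None
    yielded = 0
    while yielded < max_items:
        items, cursor = fetch_page(cursor)
        for item in items:
            if yielded >= max_items:
                return  # limit reached in the middle of a page
            yield item
            yielded += 1
        if not cursor or not items:
            return  # no further pages


# Usage with an in-memory stand-in for the remote API.
PAGES = {
    None: ([{"n": 1}, {"n": 2}], "c1"),
    "c1": ([{"n": 3}, {"n": 4}], "c2"),
    "c2": ([{"n": 5}], None),
}
print([item["n"] for item in paginate(PAGES.__getitem__, 3)])  # [1, 2, 3]
```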

## `identifier` (type: `string`):

Your Bluesky handle or email, e.g. `you.bsky.social`. Required only for `search_posts` mode (Bluesky requires login to search posts). The other modes work without login.

## `appPassword` (type: `string`):

An app password created at Bluesky → Settings → App Passwords (NOT your main password). Required only for 'search\_posts' mode.
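
The per-mode requirements described above can be checked on the client before starting a run. The following is a minimal sketch that encodes only the rules stated in this schema. The function name `validate_input` is hypothetical, and the Actor may apply further checks of its own.

```python
from typing import List

QUERY_MODES = {"search_posts", "search_users"}
HANDLE_MODES = {"user_posts", "profile"}


def validate_input(actor_input: dict) -> List[str]:
    """Return a list of problems with the input; an empty list means it looks OK."""
    mode = actor_input.get("mode")
    if mode not in QUERY_MODES | HANDLE_MODES:
        return [f"unknown or missing mode: {mode!r}"]

    problems = []
    if mode in QUERY_MODES and not actor_input.get("query"):
        problems.append(f"'query' is required for mode {mode!r}")
    if mode in HANDLE_MODES and not actor_input.get("handles"):
        problems.append(f"'handles' is required for mode {mode!r}")
    if mode == "search_posts":
        # Only post search needs Bluesky credentials.
        for field in ("identifier", "appPassword"):
            if not actor_input.get(field):
                problems.append(f"'{field}' is required for mode 'search_posts'")
    max_items = actor_input.get("maxItems", 1000)
    if not isinstance(max_items, int) or max_items < 1:
        problems.append("'maxItems' must be an integer >= 1")
    return problems
```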

## Actor input object example

```json
{
  "mode": "user_posts",
  "query": "artificial intelligence",
  "handles": [
    "bsky.app"
  ],
  "maxItems": 1000
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
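// `mode` is required by the input schema; `query` is ignored in `user_posts` mode
const input = {
    "mode": "user_posts",
    "query": "artificial intelligence",
    "handles": [
        "bsky.app"
    ]
};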

// Run the Actor and wait for it to finish
const run = await client.actor("opendata-labs/bluesky-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
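# `mode` is required by the input schema; `query` is ignored in `user_posts` mode
run_input = {
    "mode": "user_posts",
    "query": "artificial intelligence",
    "handles": ["bsky.app"],
}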

# Run the Actor and wait for it to finish
run = client.actor("opendata-labs/bluesky-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
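
If you prefer to post-process rows locally instead of using the platform's export, the flattened post rows can be written to CSV with the standard library. This is a sketch under assumptions: the column list mirrors the post example in the README, the `|` separator for `langs` is an arbitrary choice, and `rows_to_csv` is a hypothetical helper.

```python
import csv
import io
from typing import Iterable, Sequence

# Column names taken from the flattened post example in the README.
POST_COLUMNS = [
    "uri", "cid", "authorHandle", "authorDisplayName", "text", "createdAt",
    "likeCount", "repostCount", "replyCount", "langs", "url",
]


def rows_to_csv(rows: Iterable[dict], columns: Sequence[str] = POST_COLUMNS) -> str:
    """Serialize post rows to CSV text; unknown keys are dropped, missing ones left blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(columns), extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        flat = dict(row)
        # `langs` is a list in the JSON row; join it so it fits in one cell.
        if isinstance(flat.get("langs"), list):
            flat["langs"] = "|".join(flat["langs"])
        writer.writerow(flat)
    return buffer.getvalue()
```

With the Python example above, `rows_to_csv(client.dataset(run["defaultDatasetId"]).iterate_items())` would return the CSV text for a run's results.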

## CLI example

```bash
echo '{
  "mode": "user_posts",
  "query": "artificial intelligence",
  "handles": [
    "bsky.app"
  ]
}' |
apify call opendata-labs/bluesky-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=opendata-labs/bluesky-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Bluesky Scraper",
        "description": "Scrape Bluesky via the official AT Protocol API. Export posts, profiles, user feeds and search results to JSON, CSV or\nExcel. For brand monitoring, social media research and AI datasets.",
        "version": "0.1",
        "x-build-id": "2VV6civmWXbXs1U2H"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/opendata-labs~bluesky-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-opendata-labs-bluesky-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/opendata-labs~bluesky-scraper/runs": {
            "post": {
                "operationId": "runs-sync-opendata-labs-bluesky-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/opendata-labs~bluesky-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-opendata-labs-bluesky-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "mode"
                ],
                "properties": {
                    "mode": {
                        "title": "Scrape mode",
                        "enum": [
                            "search_posts",
                            "user_posts",
                            "profile",
                            "search_users"
                        ],
                        "type": "string",
                        "description": "What to scrape. 'search_posts' and 'search_users' use the Query field. 'user_posts' and 'profile' use the Handles field.",
                        "default": "user_posts"
                    },
                    "query": {
                        "title": "Search query",
                        "type": "string",
                        "description": "Search term. Required for 'search_posts' and 'search_users' modes. Ignored for the other modes."
                    },
                    "handles": {
                        "title": "Handles",
                        "type": "array",
                        "description": "List of Bluesky handles (e.g. 'bsky.app' or 'jay.bsky.team'). Required for 'user_posts' and 'profile' modes. In 'user_posts' mode each handle's feed is scraped; in 'profile' mode each handle's profile is fetched.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of items (posts or profiles) to scrape across all inputs. Pagination stops once this is reached.",
                        "default": 1000
                    },
                    "identifier": {
                        "title": "Bluesky handle (login)",
                        "type": "string",
                        "description": "Your Bluesky handle or email, e.g. 'you.bsky.social'. Required only for 'search_posts' mode (Bluesky requires login to search posts). The other modes work without login."
                    },
                    "appPassword": {
                        "title": "Bluesky app password",
                        "type": "string",
                        "description": "An app password created at Bluesky → Settings → App Passwords (NOT your main password). Required only for 'search_posts' mode."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
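
For projects without a client library, the `run-sync-get-dataset-items` path from the specification above can be called with the Python standard library alone. This is a minimal sketch: the helper names and the timeout value are assumptions, and the token is sent as the `token` query parameter as documented in the spec, so avoid logging the resulting URL.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
RUN_SYNC_PATH = "/acts/opendata-labs~bluesky-scraper/run-sync-get-dataset-items"


def build_run_request(actor_input: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request described in the spec."""
    url = f"{API_BASE}{RUN_SYNC_PATH}?{urlencode({'token': token})}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(actor_input: dict, token: str, timeout: float = 300.0):
    """Send the request and decode the JSON response body."""
    request = build_run_request(actor_input, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `run_sync({"mode": "profile", "handles": ["bsky.app"]}, "<YOUR_API_TOKEN>")` blocks until the run finishes, so it suits short runs; for long ones, the `/runs` path starts a run without waiting.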
