# Bluesky Scraper (`goat255/bluesky-scraper`) Actor

Scrape Bluesky posts, profiles, threads, and search results without a login. Pull a user's posts by handle, a post plus its reply thread, or keyword search results. The Actor follows pagination automatically up to the limit you choose.

- **URL**: https://apify.com/goat255/bluesky-scraper.md
- **Developed by:** [Goutam Soni](https://apify.com/goat255) (community)
- **Categories:** Social media, News
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is billed per platform usage. The Actor itself is free to use; you pay only for the Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools that run on the Apify platform and cover all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Bluesky Scraper

Scrape Bluesky posts, profiles, threads, and keyword search results with no login and no API key. Give it handles, post links, or search terms and it returns clean structured rows with full engagement metrics, author details, media, and timestamps.

The Bluesky Scraper extracts public data from Bluesky at scale. Pull a user's recent posts by handle, fetch a single post with its full reply thread, or run keyword searches across the network. Pagination is walked automatically up to the limit you set, so you can collect hundreds or thousands of posts per source in one run.

### What it does

- **User posts by handle.** Collect a user's recent posts, with optional replies and a profile header row (display name, bio, follower and post counts).
- **Post threads from a URL.** Fetch any post plus its nested reply tree, flattened into rows.
- **Keyword search.** Search public posts by term, sorted by latest or top.
- **Full engagement metrics.** Likes, reposts, replies, and quotes on every post.
- **Media and links.** Photo URLs, video stream and poster image, link-card previews, and richtext links.
- **Rich post detail.** Author identity, post text, language, hashtags, mentions, reply flags, repost flags, and timestamps.
- **Automatic pagination and deduplication.** Walks multiple pages per source and drops duplicate rows so every result is unique.
- **No login, no password, no API key.** Runs against public read-only endpoints.
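
The pagination and deduplication behavior described above can be pictured with a small sketch. This is not the Actor's actual code, which is not published here: `fetch_page`, the `(rows, next_cursor)` page shape, and the in-memory `FAKE_PAGES` are hypothetical stand-ins, and deduplicating on `uri` is an assumption.

```python
from typing import Callable, Optional

# Hypothetical page shape: (rows, next_cursor). A next_cursor of None means "no more pages".
Page = tuple[list[dict], Optional[str]]


def collect(fetch_page: Callable[[Optional[str]], Page], max_items: int) -> list[dict]:
    """Walk cursor-based pages until max_items unique rows are collected
    or the source runs out of pages."""
    seen: set[str] = set()
    rows: list[dict] = []
    cursor: Optional[str] = None
    while len(rows) < max_items:
        page_rows, cursor = fetch_page(cursor)
        for row in page_rows:
            if row["uri"] in seen:
                continue  # drop duplicate rows (assumption: uri identifies a post)
            seen.add(row["uri"])
            rows.append(row)
            if len(rows) == max_items:
                break  # cap reached, ignore the rest of this page
        if cursor is None:
            break  # source exhausted
    return rows


# Stand-in for a real endpoint: three fake pages, one row repeated across pages.
FAKE_PAGES: dict[Optional[str], Page] = {
    None: ([{"uri": "at://a/1"}, {"uri": "at://a/2"}], "p2"),
    "p2": ([{"uri": "at://a/2"}, {"uri": "at://a/3"}], "p3"),
    "p3": ([{"uri": "at://a/4"}], None),
}


def fake_fetch(cursor: Optional[str]) -> Page:
    return FAKE_PAGES[cursor]


print([row["uri"] for row in collect(fake_fetch, max_items=3)])
```

With `max_items=3` the walk stops on the second page; with a larger cap it continues until the cursor runs out and returns the four unique rows.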

### Use cases

- **Brand and topic monitoring.** Track keyword searches to see who is talking about your product, niche, or industry on Bluesky.
- **Lead generation and outreach.** Pull posters around a topic and capture their handle, display name, and bio for prospecting.
- **Market and competitor research.** Analyze a competitor's posting cadence, engagement, and audience by scraping their handle (for example, `example.bsky.social`).
- **Sentiment and trend analysis.** Collect large post samples for a query and feed text plus engagement counts into your own analysis.
- **Thread archiving.** Save a post and its full reply thread for research, support, or compliance records.

### Input

| Field | Type | Description |
|---|---|---|
| `handles` | array | Handles to pull posts from. With or without `@`, or a full profile link (e.g. `example.bsky.social`). |
| `postUrls` | array | Post links to fetch with their reply thread. |
| `searchQueries` | array | Keyword searches to run across public posts. |
| `maxItemsPerSource` | integer | Cap on items returned per handle, post, or query. Range 1 to 50000. Default 100. |
| `includeReplies` | boolean | Include a handle's replies, not just top-level posts. Default false. |
| `includeProfile` | boolean | Emit one profile row per handle before its posts. Default true. |
| `searchSort` | string | Search order: `latest` or `top`. Default `latest`. |
| `concurrency` | integer | Number of sources processed in parallel. Range 1 to 20. Default 5. |
| `proxyConfig` | object | Optional proxy. The public endpoints do not require one. |

#### Example input

```json
{
  "handles": ["example.bsky.social"],
  "searchQueries": ["open source"],
  "maxItemsPerSource": 200,
  "includeReplies": false,
  "includeProfile": true,
  "searchSort": "latest"
}
````

### Output

Each post is one row:

```json
{
  "type": "post",
  "uri": "at://did:plc:example/app.bsky.feed.post/abc123",
  "url": "https://bsky.app/profile/example.bsky.social/post/abc123",
  "authorHandle": "example.bsky.social",
  "authorDisplayName": "Example User",
  "authorDid": "did:plc:example",
  "likeCount": 42,
  "repostCount": 7,
  "replyCount": 3,
  "quoteCount": 1,
  "text": "An example post about open source.",
  "lang": "en",
  "tags": ["opensource"],
  "mentions": [],
  "links": ["https://example.com"],
  "thumbnail": "https://example.com/preview.jpg",
  "images": [],
  "video": null,
  "isReply": false,
  "replyToUri": null,
  "isRepost": false,
  "cid": "bafyexamplecid",
  "createdAt": "2026-06-01T12:00:00.000Z",
  "indexedAt": "2026-06-01T12:00:05.000Z"
}
```

Key fields:

- `url` is the public permalink to the post.
- `likeCount`, `repostCount`, `replyCount`, `quoteCount` are the engagement counts at scrape time.
- `thumbnail` is the best single preview image for the post (attached photo, video poster frame, or link-card preview), so the media column is filled even for link posts.
- `isReply` / `replyToUri` flag reply posts; `isRepost` flags posts that appeared in a feed via a repost.

When `includeProfile` is on, each handle also emits one profile row with `handle`, `url`, `displayName`, `did`, `followersCount`, `followsCount`, `postsCount`, `description`, `avatar`, `banner`, and `createdAt`.
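
Because post rows and profile rows share one dataset, downstream code usually has to separate them before aggregating. The sketch below totals the engagement fields and counts hashtags. It assumes that only post rows carry `"type": "post"` (the value a profile row uses for `type`, if any, is not documented here), and the sample rows are made-up illustrations rather than real output.

```python
from collections import Counter

ENGAGEMENT_FIELDS = ("likeCount", "repostCount", "replyCount", "quoteCount")


def summarize(rows: list[dict]) -> dict:
    """Separate post rows from other rows, total engagement, and count hashtags."""
    posts = [row for row in rows if row.get("type") == "post"]
    totals = {
        field: sum(post.get(field) or 0 for post in posts)
        for field in ENGAGEMENT_FIELDS
    }
    # Lowercase tags so "OpenSource" and "opensource" count as one hashtag.
    tags = Counter(tag.lower() for post in posts for tag in post.get("tags") or [])
    return {
        "postCount": len(posts),
        "otherRowCount": len(rows) - len(posts),
        "totals": totals,
        "topTags": tags.most_common(3),
    }


rows = [
    # Assumption: a profile row is any row that is not marked "type": "post".
    {"handle": "example.bsky.social", "followersCount": 10},
    {"type": "post", "likeCount": 42, "repostCount": 7, "replyCount": 3,
     "quoteCount": 1, "tags": ["opensource"]},
    {"type": "post", "likeCount": 8, "repostCount": 0, "replyCount": 1,
     "quoteCount": 0, "tags": ["OpenSource", "python"]},
]
print(summarize(rows))
```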

### FAQ

**Do I need a Bluesky login or API key?**
No. The scraper reads public data through public endpoints. No account, password, or key is required.

**Is it free?**
The Actor itself is free to use. As the Pricing section above states, it is billed per usage, so you pay only for the Apify platform usage your runs consume. There is no per-run start fee.

**How many results can I get?**
Set `maxItemsPerSource` to your target. The scraper walks pagination across multiple pages until it reaches your cap or the source runs out of posts. You can collect hundreds or thousands of posts per handle or query.

**How fast is it?**
Sources run in parallel (set with `concurrency`). A few hundred posts per source typically complete in seconds to a couple of minutes depending on how many sources and pages are involved.

**Can I scrape a private account?**
No. Only public posts, profiles, and threads are available.

**What input formats does it accept for handles and posts?**
Handles can be plain (`example.bsky.social`), prefixed (`@example.bsky.social`), or a full profile URL. Post inputs accept the standard `bsky.app` post link.
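
If you assemble inputs from mixed sources, it can help to normalize them before the run. The helpers below are a hypothetical client-side sketch, not part of the Actor: `normalize_handle` reduces the three accepted handle forms to a plain handle, and `parse_post_url` splits a standard `bsky.app` post link into its handle and post key. The Actor accepts the raw forms directly, so this step is optional.

```python
from urllib.parse import urlparse


def normalize_handle(value: str) -> str:
    """Reduce 'name', '@name', or a full profile URL to a plain lowercase handle."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        parts = [part for part in urlparse(value).path.split("/") if part]
        if len(parts) < 2 or parts[0] != "profile":
            raise ValueError(f"not a profile URL: {value!r}")
        value = parts[1]
    # Handles are domain names, so they are case-insensitive.
    return value.lstrip("@").lower()


def parse_post_url(url: str) -> tuple[str, str]:
    """Split https://bsky.app/profile/<handle>/post/<key> into (handle, key)."""
    parts = [part for part in urlparse(url).path.split("/") if part]
    if len(parts) != 4 or parts[0] != "profile" or parts[2] != "post":
        raise ValueError(f"not a post URL: {url!r}")
    return parts[1], parts[3]


for raw in ("example.bsky.social", "@example.bsky.social",
            "https://bsky.app/profile/example.bsky.social"):
    print(normalize_handle(raw))

print(parse_post_url("https://bsky.app/profile/example.bsky.social/post/abc123"))
```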

# Actor input Schema

## `handles` (type: `array`):

Bluesky handles to pull recent posts from. With or without the @ prefix, or a full profile link. Example: example.bsky.social, @example.bsky.social, https://bsky.app/profile/example.bsky.social.

## `postUrls` (type: `array`):

Specific post links to fetch with their reply thread. Example: https://bsky.app/profile/example.bsky.social/post/abc123.

## `searchQueries` (type: `array`):

Keyword searches to run across Bluesky posts. Example: open source, machine learning.

## `maxItemsPerSource` (type: `integer`):

Cap on items returned per handle, post, or query. Pagination is walked across multiple pages until this is reached or the source is exhausted.

## `includeReplies` (type: `boolean`):

When on, a handle's reply posts are included alongside its top-level posts.

## `includeProfile` (type: `boolean`):

When on, each handle also emits one profile record (display name, bio, follower counts) before its posts.

## `searchSort` (type: `string`):

Sort order for search results: `latest` or `top`.

## `concurrency` (type: `integer`):

How many sources to process in parallel.

## `proxyConfig` (type: `object`):

Optional proxy. The public read endpoints do not require a proxy, but you may route through one if you wish.

## Actor input object example

```json
{
  "handles": [
    "bsky.app"
  ],
  "postUrls": [],
  "searchQueries": [],
  "maxItemsPerSource": 100,
  "includeReplies": false,
  "includeProfile": true,
  "searchSort": "latest",
  "concurrency": 5,
  "proxyConfig": {
    "useApifyProxy": false
  }
}
```

# API

You can run this Actor programmatically through the Apify API. Below are code examples for JavaScript, Python, and the CLI, as well as the MCP server setup and the OpenAPI specification.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "handles": [
        "bsky.app"
    ],
    "postUrls": [],
    "searchQueries": []
};

// Run the Actor and wait for it to finish
const run = await client.actor("goat255/bluesky-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "handles": ["bsky.app"],
    "postUrls": [],
    "searchQueries": [],
}

# Run the Actor and wait for it to finish
run = client.actor("goat255/bluesky-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "handles": [
    "bsky.app"
  ],
  "postUrls": [],
  "searchQueries": []
}' |
apify call goat255/bluesky-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=goat255/bluesky-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Bluesky Scraper",
        "description": "Scrape Bluesky posts, profiles, threads, and search results without a login. Pull a user's posts by handle, a post plus its reply thread, or keyword search results. Walks pagination up to your chosen limit.",
        "version": "0.1",
        "x-build-id": "3VQgThJoMK2PBNBb2"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/goat255~bluesky-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-goat255-bluesky-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/goat255~bluesky-scraper/runs": {
            "post": {
                "operationId": "runs-sync-goat255-bluesky-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/goat255~bluesky-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-goat255-bluesky-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "handles": {
                        "title": "Handles (user mode)",
                        "type": "array",
                        "description": "Bluesky handles to pull recent posts from. With or without the @ prefix, or a full profile link. Example: example.bsky.social, @example.bsky.social, https://bsky.app/profile/example.bsky.social.",
                        "default": [
                            "bsky.app"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "postUrls": {
                        "title": "Post URLs (thread mode)",
                        "type": "array",
                        "description": "Specific post links to fetch with their reply thread. Example: https://bsky.app/profile/example.bsky.social/post/abc123.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "searchQueries": {
                        "title": "Search queries (search mode)",
                        "type": "array",
                        "description": "Keyword searches to run across Bluesky posts. Example: open source, machine learning.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItemsPerSource": {
                        "title": "Max items per source",
                        "minimum": 1,
                        "maximum": 50000,
                        "type": "integer",
                        "description": "Cap on items returned per handle, post, or query. Pagination is walked across multiple pages until this is reached or the source is exhausted.",
                        "default": 100
                    },
                    "includeReplies": {
                        "title": "Include replies in user feeds",
                        "type": "boolean",
                        "description": "When on, a handle's reply posts are included alongside its top-level posts.",
                        "default": false
                    },
                    "includeProfile": {
                        "title": "Include a profile header row",
                        "type": "boolean",
                        "description": "When on, each handle also emits one profile record (display name, bio, follower counts) before its posts.",
                        "default": true
                    },
                    "searchSort": {
                        "title": "Search sort",
                        "enum": [
                            "latest",
                            "top"
                        ],
                        "type": "string",
                        "description": "Sort order for search results. Latest or Top.",
                        "default": "latest"
                    },
                    "concurrency": {
                        "title": "Concurrency",
                        "minimum": 1,
                        "maximum": 20,
                        "type": "integer",
                        "description": "How many sources to process in parallel.",
                        "default": 5
                    },
                    "proxyConfig": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Optional proxy. The public read endpoints do not require a proxy, but you may route through one if you wish.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
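
For projects that call the REST API directly, the `run-sync-get-dataset-items` path in the specification above can be used with the Python standard library alone. This is a minimal sketch under stated assumptions: `build_request` and `run_sync` are hypothetical helper names, the token is sent as the `token` query parameter as the specification declares, and the response body is assumed to be a JSON array of dataset items.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
RUN_SYNC_PATH = "/acts/goat255~bluesky-scraper/run-sync-get-dataset-items"


def build_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{RUN_SYNC_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, actor_input: dict, timeout: float = 300.0) -> list[dict]:
    """Run the Actor, wait for it to finish, and return its dataset items.

    Assumption: the endpoint responds with a JSON array of rows.
    """
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `run_sync("<YOUR_API_TOKEN>", {"handles": ["example.bsky.social"]})` blocks until the run completes. Because the token is part of the URL, avoid writing the full request URL to logs.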
