# Bluesky Scraper (`legend006/bluesky-firehose-scraper`) Actor

Scrape Bluesky (AT Protocol) posts by keyword, hashtag, author handle, or custom feed. Export likes, reposts, replies, hashtags, mentions, embeds, and full metadata as JSON or CSV. Built for AI training datasets, social analytics, brand monitoring, and trend tracking.

- **URL**: https://apify.com/legend006/bluesky-firehose-scraper.md
- **Developed by:** [NIJ KANANI](https://apify.com/legend006) (community)
- **Categories:** AI, Social media
- **Stats:** 1 total users, 0 monthly users, 0.0% runs succeeded, NaN bookmarks
- **User rating**: No ratings yet

## Pricing

from $0.50 / 1,000 results

This Actor is paid per event. You are not charged for the Apify platform usage, but only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
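
As a rough illustration of the per-result pricing above, the sketch below estimates a lower bound on the cost of a run from the listed "from $0.50 / 1,000 results" rate. The function name is hypothetical, and the assumption that every result is billed at exactly that starting rate is a simplification; actual event prices may be higher.

```python
from decimal import Decimal

# Listed starting price: $0.50 per 1,000 results (assumed flat for this estimate).
PRICE_PER_1000_RESULTS = Decimal("0.50")


def estimate_min_cost_usd(result_count: int) -> Decimal:
    """Return a lower-bound cost estimate in USD for the given number of results."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    return (Decimal(result_count) / Decimal(1000)) * PRICE_PER_1000_RESULTS
```

For example, `estimate_min_cost_usd(5000)` gives the minimum charge for 5,000 results under these assumptions.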

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## 🦋 Bluesky Scraper

Scrape posts from **Bluesky** (the AT Protocol social network) — by keyword, hashtag, author handle, or any custom feed. No coding, no rate-limit headaches. Export to JSON, CSV, Excel, or push directly into your stack via API.

> ⚡ Built for AI dataset builders, social media analysts, brand-monitoring teams, and trend hunters who need clean, structured Bluesky data at scale.

---

### ✨ What you can do

- 🔎 **Search posts** by keyword or hashtag (`#ai`, `bitcoin`, `climate change`)
- 👤 **Pull any user's full feed** by handle (`jay.bsky.team`)
- 📡 **Subscribe to custom feeds** by AT URI
- 📅 Filter by **date range** and **language**
- 💬 Optionally include **replies**
- 📤 Get rich post metadata: text, likes, reposts, replies, embeds, hashtags, mentions, links

---

### 🎯 Use cases

| Who | Why |
|---|---|
| 🤖 AI / LLM teams | Build clean training datasets from a fast-growing social network |
| 📊 Social analytics | Track hashtags, brand mentions, sentiment over time |
| 🕵️ Competitor monitoring | Watch what competitors post, what gets engagement |
| 📰 Journalists & researchers | Archive public discourse around news events |
| 📈 Trend hunters | Find rising topics & influencers before they hit mainstream |

---

### 🚀 Quick start

1. Click **Try for free**
2. Choose a mode: `search`, `author`, or `feed`
3. Enter your search terms / handles / feed URIs
4. Add your Bluesky handle and an [App Password](https://bsky.app/settings/app-passwords). The input schema marks both as required, and they give higher rate limits and more reliable runs
5. Click **Start**

That's it. Your data appears in the **Dataset** tab in seconds.

---

### 📥 Input

| Field | Type | Description |
|---|---|---|
| `mode` | enum | `search` / `author` / `feed` |
| `searchTerms` | array | Keywords or hashtags (search mode) |
| `authors` | array | Bluesky handles (author mode) |
| `feedUris` | array | AT URIs (feed mode) |
| `maxItems` | int | Cap per target (default 1000) |
| `since` / `until` | ISO date | Date range filter |
| `language` | string | 2-letter code (e.g. `en`) |
| `includeReplies` | bool | Include replies in author mode |
| `bskyHandle` | string | Your Bluesky handle (marked as required in the input schema) |
| `bskyAppPassword` | secret | Your App Password (marked as required in the input schema) |

#### Example input

```json
{
    "mode": "search",
    "searchTerms": ["#ai", "llm"],
    "maxItems": 5000,
    "since": "2026-04-01",
    "language": "en",
    "bskyHandle": "yourname.bsky.social",
    "bskyAppPassword": "xxxx-xxxx-xxxx-xxxx"
}
````
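
The input table pairs each mode with one target field (`searchTerms`, `authors`, or `feedUris`). A minimal sketch of building the input programmatically under that pairing follows; `build_input` and `TARGET_FIELD_BY_MODE` are hypothetical helper names, not part of the Actor or the Apify client.

```python
import json

# Maps each mode to the input field that holds its targets (from the input table).
TARGET_FIELD_BY_MODE = {
    "search": "searchTerms",
    "author": "authors",
    "feed": "feedUris",
}


def build_input(mode, targets, **options):
    """Build an Actor input dict, placing targets under the field for the mode."""
    if mode not in TARGET_FIELD_BY_MODE:
        raise ValueError(f"unknown mode: {mode!r}")
    if not targets:
        raise ValueError("at least one target is required")
    actor_input = {"mode": mode, TARGET_FIELD_BY_MODE[mode]: list(targets)}
    # Pass through optional fields such as maxItems, since, until, or language.
    actor_input.update({k: v for k, v in options.items() if v is not None})
    return actor_input


if __name__ == "__main__":
    example = build_input("author", ["jay.bsky.team"], maxItems=200, includeReplies=True)
    print(json.dumps(example, indent=4))
```

Keeping the mode-to-field mapping in one place avoids sending, say, `authors` while `mode` is still `search`.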

***

### 📤 Output (per post)

```json
{
    "uri": "at://did:plc:.../app.bsky.feed.post/...",
    "cid": "bafyrei...",
    "author": {
        "did": "did:plc:...",
        "handle": "username.bsky.social",
        "displayName": "Display Name",
        "avatar": "https://..."
    },
    "text": "Full post text",
    "createdAt": "2026-04-15T12:34:56.000Z",
    "indexedAt": "2026-04-15T12:34:57.000Z",
    "langs": ["en"],
    "likeCount": 42,
    "repostCount": 7,
    "replyCount": 3,
    "quoteCount": 1,
    "embed": { /* images, video, quoted posts */ },
    "tags": ["ai"],
    "mentions": ["did:plc:..."],
    "links": ["https://..."],
    "isReply": false,
    "replyParent": null,
    "replyRoot": null,
    "bskyUrl": "https://bsky.app/profile/username.bsky.social/post/abc"
}
```
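
Because each record nests `author` and carries list fields, it usually needs flattening before it fits a spreadsheet-style row. The sketch below shows one possible way to do that with the standard library; the column selection and the `;` separator for lists are assumptions made for illustration, not the Actor's own CSV export format.

```python
import csv
import io


def flatten_post(post):
    """Flatten one post record into a single-level dict suitable for a CSV row."""
    author = post.get("author") or {}
    return {
        "uri": post.get("uri", ""),
        "handle": author.get("handle", ""),
        "createdAt": post.get("createdAt", ""),
        "text": post.get("text", ""),
        "likeCount": post.get("likeCount", 0),
        "repostCount": post.get("repostCount", 0),
        "replyCount": post.get("replyCount", 0),
        "quoteCount": post.get("quoteCount", 0),
        # Lists are joined so that each post stays on one CSV row.
        "tags": ";".join(post.get("tags") or []),
        "links": ";".join(post.get("links") or []),
        "isReply": bool(post.get("isReply", False)),
    }


def posts_to_csv(posts):
    """Serialize an iterable of post records to CSV text (empty string if no posts)."""
    rows = [flatten_post(p) for p in posts]
    buffer = io.StringIO()
    if rows:
        writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return buffer.getvalue()
```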

***

### 💡 Why authentication?

Bluesky's public API rate-limits anonymous requests aggressively from datacenter IPs. Adding your own free [Bluesky App Password](https://bsky.app/settings/app-passwords) (NOT your main password — App Passwords are revocable single-purpose tokens) lifts limits and gives reliable, full-speed scraping. Your credentials are never stored — they're passed only to Bluesky's official servers per run.
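
To keep the App Password out of source code, one option is to read both credentials from environment variables before building the Actor input. The sketch below assumes hypothetical variable names `BSKY_HANDLE` and `BSKY_APP_PASSWORD`, and assumes App Passwords are four groups of four lowercase letters or digits, in line with the `xxxx-xxxx-xxxx-xxxx` format shown in this README.

```python
import os
import re

# Assumed shape of an App Password, based on the 'xxxx-xxxx-xxxx-xxxx' format.
APP_PASSWORD_PATTERN = re.compile(r"^[a-z0-9]{4}(-[a-z0-9]{4}){3}$")


def credentials_from_env(env=None):
    """Read Bluesky credentials from environment variables (hypothetical names)."""
    env = os.environ if env is None else env
    handle = env.get("BSKY_HANDLE", "").strip().lstrip("@")
    app_password = env.get("BSKY_APP_PASSWORD", "").strip()
    if not handle:
        raise ValueError("BSKY_HANDLE is not set")
    if not APP_PASSWORD_PATTERN.match(app_password):
        raise ValueError("BSKY_APP_PASSWORD does not look like an App Password")
    # Keys match the Actor input fields documented above.
    return {"bskyHandle": handle, "bskyAppPassword": app_password}
```

The returned dict can be merged into the Actor input. The format check is only a guard against pasting a main account password by mistake.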

***

### ❓ FAQ

**Is this legal?** Bluesky's public API is open by design — the AT Protocol is built around portable, public data. This Actor uses official endpoints only.

**Will my account get banned?** No. App Passwords are intended for read access. Respect normal rate limits and you'll be fine.

**Can I run this on a schedule?** Yes — use Apify's [Schedule](https://docs.apify.com/platform/schedules) feature to run hourly/daily.

**How fast is it?** Authenticated runs typically pull ~3,000 posts/minute.

***

Got questions or feature requests? Open an issue or message us.

# Actor input Schema

## `mode` (type: `string`):

What to scrape. 'search' uses keywords/hashtags. 'author' pulls a specific user's feed. 'feed' pulls a public custom feed by URI.

## `searchTerms` (type: `array`):

Keywords or hashtags to search (e.g. 'bitcoin', '#ai', 'climate change'). Used when mode = search.

## `authors` (type: `array`):

Bluesky handles, e.g. 'bsky.app' or 'jay.bsky.team'. Used when mode = author.

## `feedUris` (type: `array`):

AT Protocol feed URIs (at://...). Used when mode = feed.
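
An AT URI has the general shape `at://<authority>/<collection>/<rkey>`. The sketch below splits a URI into those parts and checks for the `app.bsky.feed.generator` collection that Bluesky custom feeds use. Whether this Actor accepts other collections is not stated here, so treat the check as an assumption; `parse_at_uri` and `is_feed_generator_uri` are hypothetical helpers.

```python
from typing import NamedTuple


class AtUri(NamedTuple):
    authority: str   # DID or handle that owns the record
    collection: str  # record collection (NSID)
    rkey: str        # record key


def parse_at_uri(uri: str) -> AtUri:
    """Split an AT URI of the form at://<authority>/<collection>/<rkey>."""
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError("AT URIs must start with 'at://'")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected at://<authority>/<collection>/<rkey>")
    return AtUri(*parts)


def is_feed_generator_uri(uri: str) -> bool:
    """Return True if the URI points at a feed generator record."""
    try:
        return parse_at_uri(uri).collection == "app.bsky.feed.generator"
    except ValueError:
        return False
```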

## `maxItems` (type: `integer`):

Maximum number of posts to return per input target.
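
Since the cap applies per target, the worst-case size of a run is `maxItems` multiplied by the number of targets for the active mode. A small sketch of that arithmetic follows. The helper name is hypothetical, and it only counts targets that are present in the dict it is given, so platform-side defaults for the target lists are not applied.

```python
def max_total_posts(actor_input: dict) -> int:
    """Upper bound on returned posts: maxItems times the number of active targets."""
    field_by_mode = {"search": "searchTerms", "author": "authors", "feed": "feedUris"}
    mode = actor_input.get("mode", "search")  # schema default mode is 'search'
    targets = actor_input.get(field_by_mode[mode]) or []
    # The schema default for maxItems is 1000.
    return actor_input.get("maxItems", 1000) * len(targets)
```

For instance, two search terms with `maxItems` set to 5000 can return up to 10,000 posts in total.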

## `since` (type: `string`):

Only return posts created at or after this UTC date. Format: YYYY-MM-DD or full ISO 8601. Leave empty for no lower bound.

## `until` (type: `string`):

Only return posts created before this UTC date. Leave empty for no upper bound.
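
Read together, the two descriptions suggest a half-open interval: `since` is inclusive ("at or after") and `until` is exclusive ("before"). The sketch below shows that check applied to a post's `createdAt` value. It assumes that a date-only bound means midnight UTC; this illustrates the described semantics and is not the Actor's actual filtering code.

```python
from datetime import datetime, timezone


def parse_utc(value: str) -> datetime:
    """Parse 'YYYY-MM-DD' or a full ISO 8601 timestamp into an aware UTC datetime."""
    # Older Python versions do not accept a trailing 'Z', so normalize it first.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def in_date_range(created_at: str, since: str = "", until: str = "") -> bool:
    """Return True if since <= created_at < until; empty bounds are ignored."""
    created = parse_utc(created_at)
    if since and created < parse_utc(since):
        return False
    if until and created >= parse_utc(until):
        return False
    return True
```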

## `language` (type: `string`):

Two-letter language code (e.g. 'en', 'ja'). Search mode only.

## `includeReplies` (type: `boolean`):

Author mode only. Include reply posts in addition to top-level posts.

## `bskyHandle` (type: `string`):

Your Bluesky handle, e.g. 'yourname.bsky.social'. Required because Bluesky's CDN blocks unauthenticated requests from datacenter IPs. Free Bluesky account: sign up at https://bsky.app (60 seconds).

## `bskyAppPassword` (type: `string`):

An App Password (NOT your main password). Generate at https://bsky.app/settings/app-passwords (10 seconds). Format: 'xxxx-xxxx-xxxx-xxxx'. Revocable any time. We never store it.

## `proxyConfiguration` (type: `object`):

Optional. Routes outbound requests through Apify Proxy. Useful for residential IPs on paid plans. Free-tier datacenter proxies are blocked by Bluesky's CDN.
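
As an illustration of the residential option mentioned above, the sketch below returns a copy of an input dict with a residential proxy configuration. The `apifyProxyGroups` key and the `RESIDENTIAL` group name follow the usual Apify Proxy input format; whether this Actor honors them, and whether your plan includes residential proxies, are assumptions to verify. The helper name is hypothetical.

```python
def with_residential_proxy(actor_input: dict) -> dict:
    """Return a copy of the input that routes requests through residential proxies."""
    updated = dict(actor_input)
    updated["proxyConfiguration"] = {
        "useApifyProxy": True,
        # Proxy group name as used by Apify Proxy; availability depends on your plan.
        "apifyProxyGroups": ["RESIDENTIAL"],
    }
    return updated
```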

## Actor input object example

```json
{
  "mode": "search",
  "searchTerms": [
    "#ai"
  ],
  "authors": [],
  "feedUris": [],
  "maxItems": 1000,
  "includeReplies": false,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```
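
The OpenAPI specification in this document lists `mode`, `bskyHandle`, and `bskyAppPassword` as required and bounds `maxItems` to 1 through 100000. A minimal client-side pre-check based on those constraints might look like the sketch below. `validate_input` is a hypothetical helper, and the platform performs its own validation regardless.

```python
REQUIRED_FIELDS = ("mode", "bskyHandle", "bskyAppPassword")
ALLOWED_MODES = ("search", "author", "feed")


def validate_input(actor_input: dict) -> list:
    """Return a list of problems; an empty list means the input passed these checks."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not actor_input.get(field):
            problems.append(f"missing required field: {field}")
    mode = actor_input.get("mode")
    if mode and mode not in ALLOWED_MODES:
        problems.append(f"mode must be one of {ALLOWED_MODES}")
    max_items = actor_input.get("maxItems", 1000)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int) or not 1 <= max_items <= 100000:
        problems.append("maxItems must be an integer between 1 and 100000")
    return problems
```

Run against the example object above, this check reports the two missing credential fields.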

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
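const input = {
    "mode": "search",
    "searchTerms": ["#ai"],
    // The input schema marks both credentials as required
    "bskyHandle": "<YOUR_BLUESKY_HANDLE>",
    "bskyAppPassword": "<YOUR_APP_PASSWORD>",
    "proxyConfiguration": {
        "useApifyProxy": false
    }
};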

// Run the Actor and wait for it to finish
const run = await client.actor("legend006/bluesky-firehose-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
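# The input schema marks bskyHandle and bskyAppPassword as required
run_input = {
    "mode": "search",
    "searchTerms": ["#ai"],
    "bskyHandle": "<YOUR_BLUESKY_HANDLE>",
    "bskyAppPassword": "<YOUR_APP_PASSWORD>",
    "proxyConfiguration": {"useApifyProxy": False},
}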

# Run the Actor and wait for it to finish
run = client.actor("legend006/bluesky-firehose-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
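echo '{
  "mode": "search",
  "searchTerms": ["#ai"],
  "bskyHandle": "<YOUR_BLUESKY_HANDLE>",
  "bskyAppPassword": "<YOUR_APP_PASSWORD>",
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}' |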
apify call legend006/bluesky-firehose-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=legend006/bluesky-firehose-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Bluesky Scraper",
        "description": "Scrape Bluesky (AT Protocol) posts by keyword, hashtag, author handle, or custom feed. Export likes, reposts, replies, hashtags, mentions, embeds, and full metadata as JSON or CSV. Built for AI training datasets, social analytics, brand monitoring, and trend tracking.",
        "version": "0.1",
        "x-build-id": "ZcJ48uoGAVQfDVLps"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/legend006~bluesky-firehose-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-legend006-bluesky-firehose-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/legend006~bluesky-firehose-scraper/runs": {
            "post": {
                "operationId": "runs-sync-legend006-bluesky-firehose-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/legend006~bluesky-firehose-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-legend006-bluesky-firehose-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "mode",
                    "bskyHandle",
                    "bskyAppPassword"
                ],
                "properties": {
                    "mode": {
                        "title": "Scrape mode",
                        "enum": [
                            "search",
                            "author",
                            "feed"
                        ],
                        "type": "string",
                        "description": "What to scrape. 'search' uses keywords/hashtags. 'author' pulls a specific user's feed. 'feed' pulls a public custom feed by URI.",
                        "default": "search"
                    },
                    "searchTerms": {
                        "title": "Search terms",
                        "type": "array",
                        "description": "Keywords or hashtags to search (e.g. 'bitcoin', '#ai', 'climate change'). Used when mode = search.",
                        "default": [
                            "#ai"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "authors": {
                        "title": "Author handles",
                        "type": "array",
                        "description": "Bluesky handles, e.g. 'bsky.app' or 'jay.bsky.team'. Used when mode = author.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "feedUris": {
                        "title": "Feed URIs",
                        "type": "array",
                        "description": "AT Protocol feed URIs (at://...). Used when mode = feed.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "maximum": 100000,
                        "type": "integer",
                        "description": "Maximum number of posts to return per input target.",
                        "default": 1000
                    },
                    "since": {
                        "title": "Since (ISO date)",
                        "type": "string",
                        "description": "Only return posts created at or after this UTC date. Format: YYYY-MM-DD or full ISO 8601. Leave empty for no lower bound."
                    },
                    "until": {
                        "title": "Until (ISO date)",
                        "type": "string",
                        "description": "Only return posts created before this UTC date. Leave empty for no upper bound."
                    },
                    "language": {
                        "title": "Language filter",
                        "type": "string",
                        "description": "Two-letter language code (e.g. 'en', 'ja'). Search mode only."
                    },
                    "includeReplies": {
                        "title": "Include replies",
                        "type": "boolean",
                        "description": "Author mode only. Include reply posts in addition to top-level posts.",
                        "default": false
                    },
                    "bskyHandle": {
                        "title": "Bluesky handle ⚠️ Required",
                        "type": "string",
                        "description": "Your Bluesky handle, e.g. 'yourname.bsky.social'. Required because Bluesky's CDN blocks unauthenticated requests from datacenter IPs. Free Bluesky account: sign up at https://bsky.app (60 seconds)."
                    },
                    "bskyAppPassword": {
                        "title": "Bluesky App Password ⚠️ Required",
                        "type": "string",
                        "description": "An App Password (NOT your main password). Generate at https://bsky.app/settings/app-passwords (10 seconds). Format: 'xxxx-xxxx-xxxx-xxxx'. Revocable any time. We never store it."
                    },
                    "proxyConfiguration": {
                        "title": "Apify Proxy configuration (optional)",
                        "type": "object",
                        "description": "Optional. Routes outbound requests through Apify Proxy. Useful for residential IPs on paid plans. Free-tier datacenter proxies are blocked by Bluesky's CDN.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
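
For projects that call the REST API directly, the `run-sync-get-dataset-items` path in the specification above can be exercised with the standard library alone. The sketch below only builds the request, using the server URL, path, and `token` query parameter from the specification; `build_run_sync_request` is a hypothetical helper name.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "legend006~bluesky-firehose-scraper"


def build_run_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for run-sync-get-dataset-items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` and decoding the response body as JSON should yield the dataset items. Long runs may exceed the synchronous endpoint's time limit, in which case the asynchronous `/runs` path is the safer choice.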
