# Bluesky Scraper (`glassventures/bluesky-scraper`) Actor

Scrape posts from Bluesky. Extract text, author, likes, reposts, replies, images, quoted posts. Search posts or scrape profiles. No login needed.

- **URL**: https://apify.com/glassventures/bluesky-scraper.md
- **Developed by:** [Glass Ventures](https://apify.com/glassventures) (community)
- **Categories:** Social media, Marketing
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Bluesky Scraper

Scrape posts and profiles from Bluesky (bsky.app) using the public AT Protocol API. Extract text, author info, likes, reposts, replies, images, and quoted posts.

### What does Bluesky Scraper do?

Bluesky Scraper extracts public post data from the Bluesky social network. It uses the public AT Protocol API (no authentication required) to scrape user profiles and search for posts by keyword.

Whether you need to monitor brand mentions, analyze social trends, or collect public discourse data, this Actor provides structured output with engagement metrics, media attachments, and quoted post content. It handles pagination automatically to collect large datasets efficiently.

Bluesky is a decentralized social network built on the AT Protocol with a rapidly growing user base. This scraper provides an easy way to access public data without needing to set up API credentials.

### Use Cases

- **Market researchers** -- Monitor brand mentions and sentiment on Bluesky
- **Data analysts** -- Analyze engagement patterns, posting frequency, and content trends
- **Journalists** -- Track public discourse and find sources on trending topics
- **Developers** -- Collect training data or build dashboards from Bluesky content

### Features

- Scrape posts from any public Bluesky profile
- Search posts by keywords or phrases
- Extract full engagement metrics (likes, reposts, replies)
- Capture images and quoted posts
- Automatic pagination for large datasets
- No authentication or API keys required
- Proxy support with automatic rotation
- Exports to JSON, CSV, Excel, or connect via API

### How much will it cost?

| Results | Estimated Cost |
|---------|---------------|
| 100     | ~$0.01        |
| 1,000   | ~$0.05        |
| 10,000  | ~$0.50        |

| Cost Component | Per 1,000 Results |
|----------------|-------------------|
| Platform compute | ~$0.05 |
| Proxy (datacenter) | ~$0.00 |
| **Total** | **~$0.05** |

Bluesky's public API is very fast and lightweight, making this one of the most cost-effective social media scrapers available.
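
As a rough cross-check of the tables above, the estimate scales at about $0.05 per 1,000 results. The sketch below assumes that rate stays linear at every volume, which the tables only suggest; `estimate_cost_usd` and `RATE_PER_1000_USD` are illustrative names, not part of the Actor.

```python
# Approximate total cost per 1,000 results, taken from the table above.
RATE_PER_1000_USD = 0.05


def estimate_cost_usd(results: int) -> float:
    """Linear estimate of platform cost for a given number of results."""
    if results < 0:
        raise ValueError("results must be non-negative")
    return results / 1000 * RATE_PER_1000_USD


for n in (100, 1_000, 10_000):
    print(f"{n:>6} results: ~${estimate_cost_usd(n):.3f}")
```

At 100 results the linear estimate comes to about half a cent, which the first table rounds up to ~$0.01. Actual charges depend on your plan and run settings.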

### How to use

1. Go to the Bluesky Scraper page on Apify Store
2. Click "Start" or "Try for free"
3. Enter Bluesky profile URLs, handles, or search terms
4. Set the maximum number of posts to scrape
5. Click "Start" and wait for the results

### Input parameters

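| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| startUrls | array | Bluesky profile URLs to scrape | - |
| handles | array | Bluesky handles (e.g. bsky.app) | - |
| searchTerms | array | Search queries to find posts | - |
| maxItems | integer | Max posts to return (0 for unlimited) | 100 |
| maxConcurrency | integer | Max requests processed in parallel (1-100) | 10 |
| debugMode | boolean | Enables verbose logging for debugging | false |
| extendOutputFunction | string | JavaScript function to customize each output item; receives `{ data }` | - |
| proxyConfig | object | Proxy settings | Apify Proxy |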
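
One way to assemble an input object in Python before starting a run is sketched below. The bounds it checks (`maxItems` of at least 0, `maxConcurrency` between 1 and 100) come from the Actor input schema. The helper `build_input` is hypothetical, and the rule that at least one source must be given is an assumption of this sketch, not documented Actor behavior.

```python
from typing import Any, Optional


def build_input(
    handles: Optional[list[str]] = None,
    search_terms: Optional[list[str]] = None,
    start_urls: Optional[list[str]] = None,
    max_items: int = 100,
    max_concurrency: int = 10,
) -> dict[str, Any]:
    """Assemble an Actor input object and check basic constraints."""
    if max_items < 0:
        raise ValueError("maxItems must be 0 (unlimited) or a positive integer")
    if not 1 <= max_concurrency <= 100:
        raise ValueError("maxConcurrency must be between 1 and 100")
    # Assumption: a run needs at least one source of posts to be useful.
    if not (handles or search_terms or start_urls):
        raise ValueError("provide at least one handle, search term, or start URL")

    run_input: dict[str, Any] = {
        "maxItems": max_items,
        "maxConcurrency": max_concurrency,
    }
    if handles:
        run_input["handles"] = list(handles)
    if search_terms:
        run_input["searchTerms"] = list(search_terms)
    if start_urls:
        # startUrls entries are objects with a "url" key.
        run_input["startUrls"] = [{"url": u} for u in start_urls]
    return run_input


print(build_input(handles=["bsky.app"], max_items=20))
```

The returned dictionary can be passed as `run_input` to the Python client's `call` method.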

### Output

The Actor produces a dataset in which each item has the following fields:

```json
{
    "url": "https://bsky.app/profile/bsky.app/post/3abc123",
    "text": "Welcome to Bluesky!",
    "author": "Bluesky",
    "handle": "bsky.app",
    "likesCount": 1500,
    "repostsCount": 300,
    "repliesCount": 85,
    "createdAt": "2024-06-15T10:30:00.000Z",
    "images": ["https://cdn.bsky.app/img/feed_fullsize/..."],
    "quotedPost": {
        "text": "Original post text",
        "author": "Original Author",
        "handle": "author.bsky.social",
        "url": "https://bsky.app/profile/author.bsky.social/post/xyz789"
    },
    "authorAvatar": "https://cdn.bsky.app/img/avatar/...",
    "postId": "3abc123",
    "scrapedAt": "2026-04-23T12:00:00.000Z"
}
````

| Field | Type | Description |
|-------|------|-------------|
| url | string | Post URL on Bluesky |
| text | string | Post text content |
| author | string | Author display name |
| handle | string | Author Bluesky handle |
| likesCount | number | Number of likes |
| repostsCount | number | Number of reposts |
| repliesCount | number | Number of replies |
| createdAt | string | Post creation timestamp (ISO 8601) |
| images | array | Image URLs attached to the post |
| quotedPost | object | Quoted/embedded post data |
| authorAvatar | string | Author avatar image URL |
| postId | string | AT Protocol post record key |
| scrapedAt | string | ISO 8601 scrape timestamp |
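
To illustrate working with these fields, the sketch below ranks dataset items by combined engagement. The helper names (`total_engagement`, `top_posts`) and the sample values are made up for illustration, and treating a missing count as 0 is an assumption.

```python
from typing import Any


def total_engagement(item: dict[str, Any]) -> int:
    """Sum likes, reposts, and replies, treating missing counts as 0."""
    return sum(
        int(item.get(key) or 0)
        for key in ("likesCount", "repostsCount", "repliesCount")
    )


def top_posts(items: list[dict[str, Any]], limit: int = 3) -> list[dict[str, Any]]:
    """Return the `limit` items with the highest combined engagement."""
    return sorted(items, key=total_engagement, reverse=True)[:limit]


# Sample records shaped like the output above (values are made up).
sample = [
    {"postId": "3def456", "likesCount": 10, "repostsCount": 2},
    {"postId": "3abc123", "likesCount": 1500, "repostsCount": 300, "repliesCount": 85},
]
for post in top_posts(sample):
    print(post["postId"], total_engagement(post))
```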

### Integrations

Connect Bluesky Scraper with other tools:

- **Apify API** -- REST API for programmatic access
- **Webhooks** -- get notified when a run finishes
- **Zapier / Make** -- connect to 5,000+ apps
- **Google Sheets** -- export directly to spreadsheets

#### API Example (Node.js)

```javascript
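import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token.
const client = new ApifyClient({ token: 'YOUR_TOKEN' });

// Run the Actor and wait for it to finish.
const run = await client.actor('glassventures/bluesky-scraper').call({
    handles: ['bsky.app'],
    maxItems: 100,
});

// Read the scraped posts from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();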
```

#### API Example (Python)

```python
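from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient('YOUR_TOKEN')

# Run the Actor and wait for it to finish.
run = client.actor('glassventures/bluesky-scraper').call(run_input={
    'handles': ['bsky.app'],
    'maxItems': 100,
})

# Read the scraped posts from the run's default dataset.
items = client.dataset(run['defaultDatasetId']).list_items().items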
```

#### API Example (cURL)

```bash
# Start a run; the response describes the run, not the scraped items.
curl "https://api.apify.com/v2/acts/glassventures~bluesky-scraper/runs" \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{"handles": ["bsky.app"], "maxItems": 100}'
```

### Tips and tricks

- Start with a small `maxItems` (10-20) to test before running large scrapes
- Use handles directly instead of URLs for convenience
- Combine profile scraping with search terms to get comprehensive data
- The public API has generous rate limits, but very large scrapes may benefit from proxy rotation
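
If you already have a list of profile URLs and want to pass handles instead, a small conversion step is enough. The sketch below assumes URLs of the form `https://bsky.app/profile/<handle>`, as in the examples in this README; `handle_from_profile_url` is a hypothetical helper, not part of the Actor.

```python
from urllib.parse import urlparse


def handle_from_profile_url(url: str) -> str:
    """Extract the handle from a URL like https://bsky.app/profile/<handle>."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    if parsed.netloc != "bsky.app" or len(parts) < 2 or parts[0] != "profile":
        raise ValueError(f"not a Bluesky profile URL: {url}")
    # The segment right after "profile" identifies the account.
    return parts[1]


print(handle_from_profile_url("https://bsky.app/profile/bsky.app"))
```

The same function also works on post URLs, since they start with the profile path.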

### FAQ

**Q: Does this Actor require login credentials?**
A: No. Bluesky Scraper uses the public AT Protocol API, which does not require authentication for public data.

**Q: How fast is the scraping?**
A: Approximately 500-1000 posts per minute depending on pagination and network conditions.

**Q: What should I do if I get blocked?**
A: Enable proxy rotation in the Proxy Configuration settings. Datacenter proxies are usually sufficient for Bluesky.

**Q: Can I scrape private/protected accounts?**
A: No. This Actor only accesses publicly available data through the AT Protocol public API.
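
For planning purposes, the throughput figure quoted above (roughly 500-1000 posts per minute) translates into a run-time range. This is a back-of-the-envelope sketch only; `estimate_minutes` is an illustrative name, and real runs vary with pagination and network conditions.

```python
def estimate_minutes(
    posts: int, low_rate: int = 500, high_rate: int = 1000
) -> tuple[float, float]:
    """Return (fastest, slowest) run time in minutes for a given post count."""
    if posts < 0:
        raise ValueError("posts must be non-negative")
    return posts / high_rate, posts / low_rate


fastest, slowest = estimate_minutes(10_000)
print(f"10,000 posts: roughly {fastest:.0f} to {slowest:.0f} minutes")
```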

### Is it legal to scrape Bluesky?

Web scraping of publicly available data is generally legal based on precedents like the hiQ Labs v. LinkedIn case. Bluesky is built on the open AT Protocol, which is designed for data interoperability. This Actor only accesses publicly available data through the official public API. Always review and respect the target site's Terms of Service. For more information, see [Apify's blog on web scraping legality](https://blog.apify.com/is-web-scraping-legal/).

### Limitations

- Only public posts are accessible (no private/protected content)
- The public API may have rate limits for very high-volume requests
- Historical search results may be limited by Bluesky's search index depth
- Deleted posts cannot be retrieved

### Changelog

- **v0.1** (2026-04-23) -- Initial release

# Actor input Schema

## `startUrls` (type: `array`):

Bluesky profile URLs to scrape posts from. Example: https://bsky.app/profile/bsky.app

## `handles` (type: `array`):

Bluesky handles to scrape posts from (e.g. bsky.app, jay.bsky.team).

## `searchTerms` (type: `array`):

Search queries to find posts on Bluesky. The actor will search and scrape matching posts.

## `maxItems` (type: `integer`):

Maximum number of posts to scrape. Use 0 or leave empty for unlimited.

## `maxConcurrency` (type: `integer`):

Maximum number of requests processed in parallel.

## `debugMode` (type: `boolean`):

Enables verbose logging for debugging.

## `extendOutputFunction` (type: `string`):

A JavaScript function to customize each output item. Receives { data }.

## `proxyConfig` (type: `object`):

Select proxies to be used.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://bsky.app/profile/bsky.app"
    }
  ],
  "handles": [
    "bsky.app"
  ],
  "searchTerms": [
    "bluesky"
  ],
  "maxItems": 100,
  "maxConcurrency": 10,
  "debugMode": false,
  "extendOutputFunction": "async ({ data }) => {\n    return data;\n}",
  "proxyConfig": {
    "useApifyProxy": true
  }
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://bsky.app/profile/bsky.app"
        }
    ],
    "handles": [
        "bsky.app"
    ],
    "searchTerms": [
        "bluesky"
    ],
    // The input schema types this field as a string, so pass the function source as text.
    "extendOutputFunction": "async ({ data }) => {\n    return data;\n}",
    "proxyConfig": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("glassventures/bluesky-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://bsky.app/profile/bsky.app" }],
    "handles": ["bsky.app"],
    "searchTerms": ["bluesky"],
    "extendOutputFunction": """async ({ data }) => {
    return data;
}""",
    "proxyConfig": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("glassventures/bluesky-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://bsky.app/profile/bsky.app"
    }
  ],
  "handles": [
    "bsky.app"
  ],
  "searchTerms": [
    "bluesky"
  ],
  "extendOutputFunction": "async ({ data }) => {\\n    return data;\\n}",
  "proxyConfig": {
    "useApifyProxy": true
  }
}' |
apify call glassventures/bluesky-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=glassventures/bluesky-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Bluesky Scraper",
        "description": "Scrape posts from Bluesky. Extract text, author, likes, reposts, replies, images, quoted posts. Search posts or scrape profiles. No login needed.",
        "version": "0.1",
        "x-build-id": "eS8ZgKlvWwIDdmWnD"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/glassventures~bluesky-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-glassventures-bluesky-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/glassventures~bluesky-scraper/runs": {
            "post": {
                "operationId": "runs-sync-glassventures-bluesky-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/glassventures~bluesky-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-glassventures-bluesky-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "Bluesky profile URLs to scrape posts from. Example: https://bsky.app/profile/bsky.app",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "handles": {
                        "title": "Handles",
                        "type": "array",
                        "description": "Bluesky handles to scrape posts from (e.g. bsky.app, jay.bsky.team).",
                        "items": {
                            "type": "string"
                        }
                    },
                    "searchTerms": {
                        "title": "Search Terms",
                        "type": "array",
                        "description": "Search queries to find posts on Bluesky. The actor will search and scrape matching posts.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of posts to scrape. Use 0 or leave empty for unlimited.",
                        "default": 100
                    },
                    "maxConcurrency": {
                        "title": "Max Concurrency",
                        "minimum": 1,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Maximum number of requests processed in parallel.",
                        "default": 10
                    },
                    "debugMode": {
                        "title": "Debug Mode",
                        "type": "boolean",
                        "description": "Enables verbose logging for debugging.",
                        "default": false
                    },
                    "extendOutputFunction": {
                        "title": "Extend Output Function",
                        "type": "string",
                        "description": "A JavaScript function to customize each output item. Receives { data }."
                    },
                    "proxyConfig": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Select proxies to be used."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
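
For projects that call the REST API without a client library, the `run-sync-get-dataset-items` path from the specification above can be turned into a request as sketched below. The code uses only the Python standard library and builds the request without sending it. `build_sync_request` is a hypothetical helper, and the token placeholder must be replaced with a real token before use.

```python
import json
import urllib.parse
import urllib.request

# Server URL and path as listed in the OpenAPI specification above.
API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/glassventures~bluesky-scraper/run-sync-get-dataset-items"


def build_sync_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the synchronous run endpoint."""
    # The specification declares `token` as a required query parameter.
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request(
    "<YOUR_API_TOKEN>", {"handles": ["bsky.app"], "maxItems": 10}
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would start the run and block until it finishes, so keep `maxItems` small when experimenting.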
