# Bluesky Scraper (`tugelbay/bluesky-scraper`) Actor

Search posts, get profiles, and extract feeds from Bluesky. Uses AT Protocol API. No login required.

- **URL**: https://apify.com/tugelbay/bluesky-scraper.md
- **Developed by:** [Tugelbay Konabayev](https://apify.com/tugelbay) (community)
- **Categories:** Social media
- **Stats:** 1 total user, 0 monthly users, 0.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor is free to use, and you only pay for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Bluesky Scraper — Extract Posts, Profiles & Feeds from Bluesky

Search posts by keyword, extract user profiles, and scrape feeds from **Bluesky social network** using the **AT Protocol API**. No browser needed — pure API calls make this fast, lightweight, and affordable. Get up to 10,000 results per run in clean, structured JSON. Works with or without authentication.

### What It Does

Bluesky Scraper is a 4-in-1 actor for extracting data from Bluesky, the decentralized social network built on AT Protocol. It supports:

1. **Search Posts** — Find posts by keyword, hashtag, or user mention (optional auth)
2. **Search Users** — Find user profiles by name or handle (optional auth)
3. **Get Profiles** — Extract detailed profile data for one or more users (no auth needed)
4. **Get User Feed** — Scrape a user's posts, threads, and media (no auth needed)

Each mode returns clean, structured JSON with post metadata (text, likes, reposts, replies, images, embeds), profile data (handle, followers, bio, joined date), and direct post URLs. Optional authentication enables richer search results, but profiles and feeds work completely unauthenticated — perfect for public data extraction.

Bluesky has 30M+ registered users and is one of the fastest-growing social networks. The AT Protocol is open-source and federated — meaning data is synchronized across multiple servers, ensuring reliability and no single point of failure.

### How It Compares to Competitors

| Feature                     | Our Actor | george.the.developer | automation-lab | botflowtech |
| --------------------------- | --------- | -------------------- | -------------- | ----------- |
| **Search posts**            | ✓         | ✓                    | ✓              | ✓           |
| **Search users**            | ✓         | ✗                    | ✗              | ✓           |
| **Get profiles**            | ✓         | ✓                    | ✓              | ✓           |
| **Get user feed**           | ✓         | ✓                    | ✓              | ✓           |
| **Single actor, all modes** | ✓         | ✗ (separate)         | ✗ (separate)   | ✓           |
| **Auth optional**           | ✓         | ✓                    | ✗ (required)   | ✓           |
| **Language filtering**      | ✓         | ✗                    | ✗              | ✗           |
| **Feed filtering**          | ✓         | ✗                    | ✗              | ✗           |
| **Up to 10K results**       | ✓         | ✓                    | ✓              | ✓           |
| **Price per 1K**            | PPE       | $1.50                | Free tier      | ~$1.20      |
| **Users**                   | New       | ~120                 | ~90            | ~70         |
| **Rating**                  | —         | 4.1 ⭐               | 4.3 ⭐         | 3.9 ⭐      |

**Why choose our actor:**

- **4 modes in 1 actor** — search + profiles + feeds without switching between tools
- **Optional auth** — get started immediately without a Bluesky account
- **Language filtering** — extract posts in specific languages (en, es, ja, etc.)
- **Feed filtering** — choose posts only, posts + threads, or media-only feeds
- **Clean, structured output** — all fields documented with examples
- **PPE pricing** — pay only for results you use, with first 100 results free

### Key Features

- ✓ **Search Posts by Keyword**: Find posts, hashtags, discussions. Sort by latest or top engagement.
- ✓ **Search Users**: Discover profiles by name or handle.
- ✓ **Get User Profiles**: Extract handle, display name, bio, followers, posts count, avatar, banner, joined date.
- ✓ **Get User Feeds**: Scrape up to 10,000 posts from any user's feed.
- ✓ **Optional Authentication**: Log in to unlock richer search results (recommended for production).
- ✓ **No Auth Needed for Profiles/Feeds**: Public data extraction works without login.
- ✓ **Pagination Support**: Cursor-based pagination handles large result sets.
- ✓ **Sort & Filter**: Sort by latest or top engagement, filter by language, filter feeds by type.
- ✓ **Rich Post Data**: Text, author, likes, replies, reposts, quotes, images, embeds, direct URLs, language detection.
- ✓ **Structured JSON Output**: All fields documented with clear examples.
- ✓ **Up to 10,000 Results**: Scale from small extractions to large datasets.
- ✓ **Error Handling**: Graceful fallbacks, rate limit handling, clear error messages.
- ✓ **Fast & Lightweight**: Pure API calls, no browser overhead, runs in seconds.

### Input Examples

#### Example 1: Search Posts by Keyword (Latest)

Find 50 recent posts about "web scraping" without authentication.

```json
{
  "mode": "searchPosts",
  "query": "web scraping",
  "maxItems": 50,
  "sort": "latest"
}
````

#### Example 2: Search Posts with Authentication & Language Filter

Find 100 top-engagement posts about "AI" in English, using auth for better results.

```json
{
  "mode": "searchPosts",
  "query": "AI",
  "maxItems": 100,
  "sort": "top",
  "language": "en",
  "blueskyHandle": "yourname.bsky.social",
  "blueskyAppPassword": "xxxx-xxxx-xxxx-xxxx"
}
```
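
The app password is a secret, so one option is to assemble this input in code from environment variables instead of hard-coding it. The following is a minimal sketch only: the variable names `BLUESKY_HANDLE` and `BLUESKY_APP_PASSWORD` and the helper `build_search_input` are hypothetical and not part of the actor.

```python
import os


def build_search_input(query: str, env=os.environ) -> dict:
    """Build a searchPosts input, adding credentials only if both are set."""
    actor_input = {
        "mode": "searchPosts",
        "query": query,
        "maxItems": 100,
        "sort": "top",
        "language": "en",
    }
    # Hypothetical environment variable names, chosen for this example.
    handle = env.get("BLUESKY_HANDLE")
    app_password = env.get("BLUESKY_APP_PASSWORD")
    # The app password is required whenever a handle is provided,
    # so the two fields are only ever added together.
    if handle and app_password:
        actor_input["blueskyHandle"] = handle
        actor_input["blueskyAppPassword"] = app_password
    return actor_input
```

When either variable is missing, the helper falls back to an unauthenticated search input.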

#### Example 3: Search Users

Find 30 user profiles matching "data scientist".

```json
{
  "mode": "searchUsers",
  "query": "data scientist",
  "maxItems": 30
}
```

#### Example 4: Get Multiple User Profiles (No Auth Needed)

Extract detailed profile data for 5 specific users.

```json
{
  "mode": "getProfiles",
  "handles": [
    "jay.bsky.team",
    "jack.bsky.social",
    "paulmozilla.com",
    "darnelle.bsky.social",
    "pfrazee.com"
  ]
}
```

#### Example 5: Get User Feed with Media Filter

Extract all media posts from a user's feed (max 10,000).

```json
{
  "mode": "getUserFeed",
  "handles": ["jack.bsky.social"],
  "maxItems": 10000,
  "feedFilter": "posts_with_media"
}
```

### Input Parameters

| Parameter              | Type         | Description                                                                                                                                                                                                         | Default                    | Required      |
| ---------------------- | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------- | ------------- |
| **mode**               | string       | Scraping mode: `searchPosts`, `searchUsers`, `getProfiles`, or `getUserFeed`                                                                                                                                        | `searchPosts`              | No            |
| **query**              | string       | Search query for posts or users. Examples: `"web scraping"`, `"#tech"`, `"from:user"`. Only used in search modes.                                                                                                   | —                          | Conditional\* |
| **handles**            | string array | Bluesky handles for profile/feed modes. Examples: `["jay.bsky.team", "jack.bsky.social"]`. Only used in `getProfiles` and `getUserFeed`.                                                                            | —                          | Conditional\* |
| **maxItems**           | integer      | Maximum number of results to return. Min: 1, Max: 10,000                                                                                                                                                            | 100                        | No            |
| **sort**               | string       | Sort order for search results: `latest` or `top` (most engagement). Only applies to `searchPosts` mode.                                                                                                             | `latest`                   | No            |
| **language**           | string       | Filter posts by language code (e.g., `en`, `es`, `ja`, `de`, `fr`). Leave empty for all languages. Only applies to `searchPosts` mode.                                                                              | —                          | No            |
| **blueskyHandle**      | string       | Your Bluesky handle for authentication (e.g., `yourname.bsky.social`). Optional for search modes to unlock richer results. Not needed for profiles/feeds. Create app password at `bsky.app/settings/app-passwords`. | —                          | No            |
| **blueskyAppPassword** | string       | App password for authentication (NOT your main password). Create at `bsky.app/settings/app-passwords`. Required if `blueskyHandle` is provided. Secret field — not logged.                                          | —                          | No            |
| **feedFilter**         | string       | Feed type filter for `getUserFeed` mode only. Options: `posts_and_author_threads` (posts + threads), `posts_no_replies` (posts only), `posts_with_media` (media only).                                              | `posts_and_author_threads` | No            |

\*Conditional: `query` is required for `searchPosts` and `searchUsers`. `handles` is required for `getProfiles` and `getUserFeed`.
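
The conditional rules in the table can be checked locally before starting a run. The sketch below encodes only what the table states (defaults, the `maxItems` range, and which field each mode needs); `validate_input` is a hypothetical helper, and the actor's own validation may differ.

```python
SEARCH_MODES = {"searchPosts", "searchUsers"}
HANDLE_MODES = {"getProfiles", "getUserFeed"}


def validate_input(actor_input: dict) -> list:
    """Return a list of problems found in an input object (empty if none)."""
    problems = []
    mode = actor_input.get("mode", "searchPosts")  # documented default
    if mode not in SEARCH_MODES | HANDLE_MODES:
        problems.append(f"unknown mode: {mode!r}")
    if mode in SEARCH_MODES and not actor_input.get("query"):
        problems.append(f"'query' is required for mode {mode!r}")
    if mode in HANDLE_MODES and not actor_input.get("handles"):
        problems.append(f"'handles' is required for mode {mode!r}")
    max_items = actor_input.get("maxItems", 100)  # documented default
    if not isinstance(max_items, int) or not 1 <= max_items <= 10_000:
        problems.append("'maxItems' must be an integer between 1 and 10,000")
    if actor_input.get("blueskyHandle") and not actor_input.get("blueskyAppPassword"):
        problems.append("'blueskyAppPassword' is required when 'blueskyHandle' is set")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.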

### Output Format

The actor returns two views: **Posts** and **Profiles**. Choose the view relevant to your mode.

#### Posts View (searchPosts, getUserFeed modes)

```json
{
  "mode": "posts",
  "data": [
    {
      "uri": "at://did:plc:abc123/app.bsky.feed.post/abc123",
      "cid": "bafy...",
      "authorHandle": "jack.bsky.social",
      "authorDid": "did:plc:abc123",
      "authorDisplayName": "Jack Dorsey",
      "authorAvatar": "https://cdn.bsky.app/img/...",
      "text": "Bluesky is an open social network built on an open protocol. It's now open to everyone.",
      "likeCount": 5234,
      "replyCount": 890,
      "repostCount": 2103,
      "quoteCount": 456,
      "createdAt": "2024-03-15T14:22:33.000Z",
      "indexedAt": "2024-03-15T14:25:10.000Z",
      "language": "en",
      "images": [
        {
          "url": "https://cdn.bsky.app/img/...",
          "alt": "Screenshot of Bluesky"
        }
      ],
      "embeds": [
        {
          "type": "link",
          "title": "Bluesky Homepage",
          "description": "The open social network",
          "url": "https://bsky.app"
        }
      ],
      "postUrl": "https://bsky.app/profile/jack.bsky.social/post/abc123",
      "parentPostUri": null,
      "rootPostUri": null,
      "isReply": false,
      "isRepost": false
    }
  ]
}
```
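
In the sample record, `postUrl` combines the author's handle with the last segment of `uri` (the record key). The sketch below reproduces that mapping; it is inferred from the sample above rather than from the actor's source, and `post_url_from_uri` is a hypothetical helper.

```python
def post_url_from_uri(uri: str, handle: str) -> str:
    """Build a bsky.app post URL from an AT URI and the author's handle."""
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT URI: {uri!r}")
    # Expected layout: at://<did>/<collection>/<record key>
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or parts[1] != "app.bsky.feed.post":
        raise ValueError(f"not a post URI: {uri!r}")
    record_key = parts[2]
    return f"https://bsky.app/profile/{handle}/post/{record_key}"
```

This can be useful for rebuilding a link when only `uri` and `authorHandle` were kept from a result.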

#### Profiles View (getProfiles, searchUsers modes)

```json
{
  "mode": "profiles",
  "data": [
    {
      "did": "did:plc:abc123",
      "handle": "jack.bsky.social",
      "displayName": "Jack Dorsey",
      "bio": "Founder of Twitter and Square. Now building Bluesky.",
      "avatar": "https://cdn.bsky.app/img/...",
      "banner": "https://cdn.bsky.app/img/...",
      "followersCount": 125000,
      "followsCount": 340,
      "postsCount": 2100,
      "createdAt": "2023-04-15T10:20:15.000Z",
      "indexedAt": "2024-03-20T09:15:22.000Z",
      "viewer": {
        "isMuted": false,
        "isBlocked": false
      },
      "profileUrl": "https://bsky.app/profile/jack.bsky.social"
    }
  ]
}
```

### Example Output

#### Search Posts Result

```json
{
  "mode": "searchPosts",
  "query": "web scraping",
  "resultsCount": 3,
  "data": [
    {
      "uri": "at://did:plc:xyz789/app.bsky.feed.post/xyz789",
      "cid": "bafy...",
      "authorHandle": "scraper_dev.bsky.social",
      "authorDid": "did:plc:xyz789",
      "authorDisplayName": "Scraper Dev",
      "authorAvatar": "https://cdn.bsky.app/img/...",
      "text": "Just launched my new web scraping library. Check it out! #development #python",
      "likeCount": 245,
      "replyCount": 18,
      "repostCount": 67,
      "quoteCount": 12,
      "createdAt": "2024-03-20T16:30:00.000Z",
      "indexedAt": "2024-03-20T16:31:05.000Z",
      "language": "en",
      "images": [],
      "embeds": [
        {
          "type": "link",
          "title": "SuperScraper - Python Web Scraping",
          "description": "Fast and efficient web scraping library",
          "url": "https://github.com/scraper-dev/superscraper"
        }
      ],
      "postUrl": "https://bsky.app/profile/scraper_dev.bsky.social/post/xyz789",
      "parentPostUri": null,
      "rootPostUri": null,
      "isReply": false,
      "isRepost": false
    }
  ]
}
```

#### Get Profiles Result

```json
{
  "mode": "getProfiles",
  "handles": ["jack.bsky.social"],
  "resultsCount": 1,
  "data": [
    {
      "did": "did:plc:eauuyk...",
      "handle": "jack.bsky.social",
      "displayName": "Jack Dorsey",
      "bio": "Former CEO of Twitter, founder of Bluesky",
      "avatar": "https://cdn.bsky.app/img/eauuyk.../avatar_32x32.jpg",
      "banner": "https://cdn.bsky.app/img/eauuyk.../banner_1200x400.png",
      "followersCount": 125432,
      "followsCount": 340,
      "postsCount": 2089,
      "createdAt": "2023-04-16T08:30:21.000Z",
      "indexedAt": "2024-03-20T14:05:18.000Z",
      "viewer": {
        "isMuted": false,
        "isBlocked": false
      },
      "profileUrl": "https://bsky.app/profile/jack.bsky.social"
    }
  ]
}
```

### Code Examples

#### Python

```python
import asyncio
from apify_client import ApifyClientAsync

async def scrape_bluesky():
    # ApifyClientAsync is the asyncio variant of the client; its calls are awaitable
    client = ApifyClientAsync("YOUR_APIFY_TOKEN")

    # Search posts about web scraping (run_input is a keyword-only argument)
    run = await client.actor("tugelbay/bluesky-scraper").call(
        run_input={
            "mode": "searchPosts",
            "query": "web scraping",
            "maxItems": 50,
            "sort": "latest",
        }
    )

    # Fetch results; list_items() returns a page object with an `items` list
    dataset = await client.dataset(run["defaultDatasetId"]).list_items()

    for post in dataset.items:
        print(f"@{post['authorHandle']}: {post['text'][:50]}...")
        print(f"  Likes: {post['likeCount']}, Replies: {post['replyCount']}\n")

# Run
asyncio.run(scrape_bluesky())
```

#### JavaScript

```javascript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({
  token: "YOUR_APIFY_TOKEN",
});

(async () => {
  // Get profiles for multiple users
  const run = await client.actor("tugelbay/bluesky-scraper").call({
    mode: "getProfiles",
    handles: ["jack.bsky.social", "pfrazee.com", "paulmozilla.com"],
  });

  // Process results
  const dataset = await client.dataset(run.defaultDatasetId).listItems();

  dataset.items.forEach((profile) => {
    console.log(`${profile.displayName} (@${profile.handle})`);
    console.log(`Followers: ${profile.followersCount}`);
    console.log(`Posts: ${profile.postsCount}\n`);
  });
})();
```

#### LangChain Integration

```python
from langchain.tools import tool
from apify_client import ApifyClient

@tool
def search_bluesky_posts(query: str, max_items: int = 100) -> list:
    """Search Bluesky posts by keyword and return results."""
    client = ApifyClient("YOUR_APIFY_TOKEN")

    # run_input is a keyword-only argument in the Python client
    run = client.actor("tugelbay/bluesky-scraper").call(run_input={
        "mode": "searchPosts",
        "query": query,
        "maxItems": max_items,
        "sort": "top"
    })

    # list_items() returns a page object; the records are in `.items`
    dataset = client.dataset(run["defaultDatasetId"]).list_items()
    return [
        {
            "author": item["authorHandle"],
            "text": item["text"],
            "engagement": item["likeCount"] + item["replyCount"] + item["repostCount"]
        }
        for item in dataset.items
    ]

# Invoke the tool directly; tools take their arguments as a dict
results = search_bluesky_posts.invoke({"query": "AI trends 2024", "max_items": 50})
for post in results:
    print(f"{post['author']}: {post['engagement']} engagement")
```

#### MCP Server Integration

```python
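from mcp.server.fastmcp import FastMCP
from apify_client import ApifyClientAsync

# FastMCP derives the tool schema from the function signature
mcp = FastMCP("bluesky-mcp")
client = ApifyClientAsync("YOUR_APIFY_TOKEN")

@mcp.tool()
async def bluesky_search(query: str, mode: str = "searchPosts") -> dict:
    """MCP tool: Search Bluesky posts and profiles."""
    actor_input = {"mode": mode, "query": query, "maxItems": 100}
    if mode == "searchPosts":
        actor_input["sort"] = "top"  # sort only applies to post search

    run = await client.actor("tugelbay/bluesky-scraper").call(run_input=actor_input)

    # list_items() returns a page object; the records are in `.items`
    dataset = await client.dataset(run["defaultDatasetId"]).list_items()
    return {"results": dataset.items, "count": len(dataset.items)}

if __name__ == "__main__":
    mcp.run()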
```

### Use Cases

**1. Social Media Monitoring** — Track mentions of your brand, product, or competitors on Bluesky. Extract posts in real-time and analyze sentiment or engagement.

**2. Lead Generation** — Search for users interested in specific topics (e.g., "SaaS founders", "data engineers") and extract their profiles to build prospect lists.

**3. Content Research & Curation** — Find trending posts and discussions in your niche. Identify popular topics, hashtags, and influencers to inform your content strategy.

**4. Influencer Identification** — Search for high-follower accounts in your industry. Extract profile data and engagement metrics to identify potential brand ambassadors.

**5. Competitive Analysis** — Monitor competitor posts, engagement, and audience response. Track keyword mentions and trending discussions in your market.

**6. Audience Insights** — Extract profiles of followers for a user or set of users. Analyze follower demographics, interests, and engagement patterns.

**7. Bot Development & Automation** — Use Bluesky feed data to train chatbots or feed recommendation engines. Build automated responses or content suggestion tools.

**8. Academic Research & Linguistics** — Collect Bluesky posts in specific languages for linguistic analysis, sentiment research, or social network studies.

**9. Crisis Monitoring** — Track discussions around a crisis or incident in real-time. Extract posts, sentiment, and spread patterns for rapid response.

**10. Newsletter & Report Generation** — Extract top posts from your niche weekly to feed a newsletter or report. Highlight trending discussions and key opinions.

### Cost Estimation

Bluesky Scraper uses **PPE (Pay-Per-Event) pricing**, where you pay based on actual results extracted.

#### Pricing Breakdown

| Action                      | Cost                      | Notes                                           |
| --------------------------- | ------------------------- | ----------------------------------------------- |
| First 100 results per month | FREE                      | Free tier — always free, no catch               |
| Post extracted (PPE)        | $0.002–$0.010 per post    | Depends on data richness (text, images, embeds) |
| Profile extracted (PPE)     | $0.001–$0.005 per profile | Basic profiles (handle, followers) cost less    |
| User feed retrieval         | $0.001 per post           | Public feed, lightweight operation              |

#### Example Scenarios

**Scenario 1: Monthly brand monitoring (search + extract)**

- 500 posts/month via search, at a mid-range rate of $0.005 per post
- First 100 free, then 400 paid: 400 × $0.005 = **$2.00/month**

**Scenario 2: Lead generation (search users + get profiles)**

- 200 profiles extracted, at a mid-range rate of $0.003 per profile
- First 100 free, then 100 paid: 100 × $0.003 = **$0.30/month**

**Scenario 3: Content research (get feeds)**

- 1,000 posts from 5 user feeds: 1,000 × $0.001 = **$1.00/month**

**Scenario 4: Large-scale analysis (10K results)**

- 10,000 posts extracted, at a mid-range rate of $0.005 per post
- First 100 free, then 9,900 paid: 9,900 × $0.005 = **$49.50/month**
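
These scenarios reduce to one formula: subtract the free monthly results, then multiply the remainder by the per-result rate. The sketch below assumes the free tier is deducted before any charge and uses the illustrative rates from the pricing table; `estimate_monthly_cost` is a hypothetical helper, not something the actor provides.

```python
from decimal import Decimal

FREE_RESULTS_PER_MONTH = 100  # free tier stated in the pricing table


def estimate_monthly_cost(results: int, price_per_result: str) -> Decimal:
    """Estimate the monthly charge in USD for a number of extracted results.

    The rate is passed as a string so Decimal keeps it exact.
    """
    billable = max(results - FREE_RESULTS_PER_MONTH, 0)
    return billable * Decimal(price_per_result)
```

For example, `estimate_monthly_cost(500, "0.005")` comes to 2.000, i.e. $2.00 for 400 billable posts. `Decimal` is used instead of `float` to avoid rounding noise in currency arithmetic.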

#### Comparison to Competitors

| Actor                         | Price per 1K      | Cost for 10K results |
| ----------------------------- | ----------------- | -------------------- |
| **Our Bluesky Scraper (PPE)** | $1–$10 (variable) | $10–$100             |
| george.the.developer          | $1.50 (flat)      | $15                  |
| automation-lab                | Free tier         | Free → $0            |
| botflowtech                   | ~$1.20 (flat)     | ~$12                 |

**Our advantage**: With the free 100 results tier, small-scale monitoring costs almost nothing. Large-scale extractions cost more but pay for data quality (richer fields, faster execution).

### FAQ

**Q: Do I need a Bluesky account to use this actor?**
A: For `getProfiles` and `getUserFeed` modes, no — public data works without login. For `searchPosts` and `searchUsers`, authentication is optional but recommended for richer, faster results. Without auth, search may return fewer results or require more retries.

**Q: How do I create an app password for authentication?**
A: Log into your Bluesky account, go to **Settings > App passwords**, click "Create App Password", give it a name (e.g., "Apify"), and copy the generated 16-character password. **Important**: This is NOT your main account password — it's a separate credential for API access.

**Q: What's the difference between "latest" and "top" sort?**
A: "Latest" returns posts in reverse chronological order (newest first). "Top" ranks posts by engagement (likes + replies + reposts), so you get the most-discussed posts first.

**Q: Can I filter posts by language?**
A: Yes! Use the `language` parameter with ISO 639-1 codes: `en` (English), `es` (Spanish), `ja` (Japanese), `de` (German), `fr` (French), etc. Leave blank to include all languages.

**Q: What does "feed filter" do in getUserFeed mode?**
A: The `feedFilter` controls what you extract from a user's feed: (1) `posts_and_author_threads`: posts plus the author's own threads (the default), (2) `posts_no_replies`: only posts, excluding replies, (3) `posts_with_media`: only posts with images or videos.

**Q: Why does search sometimes return fewer results than I ask for?**
A: Bluesky's search API limits results based on query specificity and feed size. Very specific queries (e.g., rare phrases) may return fewer matches. Also, if you don't authenticate, the API may rate-limit you after a few requests.

**Q: Can I extract private/protected posts?**
A: No. The actor only accesses public posts visible on Bluesky's public network. Private/protected posts require explicit follow/permission from the post author.

**Q: Is there a way to filter by post date range?**
A: Not directly through actor parameters, but you can use query syntax: e.g., `query: "web scraping since:2024-01-01"` to limit to posts after a date. Check Bluesky's search syntax documentation for advanced options.
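
A small helper can append the date operator to a query string. This is a sketch under assumptions: `since:` is taken from the answer above, while `until:` is assumed to work the same way and should be checked against Bluesky's search documentation; `with_date_range` is a hypothetical name.

```python
from datetime import date
from typing import Optional


def with_date_range(query: str,
                    since: Optional[date] = None,
                    until: Optional[date] = None) -> str:
    """Append since:/until: operators (YYYY-MM-DD) to a search query."""
    parts = [query.strip()]
    if since:
        parts.append(f"since:{since.isoformat()}")
    if until:
        # Assumption: until: mirrors since:; verify before relying on it.
        parts.append(f"until:{until.isoformat()}")
    return " ".join(parts)
```

The result can be passed as the `query` input, for example `with_date_range("web scraping", since=date(2024, 1, 1))`.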

**Q: What's the maximum result set I can extract?**
A: 10,000 results per run. The input parameters do not include a pagination cursor, so for larger extractions run the actor multiple times with different queries, or schedule recurring runs (for example nightly).
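
When a large extraction is split across several runs, overlapping queries can return the same post more than once. The sketch below merges batches and drops duplicates, assuming each record carries the `uri` field shown in the Posts view; `merge_unique_posts` is a hypothetical helper.

```python
def merge_unique_posts(*batches):
    """Merge several lists of post records, keeping the first copy of each uri."""
    seen = set()
    merged = []
    for batch in batches:
        for post in batch:
            uri = post["uri"]
            if uri not in seen:
                seen.add(uri)
                merged.append(post)
    return merged
```

Order is preserved, so posts from earlier runs win over later duplicates.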

### Troubleshooting

**Problem: "Authentication failed" error**

**Cause**: Invalid Bluesky handle or app password.
**Fix**:

1. Verify your Bluesky handle (e.g., `yourname.bsky.social`, not just `yourname`)
2. Check that you've created an app password at `bsky.app/settings/app-passwords` (not your main password)
3. Regenerate the app password and try again
4. For search modes, try running without auth first to verify the actor works

**Problem: Search returns very few results**

**Cause**: Query is too specific, or you're hitting rate limits without authentication.
**Fix**:

1. Simplify your query (e.g., use single keywords instead of long phrases)
2. Add authentication (`blueskyHandle` + `blueskyAppPassword`) to unlock richer results
3. Use hashtags (e.g., `#tech`) or user handles (e.g., `from:jack.bsky.social`) for better targeting
4. Check Bluesky's search syntax: supports AND/OR, quoted phrases, hashtags, user mentions

**Problem: Actor times out or returns incomplete results**

**Cause**: Network latency or API rate limiting.
**Fix**:

1. Reduce `maxItems` to a smaller batch (e.g., 100 instead of 10,000)
2. Run the actor again with pagination (if supported) to fetch the next batch
3. Check if Bluesky's API is experiencing issues (check their status page)
4. Use authentication to bypass rate limits
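
For transient failures, retrying with a growing delay is a common complement to the steps above. This is a generic sketch, not the actor's internal rate-limit handling; `call_with_backoff` and its parameters are hypothetical, and it only helps when the wrapped call raises an exception on failure.

```python
import time


def call_with_backoff(fn, attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fn(); on an exception wait base_delay * 2**n seconds and retry.

    The exception from the final attempt is re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

A run could be wrapped as `call_with_backoff(lambda: client.actor("tugelbay/bluesky-scraper").call(run_input=run_input))`. The `sleep` parameter exists so the delay can be replaced in tests.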

**Problem: Images or embeds are missing from results**

**Cause**: Some posts may not have images/embeds, or the API returns limited data for certain embed types.
**Fix**:

1. Check the raw JSON output — if `images: []` or `embeds: []`, the post genuinely has no media
2. Use `feedFilter: "posts_with_media"` to extract only posts with media
3. Some embed types (like video) may not be fully supported; check the actor logs for warnings

### Limitations

1. **Public data only** — Cannot extract private posts, direct messages, or protected accounts. The actor respects Bluesky's access control.

2. **Rate limiting** — Without authentication, you may hit Bluesky's public API rate limits after 50–100 requests. Add authentication to increase your quota significantly.

3. **No real-time firehose** — The actor uses Bluesky's search API, not the real-time event stream (Jetstream). For live monitoring, consider using AT Protocol's WebSocket APIs directly.

4. **Historical data limits** — Bluesky's search is optimized for recent posts (last 30–90 days). Older posts may not be fully indexed or searchable.

5. **Character encoding** — Some emojis and non-ASCII characters may not render correctly in all export formats (JSON is safe, but CSV may have issues). Export to JSON for full fidelity.

6. **Embed types** — Some complex embeds (videos, custom feeds, bridge posts) may return limited metadata. Text posts, links, and images are fully supported.

7. **Search syntax**: Bluesky's search supports keywords, hashtags, user mentions, and date operators such as `since:`, but not geolocation filters. Date ranges are not exposed as separate actor parameters.

8. **No follower list** — The actor extracts profile data but not follower lists. To get a user's followers, you'd need separate follower-scraping logic.
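
For the character-encoding limitation, one workaround when CSV is unavoidable is to write the file yourself from the JSON results with an explicit encoding. The sketch below assumes post records with the field names from the Posts view; `export_posts_csv` and the chosen columns are hypothetical.

```python
import csv


def export_posts_csv(posts, path):
    """Write selected post fields to a UTF-8 CSV file."""
    fields = ["authorHandle", "text", "likeCount", "postUrl"]
    # utf-8-sig writes a byte order mark, which helps spreadsheet tools
    # detect UTF-8 so emoji and non-ASCII text survive the round trip.
    with open(path, "w", newline="", encoding="utf-8-sig") as fh:
        writer = csv.DictWriter(fh, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(posts)
```

Fields not listed in `fields` are skipped rather than raising an error, so full post records can be passed in unchanged.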

### Changelog

#### v1.0 (April 2024)

**Initial release**

- 4 scraping modes: `searchPosts`, `searchUsers`, `getProfiles`, `getUserFeed`
- Post data: text, author, likes, replies, reposts, quotes, images, embeds, language, post URL
- Profile data: handle, display name, bio, followers, follows, posts count, avatar, banner, joined date
- Optional authentication for search modes
- No authentication required for profiles and feeds
- Pagination with cursor support
- Sort by latest or top engagement (search only)
- Language filtering (search only)
- Feed filtering: posts + threads, posts only, media only (feed mode only)
- Up to 10,000 results per run
- Error handling and rate limit management
- Clean, structured JSON output
- Python, JavaScript, and LangChain examples
- PPE pricing with first 100 results free

**Known issues**: None reported.

**Future roadmap**: Follower list extraction, advanced query operators, WebSocket real-time streaming, batch scheduling.

***

**Questions?** Visit the [Bluesky Scraper on Apify](https://apify.com/tugelbay/bluesky-scraper) or check the [AT Protocol documentation](https://docs.bsky.app/).

**Attribution**: Built with the [AT Protocol SDK](https://atproto.com/) and [Bluesky API](https://docs.bsky.app/).

**License**: MIT — Free to use, modify, and distribute under the MIT License.

# Actor input Schema

## `mode` (type: `string`):

What to extract from Bluesky

## `query` (type: `string`):

Search query for 'searchPosts' or 'searchUsers' mode. Example: 'artificial intelligence', '#tech', 'from:user'.

## `handles` (type: `array`):

Bluesky handles for 'getProfiles' or 'getUserFeed' mode. Example: 'jay.bsky.team', 'pfrazee.com'.

## `maxItems` (type: `integer`):

Maximum number of results to return.

## `sort` (type: `string`):

Sort order for post search results.

## `language` (type: `string`):

Filter posts by language code (e.g., 'en', 'es', 'ja'). Leave empty for all languages.

## `blueskyHandle` (type: `string`):

Your Bluesky handle (e.g., 'yourname.bsky.social'). Required for search modes. Not needed for profiles/feeds.

## `blueskyAppPassword` (type: `string`):

App password (NOT your main password). Create one at bsky.app/settings/app-passwords.

## `feedFilter` (type: `string`):

What to include from user feeds.

## Actor input object example

```json
{
  "mode": "searchPosts",
  "query": "web scraping",
  "handles": [
    "jay.bsky.team"
  ],
  "maxItems": 100,
  "sort": "latest",
  "feedFilter": "posts_and_author_threads"
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "query": "web scraping",
    "handles": [
        "jay.bsky.team"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("tugelbay/bluesky-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "query": "web scraping",
    "handles": ["jay.bsky.team"],
}

# Run the Actor and wait for it to finish
run = client.actor("tugelbay/bluesky-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
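
Because the input description marks the Bluesky credentials as not needed for profiles and feeds, a `getUserFeed` run is a convenient first test. The sketch below is one possible variant of the example above, not the author's recommended setup: the helper names `feed_input` and `fetch_feed` are hypothetical, and reading the token from an `APIFY_TOKEN` environment variable is an assumption made to avoid hard-coding it.

```python
import os
from typing import Any


def feed_input(
    handles: list[str],
    feed_filter: str = "posts_and_author_threads",
    max_items: int = 100,
) -> dict[str, Any]:
    """Build an input object for 'getUserFeed' using the documented field names."""
    return {
        "mode": "getUserFeed",
        "handles": list(handles),
        "feedFilter": feed_filter,
        "maxItems": max_items,
    }


def fetch_feed(handles: list[str]) -> list[dict[str, Any]]:
    """Run the Actor in 'getUserFeed' mode and return the dataset items."""
    # Imported here so feed_input() can be used without the client installed.
    from apify_client import ApifyClient

    # Assumption: the API token is stored in the APIFY_TOKEN environment variable.
    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor("tugelbay/bluesky-scraper").call(run_input=feed_input(handles))
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Calling `fetch_feed(["jay.bsky.team"])` starts a run and waits for it to finish, in the same way as the example above.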

## CLI example

```bash
echo '{
  "query": "web scraping",
  "handles": [
    "jay.bsky.team"
  ]
}' |
apify call tugelbay/bluesky-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=tugelbay/bluesky-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Bluesky Scraper",
        "description": "Search posts, get profiles, and extract feeds from Bluesky. Uses AT Protocol API. No login required.",
        "version": "1.0",
        "x-build-id": "ZdZabUghj6HMQYTXZ"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/tugelbay~bluesky-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-tugelbay-bluesky-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/tugelbay~bluesky-scraper/runs": {
            "post": {
                "operationId": "runs-sync-tugelbay-bluesky-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/tugelbay~bluesky-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-tugelbay-bluesky-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "mode": {
                        "title": "Scraping mode",
                        "enum": [
                            "searchPosts",
                            "searchUsers",
                            "getProfiles",
                            "getUserFeed"
                        ],
                        "type": "string",
                        "description": "What to extract from Bluesky",
                        "default": "searchPosts"
                    },
                    "query": {
                        "title": "Search query",
                        "type": "string",
                        "description": "Search query for 'searchPosts' or 'searchUsers' mode. Example: 'artificial intelligence', '#tech', 'from:user'."
                    },
                    "handles": {
                        "title": "User handles",
                        "type": "array",
                        "description": "Bluesky handles for 'getProfiles' or 'getUserFeed' mode. Example: 'jay.bsky.team', 'pfrazee.com'.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Max results",
                        "minimum": 1,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Maximum number of results to return.",
                        "default": 100
                    },
                    "sort": {
                        "title": "Sort order (search only)",
                        "enum": [
                            "latest",
                            "top"
                        ],
                        "type": "string",
                        "description": "Sort order for post search results.",
                        "default": "latest"
                    },
                    "language": {
                        "title": "Language filter",
                        "type": "string",
                        "description": "Filter posts by language code (e.g., 'en', 'es', 'ja'). Leave empty for all languages."
                    },
                    "blueskyHandle": {
                        "title": "Bluesky handle (for search)",
                        "type": "string",
                        "description": "Your Bluesky handle (e.g., 'yourname.bsky.social'). Required for search modes. Not needed for profiles/feeds."
                    },
                    "blueskyAppPassword": {
                        "title": "Bluesky app password",
                        "type": "string",
                        "description": "App password (NOT your main password). Create one at bsky.app/settings/app-passwords."
                    },
                    "feedFilter": {
                        "title": "Feed filter (getUserFeed only)",
                        "enum": [
                            "posts_and_author_threads",
                            "posts_no_replies",
                            "posts_with_media"
                        ],
                        "type": "string",
                        "description": "What to include from user feeds.",
                        "default": "posts_and_author_threads"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
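
The `run-sync-get-dataset-items` path in the specification above can also be called without a client library. The following is a minimal sketch using only the Python standard library. The function names `build_request` and `run_sync` are hypothetical, the token is passed as the `token` query parameter as the specification describes, and the response is assumed to be a JSON array of dataset items, which the specification itself does not state.

```python
import json
import urllib.parse
import urllib.request
from typing import Any

BASE_URL = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/tugelbay~bluesky-scraper/run-sync-get-dataset-items"


def build_request(token: str, actor_input: dict[str, Any]) -> urllib.request.Request:
    """Build the POST request described by the OpenAPI specification."""
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}{ACTOR_PATH}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, actor_input: dict[str, Any], timeout: float = 300.0) -> Any:
    """Send the request and decode the body (assumed to be JSON dataset items)."""
    with urllib.request.urlopen(build_request(token, actor_input), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `run_sync("<YOUR_API_TOKEN>", {"mode": "getProfiles", "handles": ["jay.bsky.team"]})` blocks until the run finishes, so the timeout should be chosen generously for large `maxItems` values.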
