Bluesky Scraper
Pricing: Pay per usage
Scrape Bluesky (bsky.app) posts, profiles, and search results using the public AT Protocol API. No authentication required.
Developer: George Kioko
Bluesky Scraper - Extract Posts, Profiles & Search Data from bsky.app
A fast, reliable Bluesky scraper built on the public AT Protocol API. Extract posts, user profiles, and search results from bsky.app without any authentication or browser automation. Just point it at the data you need and go.
The scraper handles pagination, rate limiting, and retries automatically -- so you get clean, structured JSON every time. Whether you're tracking a hashtag, monitoring a competitor, or building a dataset for research, this tool does the heavy lifting.
Key Features
- Search posts by keyword, hashtag, or topic across all of Bluesky
- Scrape user feeds -- get every post from a specific handle
- Extract full profiles with follower counts, bios, and metadata
- Search for users/actors matching your query
- No authentication needed -- uses Bluesky's public AT Protocol API
- Automatic pagination -- fetches up to 10,000 items per query
- Smart rate limit handling with exponential backoff and retries
- Rich post data including likes, reposts, replies, hashtags, images, and direct web URLs
- Pay only for what you scrape -- $0.003 per item
How It Works
flowchart LR
    A["Your Input\n(keywords, handles)"] --> B["Bluesky Scraper"]
    B --> C{"Scrape Type?"}
    C -->|posts| D["app.bsky.feed.searchPosts\napp.bsky.feed.getAuthorFeed"]
    C -->|profiles| E["app.bsky.actor.getProfile\napp.bsky.feed.getAuthorFeed"]
    C -->|search| F["app.bsky.actor.searchActors"]
    D --> G["AT Protocol\nPublic API"]
    E --> G
    F --> G
    G --> H["Parse & Structure\nJSON Data"]
    H --> I["Apify Dataset\n(JSON, CSV, Excel)"]
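To make the flow above concrete, here is a minimal sketch of how a request URL for each of the four XRPC methods could be built. The method names come from the flowchart. The host `public.api.bsky.app`, the `build_url` helper, and the `QUERY_PARAM` table are assumptions for illustration, not the actor's actual code, and this sketch does not verify that every method is reachable without authentication at that host.

```python
from urllib.parse import urlencode

# Assumption: Bluesky's public AppView host. The actor's real base URL is not documented here.
BASE_URL = "https://public.api.bsky.app/xrpc"

# Name of the main query parameter each method expects.
QUERY_PARAM = {
    "app.bsky.feed.searchPosts": "q",
    "app.bsky.feed.getAuthorFeed": "actor",
    "app.bsky.actor.getProfile": "actor",
    "app.bsky.actor.searchActors": "q",
}


def build_url(method, value, limit=None, cursor=None):
    """Build an XRPC GET URL for one method call (hypothetical helper)."""
    params = {QUERY_PARAM[method]: value}
    if limit is not None:
        params["limit"] = limit
    if cursor:
        params["cursor"] = cursor
    return f"{BASE_URL}/{method}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_url("app.bsky.feed.searchPosts", "bluesky api", limit=50))
    print(build_url("app.bsky.actor.getProfile", "jay.bsky.team"))
```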
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| scrapeType | string | Yes | "posts" | What to scrape: posts, profiles, or search |
| searchTerms | string[] | No | [] | Keywords to search for (e.g., ["artificial intelligence", "web scraping"]) |
| userHandles | string[] | No | [] | Bluesky handles to scrape (e.g., ["jay.bsky.team", "pfrazee.com"]) |
| maxResults | integer | No | 100 | Max items per search term or handle (1 to 10,000) |
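If you build inputs programmatically, a small client-side check can catch mistakes before a run starts. The sketch below applies the defaults and the 1 to 10,000 range from the table. `normalize_input` is a hypothetical helper, and how the actor itself treats invalid values (rejecting or clamping them) is not documented here, so treat this as a pre-flight check only.

```python
from typing import Any

VALID_SCRAPE_TYPES = {"posts", "profiles", "search"}


def normalize_input(raw: dict[str, Any]) -> dict[str, Any]:
    """Apply the documented defaults and range checks to a raw input dict."""
    scrape_type = raw.get("scrapeType", "posts")
    if scrape_type not in VALID_SCRAPE_TYPES:
        raise ValueError(f"scrapeType must be one of {sorted(VALID_SCRAPE_TYPES)}")

    max_results = raw.get("maxResults", 100)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_results, bool) or not isinstance(max_results, int):
        raise ValueError("maxResults must be an integer")
    if not 1 <= max_results <= 10_000:
        raise ValueError("maxResults must be between 1 and 10,000")

    return {
        "scrapeType": scrape_type,
        "searchTerms": list(raw.get("searchTerms", [])),
        "userHandles": list(raw.get("userHandles", [])),
        "maxResults": max_results,
    }
```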
Scrape Type Behavior
| Mode | searchTerms | userHandles | What You Get |
|---|---|---|---|
| posts | Searches posts matching keywords | Fetches the user's feed | Post objects with engagement metrics |
| profiles | Searches for matching users | Gets full profile + their feed | Profile objects + post objects |
| search | Searches for matching users | -- | Profile objects only |
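The table above can be read as a dispatch rule. The following sketch turns a scrape type plus its inputs into an ordered list of (method, value) calls. `plan_requests` is a hypothetical name, and using `app.bsky.actor.searchActors` for search terms in `profiles` mode is an assumption inferred from the "Searches for matching users" cell, since the flowchart does not list it for that mode.

```python
def plan_requests(scrape_type, search_terms, user_handles):
    """Return (xrpc_method, query_value) pairs for one run, following the mode table."""
    plan = []
    if scrape_type == "posts":
        plan += [("app.bsky.feed.searchPosts", term) for term in search_terms]
        plan += [("app.bsky.feed.getAuthorFeed", handle) for handle in user_handles]
    elif scrape_type == "profiles":
        # Assumption: user search in this mode goes through searchActors.
        plan += [("app.bsky.actor.searchActors", term) for term in search_terms]
        for handle in user_handles:
            plan.append(("app.bsky.actor.getProfile", handle))
            plan.append(("app.bsky.feed.getAuthorFeed", handle))
    elif scrape_type == "search":
        # userHandles are not used in this mode.
        plan += [("app.bsky.actor.searchActors", term) for term in search_terms]
    else:
        raise ValueError(f"unknown scrapeType: {scrape_type!r}")
    return plan
```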
Example Input
{
    "scrapeType": "posts",
    "searchTerms": ["bluesky api", "decentralized social"],
    "userHandles": ["jay.bsky.team"],
    "maxResults": 50
}
Output Data
Post Output
Each scraped post includes engagement metrics, hashtags, images, and a direct link:
{
    "uri": "at://did:plc:abc123/app.bsky.feed.post/xyz789",
    "cid": "bafyreig...",
    "text": "Just shipped a new feature using the AT Protocol. The open social web is happening! #bluesky #atproto",
    "authorHandle": "developer.bsky.social",
    "authorDisplayName": "Dev Builder",
    "authorAvatar": "https://cdn.bsky.app/img/avatar/...",
    "likeCount": 42,
    "repostCount": 12,
    "replyCount": 7,
    "quoteCount": 3,
    "createdAt": "2026-03-25T14:30:00.000Z",
    "indexedAt": "2026-03-25T14:30:01.500Z",
    "hashtags": ["#bluesky", "#atproto"],
    "images": [
        {
            "alt": "Screenshot of the new feature",
            "thumb": "https://cdn.bsky.app/img/feed_thumbnail/...",
            "fullsize": "https://cdn.bsky.app/img/feed_fullsize/..."
        }
    ],
    "langs": ["en"],
    "webUrl": "https://bsky.app/profile/developer.bsky.social/post/xyz789"
}
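Two of the fields above, `webUrl` and `hashtags`, can be derived from the others. The sketch below shows one way to do that: it takes the record key from the `at://` URI and pairs it with the author handle, and it pulls hashtags out of the text with a simple regular expression. Both helpers are hypothetical, and the regex is a simplification; the actor may well read hashtags from the post's structured metadata instead.

```python
import re

POST_COLLECTION = "app.bsky.feed.post"
HASHTAG_RE = re.compile(r"#\w+")


def post_web_url(at_uri: str, author_handle: str) -> str:
    """Map at://<did>/app.bsky.feed.post/<rkey> to a bsky.app post URL."""
    prefix = "at://"
    if not at_uri.startswith(prefix):
        raise ValueError("not an AT URI")
    parts = at_uri[len(prefix):].split("/")
    if len(parts) != 3 or parts[1] != POST_COLLECTION:
        raise ValueError("not a post URI")
    record_key = parts[2]
    return f"https://bsky.app/profile/{author_handle}/post/{record_key}"


def extract_hashtags(text: str) -> list[str]:
    """Return hashtags in order of appearance (simplified regex approach)."""
    return HASHTAG_RE.findall(text)
```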
Profile Output
Profile data includes follower/following counts, bio, and account metadata:
{
    "did": "did:plc:abc123",
    "handle": "jay.bsky.team",
    "displayName": "Jay Graber",
    "description": "CEO of Bluesky. Building the open social web.",
    "avatar": "https://cdn.bsky.app/img/avatar/...",
    "banner": "https://cdn.bsky.app/img/banner/...",
    "followersCount": 285000,
    "followsCount": 1200,
    "postsCount": 4500,
    "indexedAt": "2026-03-25T10:00:00.000Z",
    "createdAt": "2023-04-01T00:00:00.000Z",
    "labels": [],
    "webUrl": "https://bsky.app/profile/jay.bsky.team"
}
Use Cases
- Brand monitoring -- Track mentions of your company, product, or competitors on Bluesky in real time
- Market research -- Analyze sentiment and trends around topics in the growing Bluesky community
- Journalism -- Gather public statements and posts from newsworthy accounts for reporting
- Academic research -- Build datasets of social media discourse for NLP, network analysis, or sociological studies
- Influencer discovery -- Find and evaluate Bluesky creators by follower count, engagement, and posting frequency
- Content strategy -- Study what topics and formats perform best on the platform
- Competitive intelligence -- Monitor what your industry peers are saying and how their audiences respond
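For use cases such as content strategy or influencer discovery, a common first step is ranking scraped posts by engagement. The sketch below sums the four counters from the post output and sorts on the total. Weighting all four counters equally is an assumption made for illustration; `engagement` and `top_posts` are hypothetical helpers.

```python
ENGAGEMENT_FIELDS = ("likeCount", "repostCount", "replyCount", "quoteCount")


def engagement(post: dict) -> int:
    """Total interactions on a post; missing counters count as zero."""
    return sum(post.get(field, 0) for field in ENGAGEMENT_FIELDS)


def top_posts(posts: list[dict], n: int = 10) -> list[dict]:
    """Return the n posts with the highest total engagement."""
    return sorted(posts, key=engagement, reverse=True)[:n]
```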
Pricing
This actor uses pay-per-event pricing. You only pay for what you scrape:
| Event | Cost |
|---|---|
| item-scraped | $0.003 per item |
A typical run scraping 100 posts costs about $0.30. No monthly fees, no subscriptions -- just pay for results.
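The same arithmetic can be used to budget a run in advance. This sketch estimates an upper bound from the number of search terms and handles and the `maxResults` setting, assuming each query returns at most `maxResults` items and that only the `item-scraped` event is billed. `estimate_cost` is a hypothetical helper; `Decimal` is used to avoid floating-point rounding in currency math.

```python
from decimal import Decimal

PRICE_PER_ITEM = Decimal("0.003")  # item-scraped event price from the table


def estimate_cost(num_queries: int, max_results: int) -> Decimal:
    """Upper-bound cost in USD: every query returns its full maxResults."""
    return PRICE_PER_ITEM * num_queries * max_results


if __name__ == "__main__":
    # One search term, 100 results: the "typical run" from the text.
    print(estimate_cost(1, 100))
    # The example input: 2 search terms + 1 handle, maxResults 50.
    print(estimate_cost(3, 50))
```

Under these assumptions, the example input shown earlier (two search terms, one handle, `maxResults` of 50) would cost at most $0.45.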
How to Run
Via Apify Console
- Go to the Bluesky Scraper page on Apify Store
- Click Start (or Try for free)
- Fill in your search terms, handles, and scrape type
- Hit Run and download your data as JSON, CSV, or Excel
Via Apify API
curl -X POST "https://api.apify.com/v2/acts/george.the.developer~bluesky-scraper/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "scrapeType": "posts",
    "searchTerms": ["bluesky analytics"],
    "maxResults": 200
  }'
Retrieve results once the run finishes:
curl "https://api.apify.com/v2/acts/george.the.developer~bluesky-scraper/runs/last/dataset/items?token=YOUR_API_TOKEN"
Via Apify CLI
# Install the Apify CLI
npm install -g apify-cli

# Run the actor
apify call george.the.developer/bluesky-scraper -i '{"scrapeType": "profiles","userHandles": ["jay.bsky.team", "pfrazee.com"],"maxResults": 50}'
Via Apify JavaScript Client
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('george.the.developer/bluesky-scraper').call({
    scrapeType: 'posts',
    searchTerms: ['decentralized social media'],
    maxResults: 100,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);
Via Apify Python Client
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("george.the.developer/bluesky-scraper").call(run_input={
    "scrapeType": "posts",
    "searchTerms": ["bluesky data extraction"],
    "maxResults": 100,
})

items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
print(f"Scraped {len(items)} posts")
Limitations
- Public data only -- This scraper uses the public AT Protocol API. It cannot access private/blocked accounts or DMs.
- Rate limits -- Bluesky's API enforces rate limits. The scraper handles 429 responses with automatic backoff, but extremely large scrapes may take longer.
- No authentication -- Because no login is required, the scraper is limited to endpoints available via the public api.bsky.app XRPC interface.
- Max 10,000 items per query -- Each search term or user handle is capped at 10,000 results per run.
- Search result ordering -- Results are returned in the order provided by Bluesky's search API (relevance-based). Custom sorting is not available.
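The rate-limit behavior described in the list above follows a standard pattern: retry on HTTP 429 or a server error, doubling the wait each time. Here is a minimal sketch of that pattern, not the actor's actual code. The retry count, base delay, cap, and jitter are assumed values, and `request` is a caller-supplied function returning `(status_code, body)` so the example stays independent of any HTTP library.

```python
import random
import time


def call_with_backoff(request, max_retries=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call request() until it succeeds, backing off exponentially on 429/5xx."""
    for attempt in range(max_retries + 1):
        status, body = request()
        if status == 200:
            return body
        retryable = status == 429 or 500 <= status < 600
        if not retryable or attempt == max_retries:
            raise RuntimeError(f"request failed with HTTP {status}")
        # Delay doubles each attempt: 1s, 2s, 4s, ... capped at max_delay.
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Up to 10% random jitter so parallel workers do not retry in lockstep.
        sleep(delay + random.uniform(0, delay / 10))
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.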
FAQ
Do I need a Bluesky account to use this scraper?
No. This scraper uses the public AT Protocol API, which requires no authentication. You don't need a Bluesky account, API key, or any credentials. Just provide your search terms or handles and run it.
What is the AT Protocol and why does it matter?
The AT Protocol (Authenticated Transfer Protocol) is the open, decentralized protocol that powers Bluesky. Because it's designed to be open, much of Bluesky's data is publicly accessible through standardized API endpoints -- which is what this scraper leverages. No reverse engineering or browser automation needed.
Can I scrape posts from a specific date range?
Currently, the scraper returns results in the order provided by Bluesky's search API. Date filtering is not directly supported as an input parameter, but you can filter results by the createdAt field after scraping. The search API tends to return recent and relevant posts first.
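A post-scrape date filter on `createdAt` can be as short as the sketch below. It assumes timestamps look like the ones in the sample output (ISO 8601 with a trailing `Z`) and that `start` and `end` are timezone-aware; `filter_by_date` is a hypothetical helper. Items with other timestamp formats would need extra handling.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; 'Z' is rewritten for older Python versions."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def filter_by_date(items: list[dict], start: datetime, end: datetime) -> list[dict]:
    """Keep items whose createdAt falls in the half-open range [start, end)."""
    return [item for item in items if start <= parse_timestamp(item["createdAt"]) < end]


if __name__ == "__main__":
    posts = [
        {"createdAt": "2026-03-25T14:30:00.000Z"},
        {"createdAt": "2026-04-02T09:00:00.000Z"},
    ]
    march = filter_by_date(
        posts,
        datetime(2026, 3, 1, tzinfo=timezone.utc),
        datetime(2026, 4, 1, tzinfo=timezone.utc),
    )
    print(len(march))
```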
How does this compare to using the Bluesky API directly?
This scraper wraps the raw AT Protocol API with automatic pagination, rate limit handling (with exponential backoff), retry logic for server errors, and clean data parsing. Instead of writing boilerplate code to handle cursors, HTTP errors, and data normalization, you get structured JSON output ready for analysis. It also runs on Apify's infrastructure, so you don't need to manage servers or worry about IP blocks.
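To give a sense of the boilerplate involved, here is a minimal sketch of cursor-based pagination of the kind AT Protocol list endpoints use: each response carries a `cursor` that is passed to the next request until it is absent. `paginate` and the `fetch_page(cursor, limit)` callback are hypothetical; the page size of 100 is an assumption, and the actor's own implementation may differ.

```python
from typing import Callable, Iterator, Optional

# A page is (items, next_cursor); next_cursor is None on the last page.
Page = tuple[list[dict], Optional[str]]


def paginate(
    fetch_page: Callable[[Optional[str], int], Page],
    max_results: int,
    page_size: int = 100,
) -> Iterator[dict]:
    """Yield up to max_results items, following cursors between pages."""
    cursor: Optional[str] = None
    yielded = 0
    while yielded < max_results:
        # Never ask for more than is still needed.
        limit = min(page_size, max_results - yielded)
        items, cursor = fetch_page(cursor, limit)
        for item in items[:limit]:
            yield item
            yielded += 1
        # Stop when the server reports no further pages.
        if not cursor or not items:
            break
```

In practice `fetch_page` would issue the HTTP request and unpack the response, for example reading the item list and the `cursor` field from the JSON body.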
Is web scraping Bluesky legal?
This tool accesses Bluesky's public API -- the same endpoints any developer can call. It collects only publicly available data. As with any data collection, you should comply with applicable laws (GDPR, CCPA) and Bluesky's Terms of Service. This tool is intended for legitimate use cases like research, journalism, and brand monitoring.
Built with the Apify SDK and the AT Protocol public API.