Bluesky Scraper — Posts, Profiles & Search
Pricing
Pay per usage
Extracts Bluesky posts, profiles, and search results. Returns post text, like/repost counts, author details, and embedded media links.
Developer
CryptoSignals Agent
Bluesky Scraper — Posts, Profiles, Followers & Feeds
Extract posts, profiles, threads, followers, following lists, and custom feeds from Bluesky at scale. Uses the public AT Protocol — no API key or login needed.
Free Trial Ending April 3
This actor is free while we collect feedback. Starting April 3, 2026, it moves to $4.99/month. Add a payment method at apify.com/billing to keep access.
What Can You Scrape?
| Data Type | Description |
|---|---|
| Posts | Search by keyword, hashtag, mention, author, domain, or date range |
| Profiles | Full user profiles with follower/following counts and bio |
| Followers | Complete follower list for any handle |
| Following | Complete following list for any handle |
| Threads | Full post threads with nested replies |
| Custom Feeds | Algorithmic and user-created feeds |
Features
- Advanced search filters — date range, language, author, domain, hashtags, mentions
- Sort by latest or top relevance
- Up to 50,000 items per run
- Full engagement metrics — likes, reposts, replies, quotes, bookmark count, engagement score
- Author enrichment — optional follower/following/post counts per author
- Media extraction — hasMedia flag, media count, flat media URL array
- Reply detection — isReply flag with parent and root thread URIs
- Language detection — per-post language field
- No authentication — uses the public AT Protocol API
- Robust rate limiting — 5 retries with exponential backoff, Retry-After header support
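The retry behaviour in the last bullet can be sketched roughly as follows. This is an illustration of exponential backoff that prefers a numeric `Retry-After` value, not the actor's actual code: the function names, the 1-second base delay, the 60-second cap, and the `request` callable are all assumptions made for the example.

```python
import time


def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based).

    A numeric Retry-After header wins; otherwise the delay doubles
    each attempt. HTTP-date Retry-After values are not handled here
    and fall back to the exponential schedule.
    """
    if retry_after is not None:
        try:
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # not a number of seconds, use backoff instead
    return min(cap, base * (2 ** attempt))


def call_with_retries(request, max_retries=5, sleep=time.sleep):
    """Call `request` until it succeeds or retries run out.

    `request` is a hypothetical callable returning
    (status_code, headers_dict, body).
    """
    for attempt in range(max_retries + 1):
        status, headers, body = request()
        # Retry only on rate limiting (429) and server errors (5xx).
        if status != 429 and status < 500:
            return status, body
        if attempt < max_retries:
            sleep(retry_delay(attempt, headers.get("Retry-After")))
    raise RuntimeError(f"gave up after {max_retries} retries (last status {status})")
```

Passing `sleep` as a parameter keeps the schedule easy to test without actually waiting.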
Input Parameters
| Field | Type | Default | Description |
|---|---|---|---|
| searchQuery | string | — | Keyword to search, e.g. "artificial intelligence", "#startup" |
| handles | string[] | [] | Bluesky handles to scrape, e.g. ["jay.bsky.team"] |
| scrapeType | string | "posts" | posts, profiles, both, followers, or following |
| maxItems | integer | 100 | Max results per query/handle (1–50,000) |
| sort | string | "latest" | latest or top |
| since | string | — | Start date (ISO 8601 format, e.g. "2026-03-01T00:00:00Z") |
| until | string | — | End date (ISO 8601 format) |
| lang | string | — | Language code filter, e.g. "en", "fr" |
| tag | string[] | [] | Hashtag filters |
| mentions | string | — | Filter posts mentioning this handle |
| author | string | — | Filter posts by author handle |
| domain | string | — | Filter posts linking to this domain |
| threadUris | string[] | [] | AT Protocol URIs of threads to scrape |
| feedUris | string[] | [] | Custom feed URIs to scrape |
| threadDepth | integer | 6 | Max depth when scraping threads |
| resolveAuthors | boolean | false | Fetch author profiles for follower/following/post counts |
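Before starting a run, it can help to check an input against the table above on the client side. The sketch below is a hypothetical helper (`build_run_input` is not part of the actor or of the Apify client); it only applies the documented defaults and rejects values the table rules out. Whether the actor itself validates input the same way is an assumption.

```python
ALLOWED_SCRAPE_TYPES = {"posts", "profiles", "both", "followers", "following"}
ALLOWED_SORTS = {"latest", "top"}


def build_run_input(**fields):
    """Return a run input dict with documented defaults filled in.

    Raises ValueError for values outside the ranges listed in the
    Input Parameters table.
    """
    run_input = {"scrapeType": "posts", "maxItems": 100, "sort": "latest"}
    run_input.update(fields)

    if run_input["scrapeType"] not in ALLOWED_SCRAPE_TYPES:
        raise ValueError(f"unknown scrapeType: {run_input['scrapeType']!r}")
    if run_input["sort"] not in ALLOWED_SORTS:
        raise ValueError(f"unknown sort: {run_input['sort']!r}")
    if not 1 <= run_input["maxItems"] <= 50_000:
        raise ValueError("maxItems must be between 1 and 50,000")
    return run_input


print(build_run_input(searchQuery="artificial intelligence", sort="top", lang="en"))
```

The resulting dict can be passed as `run_input` to the Apify client call shown under Code Examples.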
Example Inputs
Search posts by keyword

    {
        "searchQuery": "artificial intelligence",
        "scrapeType": "posts",
        "maxItems": 200,
        "sort": "top",
        "lang": "en"
    }

Search with author enrichment

    {
        "searchQuery": "web scraping",
        "scrapeType": "posts",
        "maxItems": 50,
        "resolveAuthors": true
    }

Scrape a user's posts and profile

    {
        "handles": ["jay.bsky.team"],
        "scrapeType": "both",
        "maxItems": 100
    }

Get followers of an account

    {
        "handles": ["bsky.app"],
        "scrapeType": "followers",
        "maxItems": 5000
    }

Output Fields — Posts
| Field | Type | Description |
|---|---|---|
| uri | string | AT Protocol URI |
| cid | string | Content identifier |
| authorHandle | string | Author's Bluesky handle |
| authorDisplayName | string | Author's display name |
| authorDid | string | Author's DID |
| authorAvatar | string | Author's avatar URL |
| authorFollowersCount | number/null | Author's follower count (requires resolveAuthors or handle scraping) |
| authorFollowingCount | number/null | Author's following count |
| authorPostsCount | number/null | Author's total post count |
| text | string | Post text content |
| createdAt | string | Post creation timestamp (ISO 8601) |
| indexedAt | string | Index timestamp |
| language | string/null | Detected post language (e.g. "en", "ja") |
| languages | string[] | All language tags on the post |
| likeCount | number | Number of likes |
| repostCount | number | Number of reposts |
| replyCount | number | Number of replies |
| quoteCount | number | Number of quotes |
| bookmarkCount | number | Number of bookmarks |
| engagementScore | number | Weighted engagement: likes + reposts×2 + replies×3 + quotes×4 |
| postUrl | string | Direct URL to the post on bsky.app |
| hashtags | string[] | Extracted hashtags |
| mentions | string[] | Mentioned DIDs |
| facetLinks | string[] | Links extracted from rich text facets |
| hasMedia | boolean | Whether post contains images or videos |
| mediaCount | number | Total number of images + videos |
| mediaUrls | string[] | Flat array of all media URLs |
| images | object[] | Detailed image data (url, thumb, alt, aspectRatio) |
| videos | object[] | Detailed video data (url, thumb, alt, aspectRatio) |
| links | object[] | Embedded link cards (url, title, description, thumb) |
| hasEmbeddedMedia | boolean | Has images, videos, or link cards |
| quotedPost | object/null | Quoted post details (uri, author, text) |
| isRepost | boolean | Whether this is a repost of another post |
| repostedBy | object/null | Who reposted (handle, displayName) |
| isReply | boolean | Whether this post is a reply |
| replyParentUri | string/null | URI of the parent post (if reply) |
| replyRootUri | string/null | URI of the thread root post (if reply) |
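The `uri` and `postUrl` fields are related: a post's AT Protocol URI has the form `at://<did>/app.bsky.feed.post/<record key>`, and bsky.app post links use `/profile/<handle or DID>/post/<record key>`. The sketch below derives one from the other. The function name is made up for illustration, and whether the actor builds `postUrl` exactly this way is an assumption.

```python
def post_url_from_uri(uri, handle=None):
    """Build a bsky.app link from a post's AT Protocol URI.

    If `handle` is given it is used in the link; otherwise the DID
    from the URI is used.
    """
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT Protocol URI: {uri!r}")
    # Expect exactly three parts: authority, collection, record key.
    authority, collection, rkey = uri[len(prefix):].split("/")
    if collection != "app.bsky.feed.post":
        raise ValueError(f"not a post record: {collection!r}")
    return f"https://bsky.app/profile/{handle or authority}/post/{rkey}"


# Placeholder identifiers, not real records.
print(post_url_from_uri("at://did:plc:abc123/app.bsky.feed.post/3kxyz", handle="alice.bsky.social"))
```

The same parsing is handy for turning `replyParentUri` and `replyRootUri` values into clickable links.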
Example Post Output

    {
        "uri": "at://did:plc:abc.../app.bsky.feed.post/xyz...",
        "cid": "bafyrei...",
        "authorHandle": "alice.bsky.social",
        "authorDisplayName": "Alice",
        "authorDid": "did:plc:abc...",
        "authorAvatar": "https://cdn.bsky.app/...",
        "authorFollowersCount": 5200,
        "authorFollowingCount": 340,
        "authorPostsCount": 1820,
        "text": "Just shipped our new feature and the response has been incredible...",
        "createdAt": "2026-03-25T10:30:00.000Z",
        "indexedAt": "2026-03-25T10:30:01.000Z",
        "language": "en",
        "languages": ["en"],
        "likeCount": 45,
        "repostCount": 12,
        "replyCount": 8,
        "quoteCount": 3,
        "bookmarkCount": 2,
        "engagementScore": 105,
        "postUrl": "https://bsky.app/profile/alice.bsky.social/post/xyz...",
        "hashtags": ["buildinpublic", "startup"],
        "mentions": [],
        "facetLinks": [],
        "hasMedia": true,
        "mediaCount": 1,
        "mediaUrls": ["https://cdn.bsky.app/img/..."],
        "images": [
            {"url": "https://cdn.bsky.app/img/...", "thumb": "...", "alt": "Screenshot", "aspectRatio": null}
        ],
        "videos": [],
        "links": [],
        "hasEmbeddedMedia": true,
        "quotedPost": null,
        "isRepost": false,
        "repostedBy": null,
        "isReply": false,
        "replyParentUri": null,
        "replyRootUri": null
    }

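The `engagementScore` in the example output can be recomputed from the four counts using the documented weights (likes + reposts×2 + replies×3 + quotes×4). A minimal sketch, with a helper name chosen for illustration:

```python
def engagement_score(post):
    """Weighted engagement as documented: likes + 2*reposts + 3*replies + 4*quotes."""
    return (
        post.get("likeCount", 0)
        + 2 * post.get("repostCount", 0)
        + 3 * post.get("replyCount", 0)
        + 4 * post.get("quoteCount", 0)
    )


# Counts copied from the example post output.
example = {"likeCount": 45, "repostCount": 12, "replyCount": 8, "quoteCount": 3, "bookmarkCount": 2}
print(engagement_score(example))  # 45 + 24 + 24 + 12 = 105
```

Note that `bookmarkCount` does not enter the documented formula, so it is ignored here.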
Output Fields — Profiles
| Field | Type | Description |
|---|---|---|
| did | string | Decentralized identifier |
| handle | string | Bluesky handle |
| displayName | string | Display name |
| description | string | Bio text |
| avatar | string | Avatar image URL |
| banner | string | Banner image URL |
| followersCount | number | Number of followers |
| followsCount | number | Number of accounts followed |
| postsCount | number | Total number of posts |
| createdAt | string | Account creation date |
| indexedAt | string | Last index date |
| profileUrl | string | Direct URL to profile on bsky.app |
| labels | string[] | Content labels |
| associated | object | Associated data (lists, feeds, etc.) |
| pinnedPost | string/null | URI of pinned post |
Code Examples
Python

    from apify_client import ApifyClient

    client = ApifyClient("YOUR_API_TOKEN")

    # Start the actor and wait for the run to finish
    run = client.actor("cryptosignals/bluesky-scraper").call(run_input={
        "searchQuery": "AI tools",
        "scrapeType": "posts",
        "maxItems": 100,
        "sort": "top",
        "lang": "en",
        "resolveAuthors": True,
    })

    # Read the results from the run's default dataset
    for item in client.dataset(run["defaultDatasetId"]).iterate_items():
        print(
            f"@{item['authorHandle']} ({item.get('authorFollowersCount', '?')} followers): "
            f"{item['text'][:80]}... "
            f"(engagement: {item['engagementScore']})"
        )

JavaScript

    import { ApifyClient } from 'apify-client';

    const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

    // Start the actor and wait for the run to finish
    const run = await client.actor('cryptosignals/bluesky-scraper').call({
        searchQuery: 'AI tools',
        scrapeType: 'posts',
        maxItems: 100,
        sort: 'top',
        lang: 'en',
        resolveAuthors: true,
    });

    // Read the results from the run's default dataset
    const { items } = await client.dataset(run.defaultDatasetId).listItems();

    items.forEach(item => {
        console.log(`@${item.authorHandle} (${item.authorFollowersCount ?? '?'} followers): ${item.text.slice(0, 80)}... (engagement: ${item.engagementScore})`);
    });

cURL

    # Start the run
    curl -X POST "https://api.apify.com/v2/acts/cryptosignals~bluesky-scraper/runs?token=YOUR_API_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"searchQuery": "AI", "scrapeType": "posts", "maxItems": 50}'

    # Get results (replace DATASET_ID)
    curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"

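The cURL example leaves `DATASET_ID` to be filled in by hand. The response to the run request normally carries the ID as `data.defaultDatasetId`, so the second URL can be derived from the first response. The sketch below does only that string work; the helper name is hypothetical, and the sample response is a trimmed, made-up payload rather than real API output. The run has to finish before the dataset is complete, which this sketch does not check.

```python
import json
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(run_response_text, token, fmt="json"):
    """Build the dataset items URL from the JSON body returned when a run starts."""
    run = json.loads(run_response_text)["data"]
    query = urlencode({"token": token, "format": fmt})
    return f"{API_BASE}/datasets/{run['defaultDatasetId']}/items?{query}"


# Trimmed, illustrative response body (IDs are placeholders).
sample = '{"data": {"id": "run1", "status": "READY", "defaultDatasetId": "ds123"}}'
print(dataset_items_url(sample, "YOUR_API_TOKEN"))
```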
Use Cases
- Social listening — monitor brand mentions and sentiment on Bluesky in real time
- Audience research — analyze follower demographics and growth for any account
- Content strategy — find trending topics and high-engagement post patterns
- Lead generation — discover people discussing topics relevant to your business
- Academic research — collect public social media data at scale for analysis
- Competitor monitoring — track what competitors post and how their audience reacts
- Influencer discovery — find accounts with high engagement in specific niches
FAQ
Do I need a Bluesky account? No. The scraper uses the public AT Protocol API. No login or API key required.
How many results can I get? Up to 50,000 per run. For larger datasets, run multiple searches with different parameters.
What is the engagement score? A weighted metric: likes + reposts×2 + replies×3 + quotes×4. Higher scores indicate more engaging posts. Use it to quickly sort and filter high-value content.
What does resolveAuthors do? When enabled, the scraper fetches full profiles for each unique author, adding follower count, following count, and post count to every post item. This makes extra API calls, so it's opt-in.
Can I get private/blocked content? No. Only publicly available data is scraped.
How fresh is the data? Real-time. Each run fetches live data from the AT Protocol API.
What export formats are supported? JSON, CSV, Excel, XML, HTML. Connect via API, webhooks, Zapier, Make, or Google Sheets.
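For datasets beyond the 50,000-item limit, the FAQ suggests running multiple searches with different parameters. One way to do that is to split a date range into consecutive `since`/`until` windows and merge the results. The sketch below assumes one run per window and de-duplicates by `uri`, because whether the actor treats the `until` boundary as inclusive or exclusive is not documented here. All helper names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def iso_utc(moment):
    """Format a timezone-aware datetime like the `since` example in the input table."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def date_windows(start, end, days=1):
    """Yield consecutive {"since", "until"} dicts covering [start, end)."""
    step = timedelta(days=days)
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        yield {"since": iso_utc(cursor), "until": iso_utc(upper)}
        cursor = upper


def dedupe_by_uri(items):
    """Drop posts that show up in more than one window, keeping first-seen order."""
    seen = set()
    unique = []
    for item in items:
        if item["uri"] not in seen:
            seen.add(item["uri"])
            unique.append(item)
    return unique


start = datetime(2026, 3, 1, tzinfo=timezone.utc)
end = datetime(2026, 3, 3, 12, tzinfo=timezone.utc)
for window in date_windows(start, end):
    # Each window would be merged into a run input, e.g. {"searchQuery": "AI", **window}
    print(window)
```

Shorter windows mean more runs but make it less likely that a single window hits the per-run limit.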
Integrations
- REST API — trigger runs and fetch results programmatically
- Webhooks — get notified when scraping completes
- Zapier / Make — connect to 5,000+ apps
- Google Sheets — export directly to spreadsheets
- Slack / Email — set up alerts for new posts matching your keywords
Related Scrapers
- Substack Scraper — Newsletter posts and subscriber data
- Hacker News Scraper — Stories, comments and jobs
- Reddit Scraper — Posts, comments and subreddits
See all scrapers by CryptoSignals
Using proxies
Bluesky's AT Protocol API enforces per-IP rate limits that trigger HTTP 429 responses during bulk data collection. Once rate-limited, your IP can be throttled for extended periods. Residential proxies distribute requests across real ISP addresses, keeping each IP well under the rate threshold. ThorData provides 200M+ residential IPs that work well for sustained AT Protocol scraping without hitting rate walls.