Bluesky Omni Scraper
Pricing
Pay per usage
Extract posts, profiles, threads and followers from Bluesky via the official AT Protocol API. Search by keyword or hashtag, scrape author feeds, full threads and follower lists. No browser, no login. Export to JSON, CSV or Excel.
Developer
S. Klein
Maintained by Community
Bluesky Scraper – Posts, Profiles, Threads & More
Extract posts, profiles, threads, followers, and custom feeds from Bluesky — the fast-growing decentralized social network built on the AT Protocol. No account required. No browser. No proxies. Pure API.
Try it now: click Try for free above.
Why Bluesky Scraper?
Most Bluesky scrapers handle one use case. This Actor handles seven — and does it faster, cheaper, and more accurately than browser-based alternatives.
| # | What makes it better | Why it matters |
|---|---|---|
| 1 | 7 scrape modes in one actor | Search, author feed, thread, profile, followers, following, and custom feed. Competitors often need separate actors per mode. |
| 2 | Direct AT Protocol API | No headless browser, no proxies, no fragile HTML parsing. Pure API calls = faster runs, lower cost, higher reliability. |
| 3 | Engagement rate auto-calculated | In authorFeed mode, follower count is pre-fetched once and engagementRate = (likes + reposts + quotes) / followers is computed per post. No spreadsheet formulas needed. |
| 4 | Structured hashtags and mentions via facets | Tags and mentions are extracted from AT Protocol facets — machine-readable structured data, not regex on raw text. Always accurate, even for posts with special characters or non-Latin scripts. |
| 5 | Custom feed generator support | Scrape any public Bluesky algorithm (What's Hot, topic feeds, community feeds) by URI. Unique to Bluesky's open protocol. |
| 6 | No login required | 100% public API. No credentials, no session tokens, no risk of account bans. |
| 7 | Stable AT URIs as post IDs | Every post gets its postId as a permanent AT URI (e.g. at://did:plc:.../...). These decentralized identifiers never break when users change their handle. |
| 8 | Smart rate-limit handling | Automatically honors Retry-After headers from the API. No wasted retries, no failed runs. |
| 9 | Full media extraction | Image URLs, alt texts, and video thumbnails are extracted and indexed per post. |
| 10 | MCP-ready | Works out of the box with Claude Desktop via Apify's MCP server. Run natural-language Bluesky queries without writing a single line of code. |
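Row 4 mentions facets. As a rough illustration of what facet-based extraction can look like, the sketch below reads hashtags and mentioned DIDs from a post record. It assumes the record follows the public `app.bsky.richtext.facet` lexicon; the helper name `extract_tags_and_mentions` and the sample record are made up for this example and are not taken from the Actor's source.

```python
from typing import Any

TAG_TYPE = "app.bsky.richtext.facet#tag"
MENTION_TYPE = "app.bsky.richtext.facet#mention"


def extract_tags_and_mentions(record: dict[str, Any]) -> tuple[list[str], list[str]]:
    """Collect hashtags and mentioned DIDs from a post record's facets."""
    tags: list[str] = []
    dids: list[str] = []
    for facet in record.get("facets") or []:
        for feature in facet.get("features") or []:
            kind = feature.get("$type")
            if kind == TAG_TYPE and "tag" in feature:
                tags.append(feature["tag"])
            elif kind == MENTION_TYPE and "did" in feature:
                dids.append(feature["did"])
    return tags, dids


# Illustrative record (hypothetical data, not a real post).
sample_record = {
    "text": "Talking #AI with @alice.bsky.social",
    "facets": [
        {"index": {"byteStart": 8, "byteEnd": 11},
         "features": [{"$type": TAG_TYPE, "tag": "AI"}]},
        {"index": {"byteStart": 17, "byteEnd": 35},
         "features": [{"$type": MENTION_TYPE, "did": "did:plc:abc123"}]},
    ],
}

tags, mentioned_dids = extract_tags_and_mentions(sample_record)
```

Because the tag and DID values come from the structured `features` entries rather than from the text, the result does not depend on how the hashtag is spelled or encoded in the post body.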
What does Bluesky Scraper do?
The Actor connects directly to the AT Protocol public AppView API (api.bsky.app) and supports seven distinct scrape modes:
| Mode | What it collects |
|---|---|
| `searchPosts` | Posts matching a keyword, phrase, or hashtag, with optional date range and language filters |
| `authorFeed` | All posts (and optionally reposts/replies) from one or more Bluesky accounts |
| `postThread` | A full post thread including all nested replies |
| `profile` | Detailed profile stats for one or more accounts |
| `followers` | The complete follower list of an account |
| `follows` | The complete following list of an account |
| `customFeed` | Any public Bluesky feed generator, identified by its AT URI |
All modes support automatic cursor-based pagination to collect results beyond the first page.
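Cursor-based pagination can be sketched as a small generator. This is only an illustration of the general pattern, not the Actor's implementation: `fetch_page` is a hypothetical callable that takes a cursor and returns one page, and the page shape (a list of items plus an optional `cursor`) is an assumption modeled on AT Protocol list endpoints. The fake fetcher stands in for real API calls.

```python
from typing import Any, Callable, Iterator, Optional

Page = dict[str, Any]


def paginate(fetch_page: Callable[[Optional[str]], Page],
             items_key: str,
             max_items: int = 100) -> Iterator[Any]:
    """Yield items page by page until the cursor runs out or max_items is hit.

    max_items == 0 means "no limit", mirroring the Actor's maxItems input.
    """
    cursor: Optional[str] = None
    yielded = 0
    while True:
        page = fetch_page(cursor)
        for item in page.get(items_key, []):
            if max_items and yielded >= max_items:
                return
            yield item
            yielded += 1
        if max_items and yielded >= max_items:
            return  # limit reached exactly at a page boundary
        cursor = page.get("cursor")
        if not cursor:  # no cursor means this was the last page
            return


def fake_fetch(cursor: Optional[str]) -> Page:
    """Hypothetical stand-in for an API call: serves 8 items, 3 per page."""
    start = int(cursor or 0)
    posts = list(range(start, min(start + 3, 8)))
    next_cursor = str(start + 3) if start + 3 < 8 else None
    return {"posts": posts, "cursor": next_cursor}


first_five = list(paginate(fake_fetch, "posts", max_items=5))
everything = list(paginate(fake_fetch, "posts", max_items=0))
```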
Common use cases:
- Brand monitoring — track mentions of your product, company, or topic in real time
- Research and journalism — snapshot public conversations with date-range filters
- Influencer discovery — surface high-engagement accounts via follower and engagement metrics
- Academic datasets — build labeled corpora for NLP, sentiment analysis, or network studies
- Competitor intelligence — monitor what your industry is discussing on Bluesky
- Content curation — find top posts on any hashtag ranked by engagement
How to use Bluesky Scraper
- Click Try for free to open the Actor input form
- Select a Scrape mode from the dropdown (e.g. `searchPosts`)
- Enter your search query (e.g. `#AI` or `climate change`) or one or more Bluesky handles
- Set Max items to control how many results to collect
- Optionally add date filters, a language filter, or a custom feed URI
- Click Start and find your structured results in the Output tab, downloadable as JSON, CSV, or Excel
Input
All fields are configured in the Input tab or via JSON.
| Field | Required | Description |
|---|---|---|
| `mode` | Yes | One of: `searchPosts`, `authorFeed`, `postThread`, `profile`, `followers`, `follows`, `customFeed` |
| `searchQuery` | For `searchPosts` | Keyword, phrase, or hashtag, e.g. `#AI` or `"machine learning"` |
| `handles` | For `authorFeed`, `profile`, `followers`, `follows` | One or more Bluesky handles, e.g. `["user.bsky.social"]` |
| `postUrl` | For `postThread` | Full post URL or AT URI |
| `feedUri` | For `customFeed` | AT URI of the feed generator, e.g. `at://did:plc:z72i7hdynmchkltzmefcsowb/app.bsky.feed.generator/whats-hot` |
| `maxItems` | No | Max items to collect (default: 100, 0 = unlimited; UI maximum: 10,000) |
| `searchSort` | No | `latest` or `top` (default: `latest`) |
| `searchSince` | No | Start date filter, e.g. `2024-01-01` |
| `searchUntil` | No | End date filter, e.g. `2024-12-31` |
| `searchLang` | No | ISO 639-1 language code, e.g. `en`, `de`, `ja` |
| `includeReplies` | No | Include replies in author feed (default: `false`) |
| `includeReposts` | No | Include reposts in author feed (default: `false`) |
Example — hashtag search:
```json
{
  "mode": "searchPosts",
  "searchQuery": "#AI",
  "searchSort": "latest",
  "searchLang": "en",
  "maxItems": 500
}
```
Example — custom feed:
```json
{
  "mode": "customFeed",
  "feedUri": "at://did:plc:z72i7hdynmchkltzmefcsowb/app.bsky.feed.generator/whats-hot",
  "maxItems": 100
}
```
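Before starting a run from a script, it can help to check that the input carries the field its mode requires. The sketch below encodes the "Required" column of the input table; the function name `check_input` and its error messages are illustrative, and the Actor's own validation may differ.

```python
# Which input field each mode needs, taken from the input table above.
REQUIRED_BY_MODE = {
    "searchPosts": "searchQuery",
    "authorFeed": "handles",
    "profile": "handles",
    "followers": "handles",
    "follows": "handles",
    "postThread": "postUrl",
    "customFeed": "feedUri",
}


def check_input(actor_input: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks usable."""
    mode = actor_input.get("mode")
    if mode not in REQUIRED_BY_MODE:
        return [f"unknown mode: {mode!r}"]
    problems = []
    field = REQUIRED_BY_MODE[mode]
    if not actor_input.get(field):
        problems.append(f"mode {mode!r} requires {field!r}")
    max_items = actor_input.get("maxItems", 100)  # 100 is the documented default
    if not isinstance(max_items, int) or max_items < 0:
        problems.append("maxItems must be a non-negative integer")
    return problems


ok = check_input({"mode": "searchPosts", "searchQuery": "#AI", "maxItems": 500})
missing = check_input({"mode": "customFeed"})
```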
Output
Each item is pushed to the default Apify dataset. Download results as JSON, CSV, HTML, or Excel from the Storage tab.
Example post output item:
```json
{
  "postId": "at://did:plc:abc123/app.bsky.feed.post/3kwikpostabcde",
  "url": "https://bsky.app/profile/alice.bsky.social/post/3kwikpostabcde",
  "text": "Really enjoying the discourse around #AI safety this week.",
  "authorHandle": "alice.bsky.social",
  "authorDid": "did:plc:abc123",
  "authorDisplayName": "Alice",
  "authorFollowerCount": 3200,
  "authorFollowingCount": 410,
  "authorPostCount": 812,
  "authorAvatarUrl": "https://cdn.bsky.app/img/avatar/plain/did:plc:abc123/bafkreiabcdef@jpeg",
  "likeCount": 47,
  "repostCount": 12,
  "replyCount": 8,
  "quoteCount": 3,
  "engagementRate": 0.0194,
  "createdAt": "2024-11-15T14:22:30.000Z",
  "indexedAt": "2024-11-15T14:22:31.500Z",
  "lang": ["en"],
  "hasMedia": false,
  "mediaUrls": null,
  "mediaAltTexts": null,
  "externalUrl": null,
  "externalTitle": null,
  "inReplyToUri": null,
  "inReplyToUrl": null,
  "inReplyToHandle": null,
  "isRepost": false,
  "repostOf": null,
  "labels": null,
  "tags": ["AI"],
  "mentionedDids": [],
  "scrapedMode": "authorFeed"
}
```
Example profile output item (modes: profile, followers, follows):
```json
{
  "did": "did:plc:abc123",
  "handle": "alice.bsky.social",
  "displayName": "Alice",
  "description": "Researcher. Bluesky enthusiast.",
  "followerCount": 3200,
  "followingCount": 410,
  "postCount": 812,
  "avatarUrl": "https://cdn.bsky.app/img/avatar/plain/did:plc:abc123/bafkreiabcdef@jpeg",
  "bannerUrl": null,
  "createdAt": "2023-05-01T10:00:00.000Z",
  "indexedAt": "2024-01-15T08:30:00.000Z",
  "labels": [],
  "scrapedMode": "profile"
}
```
Data table
Post fields (modes: searchPosts, authorFeed, postThread, customFeed)
| Field | Type | Description |
|---|---|---|
| `postId` | `string` | Permanent AT URI for the post (e.g. `at://did:plc:.../app.bsky.feed.post/...`) |
| `url` | `string` | Public bsky.app URL |
| `text` | `string \| null` | Full post text |
| `authorHandle` | `string` | Author's Bluesky handle |
| `authorDid` | `string` | Author's decentralized identifier |
| `authorDisplayName` | `string \| null` | Author's display name |
| `authorFollowerCount` | `integer \| null` | Follower count at scrape time. `null` in `searchPosts` mode: the search API returns minimal author objects without stats. |
| `authorFollowingCount` | `integer \| null` | Following count. `null` in `searchPosts` mode (same reason). |
| `authorPostCount` | `integer \| null` | Total post count. `null` in `searchPosts` mode (same reason). |
| `authorAvatarUrl` | `string \| null` | Avatar image URL. `null` in `searchPosts` mode (same reason). |
| `likeCount` | `integer` | Number of likes |
| `repostCount` | `integer` | Number of reposts |
| `replyCount` | `integer` | Number of replies |
| `quoteCount` | `integer` | Number of quote posts |
| `engagementRate` | `float \| null` | `(likes + reposts + quotes) / authorFollowerCount`. Populated in `authorFeed` mode only; `null` in other modes. |
| `createdAt` | `string \| null` | ISO 8601 creation timestamp |
| `indexedAt` | `string \| null` | ISO 8601 timestamp when the post was indexed by Bluesky |
| `lang` | `string[] \| null` | Detected language codes |
| `hasMedia` | `boolean` | `true` if the post contains images or video |
| `mediaUrls` | `string[] \| null` | Image or video thumbnail URLs |
| `mediaAltTexts` | `(string \| null)[] \| null` | Alt text per image in the same order as `mediaUrls`; individual entries may be `null` if no alt text was provided |
| `externalUrl` | `string \| null` | Embedded link card URL |
| `externalTitle` | `string \| null` | Embedded link card title |
| `inReplyToUri` | `string \| null` | Parent post AT URI if this is a reply |
| `inReplyToUrl` | `string \| null` | Parent post public URL if this is a reply |
| `inReplyToHandle` | `string \| null` | Parent post author handle. `null` in `searchPosts` mode (parent author not embedded in search results). |
| `isRepost` | `boolean` | `true` if this item is a repost |
| `repostOf` | `string \| null` | AT URI of the original post if `isRepost` is `true`; otherwise `null` |
| `labels` | `string[] \| null` | Moderation labels applied to the post |
| `tags` | `string[]` | Hashtags extracted from AT Protocol facets (empty array if none) |
| `mentionedDids` | `string[]` | DIDs of accounts mentioned in the post, extracted from facets (empty array if none) |
| `scrapedMode` | `string` | The Actor mode that produced this item |
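The `engagementRate` formula from the table can be reproduced on exported data, for example to fill in the value after joining search results with profile stats. The sketch below is a minimal version; returning `None` for a missing or zero follower count is an assumption made here, and any rounding the Actor applies to its own output is not reproduced.

```python
from typing import Optional


def engagement_rate(likes: int, reposts: int, quotes: int,
                    followers: Optional[int]) -> Optional[float]:
    """(likes + reposts + quotes) / followers, or None if followers is missing or 0."""
    if not followers:
        return None
    return (likes + reposts + quotes) / followers


# Counts taken from the example post item: 47 likes, 12 reposts, 3 quotes, 3200 followers.
rate = engagement_rate(likes=47, reposts=12, quotes=3, followers=3200)  # 62 / 3200

# searchPosts items carry no follower count, so no rate can be computed for them.
no_rate = engagement_rate(likes=47, reposts=12, quotes=3, followers=None)
```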
Profile fields (modes: profile, followers, follows)
| Field | Type | Description |
|---|---|---|
| `did` | `string` | Decentralized identifier of the account |
| `handle` | `string` | Bluesky handle (e.g. `user.bsky.social`) |
| `displayName` | `string \| null` | Account display name |
| `description` | `string \| null` | Bio / profile description |
| `followerCount` | `integer \| null` | Number of followers |
| `followingCount` | `integer \| null` | Number of accounts followed |
| `postCount` | `integer \| null` | Total post count |
| `avatarUrl` | `string \| null` | Profile picture URL |
| `bannerUrl` | `string \| null` | Banner image URL |
| `createdAt` | `string \| null` | Account creation timestamp |
| `indexedAt` | `string \| null` | Timestamp when indexed by Bluesky |
| `labels` | `string[] \| null` | Moderation labels applied to the account |
| `scrapedMode` | `string` | The Actor mode that produced this item |
Pricing / Cost estimation
This Actor uses Apify's pay-per-event pricing. Typical costs:
- Small run (100 posts): ~$0.01–$0.05
- Medium run (1,000 posts): ~$0.10–$0.30
- Large run (10,000 posts): ~$1.00–$2.00
New Apify accounts receive a free tier with monthly compute units included. Most small research tasks run entirely for free.
Use with Claude Desktop / Cursor / VS Code (via Apify MCP)
Query Bluesky interactively from Claude Desktop without writing any code — ideal for ad-hoc research and conversational analysis. This Actor is available as an MCP tool through Apify's hosted MCP server, so everything runs in the cloud. No local setup required.
What you need
- An Apify account (free tier available)
- Your Apify API token (found under Settings → Integrations)
Setup (one-time)
Add the following to your claude_desktop_config.json:
```json
{
  "mcpServers": {
    "apify": {
      "command": "npx",
      "args": ["-y", "@apify/actors-mcp-server"],
      "env": {
        "APIFY_TOKEN": "your_apify_api_token_here"
      }
    }
  }
}
```
Config file location:
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
Restart Claude Desktop — the Bluesky Scraper will appear as an available tool in the chat (hammer icon). When prompting Claude, mention the Actor by name so it picks the right tool.
Example prompts
"Search for the 20 most recent posts about #climatepolicy in English and summarize the main arguments."
"Show me the last 10 posts from @bsky.app and tell me what topics they cover."
"Fetch the full reply thread for this post: https://bsky.app/profile/user.bsky.social/post/3abc123"
"Find the top 50 followers of @atproto.com and list their display names and follower counts."
"Scrape the What's Hot feed and tell me the top 5 trending topics right now."
Tips and advanced options
- Date range filtering (`searchSince` / `searchUntil`) narrows search results significantly and reduces unnecessary API calls for historical research.
- Language filtering (`searchLang`) is especially useful for multilingual topics like `#AI`; use `en` for English-only results.
- Author feed mode with `includeReplies: false` (default) produces cleaner top-level post datasets without conversation noise.
- Thread mode is ideal for journalists and researchers who need the full reply context for a specific post.
- Engagement rate (`engagementRate`) is only populated in `authorFeed` mode because follower count requires a separate profile lookup; this is done once per handle and cached for the entire run.
- Custom feed URIs can be found in the Bluesky app: open any feed, tap the three-dot menu, and copy the link. The share URL typically has the form `https://bsky.app/profile/<handle or DID>/feed/<feed name>`; the matching AT URI is `at://<DID>/app.bsky.feed.generator/<feed name>`.
- Set `maxItems: 0` for unlimited scraping; use with care on large accounts or popular hashtags.
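For the custom feed tip, a copied share link can be turned into a `feedUri` with a few lines of code. This sketch assumes the share URL has the shape `https://bsky.app/profile/<DID>/feed/<feed name>`; the function name `feed_url_to_at_uri` is hypothetical. When the profile segment is a handle instead of a DID, the handle has to be resolved to a DID first, which is outside the scope of this example.

```python
from urllib.parse import urlparse


def feed_url_to_at_uri(share_url: str) -> str:
    """Convert a bsky.app feed share URL with a DID into a feed generator AT URI."""
    parts = urlparse(share_url).path.strip("/").split("/")
    if len(parts) != 4 or parts[0] != "profile" or parts[2] != "feed":
        raise ValueError(f"not a feed share URL: {share_url}")
    actor, feed_name = parts[1], parts[3]
    if not actor.startswith("did:"):
        # Handles are not valid here; they must be resolved to a DID beforehand.
        raise ValueError("profile segment is a handle; resolve it to a DID first")
    return f"at://{actor}/app.bsky.feed.generator/{feed_name}"


feed_uri = feed_url_to_at_uri(
    "https://bsky.app/profile/did:plc:z72i7hdynmchkltzmefcsowb/feed/whats-hot"
)
```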
FAQ, disclaimers, and support
Is scraping Bluesky legal? This Actor accesses only the public AT Protocol API — the same API used by official Bluesky clients. All data collected is publicly available. You remain responsible for complying with Bluesky's Terms of Service and applicable data protection regulations (GDPR, CCPA, etc.) in your jurisdiction.
Can I scrape private accounts or DMs? No. This Actor only accesses public data via the public AppView API. Private accounts and direct messages are not accessible.
Why are some fields null in searchPosts mode?
Bluesky's search API returns minimal author objects that do not include follower/following/post counts or avatar URLs. Fields like authorFollowerCount, authorFollowingCount, authorPostCount, and authorAvatarUrl are null for search results. Use authorFeed mode if you need full author stats.
What rate limits apply?
The AT Protocol AppView API has generous public rate limits. The Actor handles rate-limit responses (HTTP 429) automatically by honoring the Retry-After header — no run failures due to throttling.
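Honoring `Retry-After` on HTTP 429 can be sketched as follows. This is not the Actor's code: `send` is a hypothetical callable returning `(status, headers, body)`, the one-second fallback and five-attempt cap are arbitrary choices, and the header lookup assumes the exact key `Retry-After` (real HTTP clients usually expose case-insensitive headers).

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Callable, Mapping, Optional, Tuple


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0,
                        now: Optional[datetime] = None) -> float:
    """Read Retry-After as delta-seconds or an HTTP date; fall back to default."""
    value = headers.get("Retry-After")
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default  # unparseable header value
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def call_with_retry(send: Callable[[], Tuple[int, Mapping[str, str], object]],
                    max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep):
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            break
        if attempt < max_attempts - 1:
            sleep(retry_after_seconds(headers))
    return status, body


# Simulated exchange: one throttled response, then success (no real network or sleeping).
responses = iter([(429, {"Retry-After": "2"}, None), (200, {}, "ok")])
waits: list = []
result = call_with_retry(lambda: next(responses), sleep=waits.append)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays.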
I need a custom feature or bulk data pipeline. Open an issue on the Issues tab or contact us via the Apify platform — custom solutions are available.
Known limitations
- Search result counts may be approximate; the AT Protocol API does not guarantee exact pagination totals
- The `searchUntil` / `searchSince` date parameters may occasionally return slightly out-of-range results; this is known AT Protocol behavior
- Video content is supported (thumbnail URL extracted) but full video file download is not included
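If strict date boundaries matter, out-of-range items can be dropped after export by filtering on `createdAt`. The sketch below is one way to do that; it treats both dates as inclusive UTC calendar days, which is an assumption and may not match how the search API interprets `searchSince` and `searchUntil`. The helper names are illustrative.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2024-11-15T14:22:30.000Z."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assume UTC if no offset given
    return parsed


def within_range(items: list, since: str, until: str) -> list:
    """Keep items whose createdAt lies between since and until (inclusive days, UTC)."""
    start = datetime.fromisoformat(since).replace(tzinfo=timezone.utc)
    end = datetime.fromisoformat(until).replace(tzinfo=timezone.utc) + timedelta(days=1)
    kept = []
    for item in items:
        created = item.get("createdAt")
        if created and start <= parse_ts(created) < end:
            kept.append(item)
    return kept


# Hypothetical exported items; only the second falls inside 2024.
items = [
    {"createdAt": "2023-12-31T23:59:59.000Z"},
    {"createdAt": "2024-11-15T14:22:30.000Z"},
    {"createdAt": None},
]
filtered = within_range(items, "2024-01-01", "2024-12-31")
```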