Bluesky Omni Scraper

Pricing

Pay per usage

Extract posts, profiles, threads and followers from Bluesky via the official AT Protocol API. Search by keyword or hashtag, scrape author feeds, full threads and follower lists. No browser, no login. Export to JSON, CSV or Excel.

Developer

S. Klein

Maintained by Community

Bluesky Scraper – Posts, Profiles, Threads & More

Extract posts, profiles, threads, followers, and custom feeds from Bluesky — the fast-growing decentralized social network built on the AT Protocol. No account required. No browser. No proxies. Pure API.

Try it now: click Try for free above.


Why Bluesky Scraper?

Most Bluesky scrapers handle one use case. This Actor handles seven — and does it faster, cheaper, and more accurately than browser-based alternatives.

#   What makes it better                         Why it matters
1   7 scrape modes in one actor                  Search, author feed, thread, profile, followers, following, and custom feed. Competitors often need separate actors per mode.
2   Direct AT Protocol API                       No headless browser, no proxies, no fragile HTML parsing. Pure API calls = faster runs, lower cost, higher reliability.
3   Engagement rate auto-calculated              In authorFeed mode, follower count is pre-fetched once and engagementRate = (likes + reposts + quotes) / followers is computed per post. No spreadsheet formulas needed.
4   Structured hashtags and mentions via facets  Tags and mentions are extracted from AT Protocol facets: machine-readable structured data, not regex on raw text. Always accurate, even for posts with special characters or non-Latin scripts.
5   Custom feed generator support                Scrape any public Bluesky algorithm (What's Hot, topic feeds, community feeds) by URI. Unique to Bluesky's open protocol.
6   No login required                            100% public API. No credentials, no session tokens, no risk of account bans.
7   Stable AT URIs as post IDs                   Every post gets its postId as a permanent AT URI (e.g. at://did:plc:.../...). These decentralized identifiers never break when users change their handle.
8   Smart rate-limit handling                    Automatically honors Retry-After headers from the API. No wasted retries, no failed runs.
9   Full media extraction                        Image URLs, alt texts, and video thumbnails are extracted and indexed per post.
10  MCP-ready                                    Works out of the box with Claude Desktop via Apify's MCP server. Run natural-language Bluesky queries without writing a single line of code.
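
To illustrate point 4, here is a minimal sketch of facet-based extraction. It is not the Actor's actual code; the function name is hypothetical, and the record shape follows the public app.bsky.richtext.facet lexicon (each facet carries a byte range and a list of typed features).

```python
from typing import Any

# Feature type identifiers from the app.bsky.richtext.facet lexicon.
TAG_TYPE = "app.bsky.richtext.facet#tag"
MENTION_TYPE = "app.bsky.richtext.facet#mention"


def extract_tags_and_mentions(record: dict[str, Any]) -> tuple[list[str], list[str]]:
    """Collect hashtags and mentioned DIDs from a post record's facets.

    Hypothetical helper: duplicates are dropped and first-seen order is kept.
    """
    tags: list[str] = []
    mentioned_dids: list[str] = []
    for facet in record.get("facets") or []:
        for feature in facet.get("features") or []:
            kind = feature.get("$type")
            if kind == TAG_TYPE:
                tag = feature.get("tag")
                if tag and tag not in tags:
                    tags.append(tag)
            elif kind == MENTION_TYPE:
                did = feature.get("did")
                if did and did not in mentioned_dids:
                    mentioned_dids.append(did)
    return tags, mentioned_dids


# Sample record; the byte range 37-40 covers "#AI" in the text.
record = {
    "text": "Really enjoying the discourse around #AI safety this week.",
    "facets": [
        {
            "index": {"byteStart": 37, "byteEnd": 40},
            "features": [{"$type": TAG_TYPE, "tag": "AI"}],
        }
    ],
}

print(extract_tags_and_mentions(record))  # (['AI'], [])
```

Because the tag value is stored separately from the text, no pattern matching on the post body is needed.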

What does Bluesky Scraper do?

The Actor connects directly to the AT Protocol public AppView API (api.bsky.app) and supports seven distinct scrape modes:

Mode         What it collects
searchPosts  Posts matching a keyword, phrase, or hashtag, with optional date range and language filters
authorFeed   All posts (and optionally reposts/replies) from one or more Bluesky accounts
postThread   A full post thread including all nested replies
profile      Detailed profile stats for one or more accounts
followers    The complete follower list of an account
follows      The complete following list of an account
customFeed   Any public Bluesky feed generator, identified by its AT URI

All modes support automatic cursor-based pagination to collect results beyond the first page.
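
As a rough illustration of cursor-based pagination, the sketch below keeps requesting pages until the response no longer carries a cursor or the item limit is reached. It is an assumption-laden sketch rather than the Actor's implementation: `paginate`, `fetch_page`, and `fake_fetch` are hypothetical names, and the page fetcher is injected so the loop can be tried without network access.

```python
from typing import Any, Callable, Iterator, Optional

Page = dict[str, Any]


def paginate(
    fetch_page: Callable[[Optional[str]], Page],
    items_key: str,
    max_items: int = 100,
) -> Iterator[Any]:
    """Yield items page by page; max_items=0 means no limit."""
    cursor: Optional[str] = None
    yielded = 0
    while True:
        page = fetch_page(cursor)
        for item in page.get(items_key, []):
            yield item
            yielded += 1
            if max_items and yielded >= max_items:
                return
        # A missing or empty cursor marks the last page.
        cursor = page.get("cursor")
        if not cursor:
            return


def fake_fetch(cursor: Optional[str]) -> Page:
    """Stand-in for an API call: three pages keyed by cursor."""
    pages: dict[Optional[str], Page] = {
        None: {"feed": [1, 2], "cursor": "a"},
        "a": {"feed": [3, 4], "cursor": "b"},
        "b": {"feed": [5]},
    }
    return pages[cursor]


print(list(paginate(fake_fetch, "feed", max_items=3)))  # [1, 2, 3]
```

In a real run, `fetch_page` would wrap an HTTP request that passes the cursor as a query parameter.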

Common use cases:

  • Brand monitoring — track mentions of your product, company, or topic in real time
  • Research and journalism — snapshot public conversations with date-range filters
  • Influencer discovery — surface high-engagement accounts via follower and engagement metrics
  • Academic datasets — build labeled corpora for NLP, sentiment analysis, or network studies
  • Competitor intelligence — monitor what your industry is discussing on Bluesky
  • Content curation — find top posts on any hashtag ranked by engagement

How to use Bluesky Scraper

  1. Click Try for free to open the Actor input form
  2. Select a Scrape mode from the dropdown (e.g. searchPosts)
  3. Enter your search query (e.g. #AI or climate change) or one or more Bluesky handles
  4. Set Max items to control how many results to collect
  5. Optionally add date filters, a language filter, or a custom feed URI
  6. Click Start and find your structured results in the Output tab — downloadable as JSON, CSV, or Excel

Input

All fields are configured in the Input tab or via JSON.

Field           Required                                     Description
mode            Yes                                          One of: searchPosts, authorFeed, postThread, profile, followers, follows, customFeed
searchQuery     For searchPosts                              Keyword, phrase, or hashtag, e.g. #AI or "machine learning"
handles         For authorFeed, profile, followers, follows  One or more Bluesky handles, e.g. ["user.bsky.social"]
postUrl         For postThread                               Full post URL or AT URI
feedUri         For customFeed                               AT URI of the feed generator, e.g. at://did:plc:z72i7hdynmchkltzmefcsowb/app.bsky.feed.generator/whats-hot
maxItems        No                                           Max items to collect (default: 100, 0 = unlimited; UI maximum: 10,000)
searchSort      No                                           latest or top (default: latest)
searchSince     No                                           Start date filter, e.g. 2024-01-01
searchUntil     No                                           End date filter, e.g. 2024-12-31
searchLang      No                                           ISO 639-1 language code, e.g. en, de, ja
includeReplies  No                                           Include replies in author feed (default: false)
includeReposts  No                                           Include reposts in author feed (default: false)

Example — hashtag search:

{
  "mode": "searchPosts",
  "searchQuery": "#AI",
  "searchSort": "latest",
  "searchLang": "en",
  "maxItems": 500
}

Example — custom feed:

{
  "mode": "customFeed",
  "feedUri": "at://did:plc:z72i7hdynmchkltzmefcsowb/app.bsky.feed.generator/whats-hot",
  "maxItems": 100
}
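
When building inputs programmatically, the per-mode requirements from the input table can be checked before a run is started. The following is a minimal sketch under that assumption; `build_input` and `REQUIRED_BY_MODE` are hypothetical names, and the Actor may validate its input differently.

```python
import json
from typing import Any

# Which field each mode requires, taken from the input table above.
REQUIRED_BY_MODE = {
    "searchPosts": "searchQuery",
    "authorFeed": "handles",
    "profile": "handles",
    "followers": "handles",
    "follows": "handles",
    "postThread": "postUrl",
    "customFeed": "feedUri",
}


def build_input(mode: str, **fields: Any) -> dict[str, Any]:
    """Return an input dict, raising ValueError if a required field is missing."""
    if mode not in REQUIRED_BY_MODE:
        raise ValueError(f"unknown mode: {mode!r}")
    required = REQUIRED_BY_MODE[mode]
    if not fields.get(required):
        raise ValueError(f"mode {mode!r} requires {required!r}")
    # Apply the documented default of 100 when maxItems is omitted.
    max_items = fields.setdefault("maxItems", 100)
    if not isinstance(max_items, int) or max_items < 0:
        raise ValueError("maxItems must be a non-negative integer")
    return {"mode": mode, **fields}


run_input = build_input("searchPosts", searchQuery="#AI", searchLang="en", maxItems=500)
print(json.dumps(run_input, indent=2))
```

The printed JSON can be pasted into the Input tab's JSON editor.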

Output

Each item is pushed to the default Apify dataset. Download results as JSON, CSV, HTML, or Excel from the Storage tab.

Example post output item:

{
  "postId": "at://did:plc:abc123/app.bsky.feed.post/3kwikpostabcde",
  "url": "https://bsky.app/profile/alice.bsky.social/post/3kwikpostabcde",
  "text": "Really enjoying the discourse around #AI safety this week.",
  "authorHandle": "alice.bsky.social",
  "authorDid": "did:plc:abc123",
  "authorDisplayName": "Alice",
  "authorFollowerCount": 3200,
  "authorFollowingCount": 410,
  "authorPostCount": 812,
  "authorAvatarUrl": "https://cdn.bsky.app/img/avatar/plain/did:plc:abc123/bafkreiabcdef@jpeg",
  "likeCount": 47,
  "repostCount": 12,
  "replyCount": 8,
  "quoteCount": 3,
  "engagementRate": 0.0194,
  "createdAt": "2024-11-15T14:22:30.000Z",
  "indexedAt": "2024-11-15T14:22:31.500Z",
  "lang": ["en"],
  "hasMedia": false,
  "mediaUrls": null,
  "mediaAltTexts": null,
  "externalUrl": null,
  "externalTitle": null,
  "inReplyToUri": null,
  "inReplyToUrl": null,
  "inReplyToHandle": null,
  "isRepost": false,
  "repostOf": null,
  "labels": null,
  "tags": ["AI"],
  "mentionedDids": [],
  "scrapedMode": "authorFeed"
}
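
As a worked example of the engagement formula, the sketch below applies (likes + reposts + quotes) / followers to the counts in the item above. The function name is hypothetical, and returning None for a missing or zero follower count is an assumption, not documented behavior. Note that replyCount does not enter the formula.

```python
from typing import Optional


def engagement_rate(
    likes: int, reposts: int, quotes: int, followers: Optional[int]
) -> Optional[float]:
    """(likes + reposts + quotes) / followers, or None when followers is unknown or zero."""
    if not followers:
        return None
    return (likes + reposts + quotes) / followers


# Counts from the example item: 47 likes, 12 reposts, 3 quotes, 3200 followers.
rate = engagement_rate(47, 12, 3, 3200)
print(rate)  # 0.019375
```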

Example profile output item (modes: profile, followers, follows):

{
  "did": "did:plc:abc123",
  "handle": "alice.bsky.social",
  "displayName": "Alice",
  "description": "Researcher. Bluesky enthusiast.",
  "followerCount": 3200,
  "followingCount": 410,
  "postCount": 812,
  "avatarUrl": "https://cdn.bsky.app/img/avatar/plain/did:plc:abc123/bafkreiabcdef@jpeg",
  "bannerUrl": null,
  "createdAt": "2023-05-01T10:00:00.000Z",
  "indexedAt": "2024-01-15T08:30:00.000Z",
  "labels": [],
  "scrapedMode": "profile"
}

Data table

Post fields (modes: searchPosts, authorFeed, postThread, customFeed)

Field                 Type                      Description
postId                string                    Permanent AT URI for the post (e.g. at://did:plc:.../app.bsky.feed.post/...)
url                   string                    Public bsky.app URL
text                  string | null             Full post text
authorHandle          string                    Author's Bluesky handle
authorDid             string                    Author's decentralized identifier
authorDisplayName     string | null             Author's display name
authorFollowerCount   integer | null            Follower count at scrape time. null in searchPosts mode, because the search API returns minimal author objects without stats.
authorFollowingCount  integer | null            Following count. null in searchPosts mode (same reason).
authorPostCount       integer | null            Total post count. null in searchPosts mode (same reason).
authorAvatarUrl       string | null             Avatar image URL. null in searchPosts mode (same reason).
likeCount             integer                   Number of likes
repostCount           integer                   Number of reposts
replyCount            integer                   Number of replies
quoteCount            integer                   Number of quote posts
engagementRate        float | null              (likes + reposts + quotes) / authorFollowerCount. Populated in authorFeed mode only; null in other modes.
createdAt             string | null             ISO 8601 creation timestamp
indexedAt             string | null             ISO 8601 timestamp when the post was indexed by Bluesky
lang                  string[] | null           Detected language codes
hasMedia              boolean                   true if the post contains images or video
mediaUrls             string[] | null           Image or video thumbnail URLs
mediaAltTexts         (string | null)[] | null  Alt text per image in the same order as mediaUrls; individual entries may be null if no alt text was provided
externalUrl           string | null             Embedded link card URL
externalTitle         string | null             Embedded link card title
inReplyToUri          string | null             Parent post AT URI if this is a reply
inReplyToUrl          string | null             Parent post public URL if this is a reply
inReplyToHandle       string | null             Parent post author handle. null in searchPosts mode (parent author not embedded in search results).
isRepost              boolean                   true if this item is a repost
repostOf              string | null             AT URI of the original post if isRepost is true; otherwise null
labels                string[] | null           Moderation labels applied to the post
tags                  string[]                  Hashtags extracted from AT Protocol facets (empty array if none)
mentionedDids         string[]                  DIDs of accounts mentioned in the post, extracted from facets (empty array if none)
scrapedMode           string                    The Actor mode that produced this item
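
The postId and url fields are two views of the same record: the AT URI carries the author's DID, the collection, and the record key, while the public URL reuses the record key under a profile path. The sketch below shows one way to derive the URL from the AT URI. `AtUri`, `parse_at_uri`, and `post_url` are hypothetical names, and falling back to the DID when no handle is given is an assumption about what a caller might want.

```python
from typing import NamedTuple, Optional


class AtUri(NamedTuple):
    authority: str   # DID (or handle) of the repository owner
    collection: str  # record type, e.g. app.bsky.feed.post
    rkey: str        # record key within the collection


def parse_at_uri(uri: str) -> AtUri:
    """Split at://<authority>/<collection>/<rkey> into its three parts."""
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT URI: {uri!r}")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected authority/collection/rkey in {uri!r}")
    return AtUri(*parts)


def post_url(post_id: str, handle: Optional[str] = None) -> str:
    """Build a bsky.app URL for a post; uses the DID when no handle is given."""
    at = parse_at_uri(post_id)
    if at.collection != "app.bsky.feed.post":
        raise ValueError(f"not a post record: {at.collection}")
    return f"https://bsky.app/profile/{handle or at.authority}/post/{at.rkey}"


print(post_url("at://did:plc:abc123/app.bsky.feed.post/3kwikpostabcde", "alice.bsky.social"))
# https://bsky.app/profile/alice.bsky.social/post/3kwikpostabcde
```

A URL built from the DID keeps working after a handle change, which is why the AT URI is the safer key for joins and deduplication.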

Profile fields (modes: profile, followers, follows)

Field           Type             Description
did             string           Decentralized identifier of the account
handle          string           Bluesky handle (e.g. user.bsky.social)
displayName     string | null    Account display name
description     string | null    Bio / profile description
followerCount   integer | null   Number of followers
followingCount  integer | null   Number of accounts followed
postCount       integer | null   Total post count
avatarUrl       string | null    Profile picture URL
bannerUrl       string | null    Banner image URL
createdAt       string | null    Account creation timestamp
indexedAt       string | null    Timestamp when indexed by Bluesky
labels          string[] | null  Moderation labels applied to the account
scrapedMode     string           The Actor mode that produced this item

Pricing / Cost estimation

This Actor uses Apify's pay-per-event pricing. Typical costs:

  • Small run (100 posts): ~$0.01–$0.05
  • Medium run (1,000 posts): ~$0.10–$0.30
  • Large run (10,000 posts): ~$1.00–$2.00

New Apify accounts receive a free tier with monthly compute units included. Most small research tasks run entirely for free.


Use with Claude Desktop / Cursor / VS Code (via Apify MCP)

Query Bluesky interactively from Claude Desktop without writing any code, which suits ad-hoc research and conversational analysis. This Actor is available as an MCP tool through Apify's MCP server. Actor runs execute in the Apify cloud; the configuration below starts the MCP server on your machine via npx, so Node.js must be installed locally.

What you need

  • An Apify account (free tier available)
  • Your Apify API token (found under Settings → Integrations)

Setup (one-time)

Add the following to your claude_desktop_config.json:

{
  "mcpServers": {
    "apify": {
      "command": "npx",
      "args": ["-y", "@apify/actors-mcp-server"],
      "env": {
        "APIFY_TOKEN": "your_apify_api_token_here"
      }
    }
  }
}

Config file location:

  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

Restart Claude Desktop — the Bluesky Scraper will appear as an available tool in the chat (hammer icon). When prompting Claude, mention the Actor by name so it picks the right tool.

Example prompts

"Search for the 20 most recent posts about #climatepolicy in English and summarize the main arguments."

"Show me the last 10 posts from @bsky.app and tell me what topics they cover."

"Fetch the full reply thread for this post: https://bsky.app/profile/user.bsky.social/post/3abc123"

"Find the top 50 followers of @atproto.com and list their display names and follower counts."

"Scrape the What's Hot feed and tell me the top 5 trending topics right now."


Tips and advanced options

  • Date range filtering (searchSince / searchUntil) narrows search results significantly and reduces unnecessary API calls for historical research.
  • Language filtering (searchLang) is especially useful for multilingual topics like #AI — use en for English-only results.
  • Author feed mode with includeReplies: false (default) produces cleaner top-level post datasets without conversation noise.
  • Thread mode is ideal for journalists and researchers who need the full reply context for a specific post.
  • Engagement rate (engagementRate) is only populated in authorFeed mode because follower count requires a separate profile lookup — this is done once per handle and cached for the entire run.
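  • Custom feed URIs can be derived from the Bluesky app: open any feed, tap the three-dot menu, and copy the link. The share URL has the form https://bsky.app/profile/<handle-or-DID>/feed/<feed-name>, and the matching AT URI is at://<DID>/app.bsky.feed.generator/<feed-name>.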
  • Set maxItems: 0 for unlimited scraping — use with care on large accounts or popular hashtags.
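
For the custom feed tip above, a share link can be turned into a feedUri with a few lines of code. This is a sketch assuming the share URL already contains a DID; `feed_url_to_at_uri` is a hypothetical helper. If the URL contains a handle instead, the handle has to be resolved to a DID first (for example via the com.atproto.identity.resolveHandle endpoint), which is left out here.

```python
from urllib.parse import urlparse


def feed_url_to_at_uri(share_url: str) -> str:
    """Convert https://bsky.app/profile/<did>/feed/<name> into an AT URI."""
    parsed = urlparse(share_url)
    parts = [p for p in parsed.path.split("/") if p]
    # Expected path segments: profile, <actor>, feed, <name>
    if (
        parsed.netloc != "bsky.app"
        or len(parts) != 4
        or parts[0] != "profile"
        or parts[2] != "feed"
    ):
        raise ValueError(f"not a Bluesky feed URL: {share_url!r}")
    actor, name = parts[1], parts[3]
    if not actor.startswith("did:"):
        raise ValueError("URL contains a handle; resolve it to a DID first")
    return f"at://{actor}/app.bsky.feed.generator/{name}"


print(feed_url_to_at_uri("https://bsky.app/profile/did:plc:z72i7hdynmchkltzmefcsowb/feed/whats-hot"))
# at://did:plc:z72i7hdynmchkltzmefcsowb/app.bsky.feed.generator/whats-hot
```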

FAQ, disclaimers, and support

Is scraping Bluesky legal? This Actor accesses only the public AT Protocol API — the same API used by official Bluesky clients. All data collected is publicly available. You remain responsible for complying with Bluesky's Terms of Service and applicable data protection regulations (GDPR, CCPA, etc.) in your jurisdiction.

Can I scrape private accounts or DMs? No. This Actor only accesses public data via the public AppView API. Private accounts and direct messages are not accessible.

Why are some fields null in searchPosts mode? Bluesky's search API returns minimal author objects that do not include follower/following/post counts or avatar URLs. Fields like authorFollowerCount, authorFollowingCount, authorPostCount, and authorAvatarUrl are null for search results. Use authorFeed mode if you need full author stats.

What rate limits apply? The AT Protocol AppView API has generous public rate limits. The Actor handles rate-limit responses (HTTP 429) automatically by honoring the Retry-After header — no run failures due to throttling.
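
The general pattern of honoring Retry-After can be sketched as follows. This is an illustration, not the Actor's code: `call_with_retry` and `retry_after_seconds` are hypothetical names, the request function and the sleep function are injected so the loop can be exercised without network access, and only the seconds form of the header is parsed (an HTTP-date value falls back to a default wait).

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], object]  # (status, headers, body)


def retry_after_seconds(headers: Mapping[str, str], default: float) -> float:
    """Read Retry-After as a number of seconds, case-insensitively."""
    raw = next((v for k, v in headers.items() if k.lower() == "retry-after"), None)
    try:
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        # Header missing or not numeric (e.g. an HTTP-date): use the default.
        return default


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    default_wait: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> object:
    """Call send() until it stops returning HTTP 429 or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return body
        if attempt < max_attempts:
            sleep(retry_after_seconds(headers, default_wait))
    raise RuntimeError(f"still rate-limited after {max_attempts} attempts")


# Simulated exchange: one throttled response, then success.
responses = iter([(429, {"Retry-After": "2"}, None), (200, {}, "ok")])
waits: list = []
print(call_with_retry(lambda: next(responses), sleep=waits.append), waits)  # ok [2.0]
```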

I need a custom feature or bulk data pipeline. Open an issue on the Issues tab or contact us via the Apify platform — custom solutions are available.

Known limitations

  • Search result counts may be approximate; the AT Protocol API does not guarantee exact pagination totals
  • The searchUntil / searchSince date parameters may occasionally return slightly out-of-range results — this is known AT Protocol behavior
  • Video content is supported (thumbnail URL extracted) but full video file download is not included
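
If strict date bounds matter, out-of-range items can be dropped after the run by checking createdAt on each item. The sketch below shows one way to do that; `within_range` is a hypothetical helper, and three choices are assumptions: both bounds are treated as inclusive, dates are compared in UTC, and items without a createdAt value are discarded.

```python
from datetime import date, datetime, timezone
from typing import Any, Iterable, Iterator, Optional


def within_range(
    items: Iterable[dict[str, Any]],
    since: Optional[date] = None,
    until: Optional[date] = None,
) -> Iterator[dict[str, Any]]:
    """Yield only items whose createdAt date lies in [since, until]."""
    for item in items:
        created = item.get("createdAt")
        if not created:
            continue  # assumption: undated items are dropped
        # fromisoformat() on older Python versions does not accept a trailing "Z".
        stamp = datetime.fromisoformat(created.replace("Z", "+00:00"))
        day = stamp.astimezone(timezone.utc).date()
        if since and day < since:
            continue
        if until and day > until:
            continue
        yield item


items = [
    {"postId": "a", "createdAt": "2023-12-31T23:59:59.000Z"},
    {"postId": "b", "createdAt": "2024-11-15T14:22:30.000Z"},
    {"postId": "c", "createdAt": "2025-01-01T00:00:00.000Z"},
]
kept = list(within_range(items, since=date(2024, 1, 1), until=date(2024, 12, 31)))
print([item["postId"] for item in kept])  # ['b']
```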