Bluesky Scraper - Profiles, Posts, Followers, Search

Pricing

Pay per usage

Scrape Bluesky via the official AT Protocol: profiles, posts, post search, followers, following, threads & custom feeds. No proxy required, no anti-bot — official open API. App password optional.

Developer

Khalil Drissi

Maintained by Community

Bluesky Scraper — Profiles, Posts, Followers & Search via the Open AT Protocol

Scrape public Bluesky data through the official AT Protocol API: no proxies, no anti-bot fights, no terms-of-service grey areas. One actor covers seven modes: profiles, post feeds, keyword search, followers, following, thread expansion, and custom feeds.


Features

| Feature | Detail |
| --- | --- |
| 7 scraping modes | profile · posts · search · followers · following · thread · feed |
| No proxy needed | Official public API; Bluesky does not block scrapers |
| Auth optional | Most modes work without a Bluesky account; search & custom feeds work best with an app password |
| Structured output | Typed JSON records with consistent field names; covers text, media, embeds, counts |
| Pagination | Automatically pages through all results up to your maxItems cap |
| Resilient | Per-item error isolation; exponential backoff on rate limits (HTTP 429) |
| Pay-per-event | Only pay for what you scrape: profiles, posts, or connections |
| MIT-licensed API | Uses the MarshalX atproto Python SDK |

Why Bluesky scraping beats Twitter / X scraping

| Aspect | Bluesky (this actor) | Twitter / X |
| --- | --- | --- |
| API type | Official, documented AT Protocol | Reverse-engineered / unofficial |
| Anti-bot measures | None (open protocol) | Heavy: CAPTCHAs, rate-limit bans, IP blocks |
| Proxy cost | Zero (direct connection) | High (residential proxies often required) |
| Legal standing | Public data, open protocol, no ToS conflict | Grey area; ToS explicitly prohibits scraping |
| Authentication | App password optional (free) | Paid API tiers ($100–$5,000/mo) |
| Data freshness | Real-time | Delayed or restricted on free tiers |

Use Cases

  • AI training datasets — collect large-scale post corpora with text, language tags, and engagement signals.
  • Social media analytics — track follower growth, post volume, engagement rates across accounts.
  • Journalism & OSINT — search posts by keyword and date range, expand threads for context.
  • Brand monitoring — monitor mentions of a brand, product, or topic across the network.
  • AT Protocol research — study the social graph, feed algorithms, or labeling systems.

Input

Mode

Select exactly one mode per run. Combine modes by running the actor multiple times (trivially parallelizable on the Apify platform).

| Mode | What it returns | Required input fields |
| --- | --- | --- |
| profile | Full profile records for one or more handles/DIDs | handles |
| posts | Recent posts by one or more accounts | handles, maxItems |
| search | Posts matching a keyword query | searchQuery, maxItems (+ auth recommended) |
| followers | Accounts that follow a handle | handles, maxItems |
| following | Accounts a handle follows | handles, maxItems |
| thread | Full reply tree for a post URL | postUrls |
| feed | Posts from a custom feed generator | feedUrls, maxItems (+ auth recommended) |
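
The mode table above can be turned into a small pre-flight check that runs before you start the actor. The sketch below is illustrative only: `REQUIRED_FIELDS`, `DEFAULTS`, and `missing_fields` are hypothetical names, not part of the actor, and it assumes that a field with a documented default (`mode`, `maxItems`) does not need to be supplied explicitly.

```python
from typing import Any

# Required fields per mode, transcribed from the mode table above.
REQUIRED_FIELDS: dict[str, tuple[str, ...]] = {
    "profile": ("handles",),
    "posts": ("handles", "maxItems"),
    "search": ("searchQuery", "maxItems"),
    "followers": ("handles", "maxItems"),
    "following": ("handles", "maxItems"),
    "thread": ("postUrls",),
    "feed": ("feedUrls", "maxItems"),
}

# Documented defaults; assumed to satisfy a "required" field when omitted.
DEFAULTS: dict[str, Any] = {"mode": "profile", "maxItems": 100}


def missing_fields(run_input: dict[str, Any]) -> list[str]:
    """Return required fields that are absent or empty for the chosen mode."""
    merged = {**DEFAULTS, **run_input}
    mode = merged["mode"]
    if mode not in REQUIRED_FIELDS:
        raise ValueError(f"unknown mode: {mode!r}")
    return [
        name
        for name in REQUIRED_FIELDS[mode]
        if merged.get(name) in (None, "", [])
    ]
```

For example, `missing_fields({"mode": "search"})` reports `searchQuery` as missing, while a profile input with a non-empty `handles` list passes.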

All input fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| mode | enum | profile | Scraping mode (required) |
| handles | string[] | | Bluesky handles or DIDs; used by profile, posts, followers, following |
| searchQuery | string | | Keyword/phrase to search (search mode) |
| postUrls | string[] | | Post URLs or at:// URIs (thread mode) |
| feedUrls | string[] | | Feed generator at:// URIs (feed mode) |
| maxItems | integer | 100 | Max records per handle/query/feed |
| searchSince | string | | ISO date lower bound for search (e.g. 2024-01-01) |
| searchUntil | string | | ISO date upper bound for search |
| searchLanguage | string | | BCP-47 language filter for search (e.g. en) |
| searchSort | enum | latest | latest or top |
| threadDepth | integer | 6 | How many reply levels to expand (max 1000) |
| threadParentHeight | integer | 80 | How many parent levels to walk up (max 1000) |
| blueskyHandle | string | | Your Bluesky handle (optional auth) |
| blueskyAppPassword | string | | App password; see the Authentication section |
| proxyConfiguration | object | | Optional Apify Proxy (rarely needed) |

Example inputs

Profile scrape (no auth needed):

{
  "mode": "profile",
  "handles": ["bsky.app", "jay.bsky.team", "pfrazee.com"]
}

Keyword search:

{
  "mode": "search",
  "searchQuery": "open source AI",
  "maxItems": 200,
  "searchSort": "latest",
  "searchLanguage": "en",
  "blueskyHandle": "alice.bsky.social",
  "blueskyAppPassword": "xxxx-xxxx-xxxx-xxxx"
}

Follower list:

{
  "mode": "followers",
  "handles": ["bsky.app"],
  "maxItems": 500
}

Thread expansion:

{
  "mode": "thread",
  "postUrls": [
    "https://bsky.app/profile/bsky.app/post/3laahvvjbek2j"
  ],
  "threadDepth": 10
}

Recent posts by a user:

{
  "mode": "posts",
  "handles": ["pfrazee.com"],
  "maxItems": 100
}
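
Thread mode accepts either a bsky.app post URL or an at:// URI. The two forms carry the same information, and the sketch below shows one way to map the web URL onto the URI form. `post_url_to_at_uri` is a hypothetical helper; whether the actor performs this conversion in the same way is not stated in this README. Note that the result keeps whatever identifier the URL contains (handle or DID); resolving a handle to a DID is a separate step not shown here.

```python
from urllib.parse import urlparse

# Collection NSID for posts, as seen in the at:// URIs in this README.
POST_COLLECTION = "app.bsky.feed.post"


def post_url_to_at_uri(url: str) -> str:
    """Convert https://bsky.app/profile/<id>/post/<rkey> to an at:// URI."""
    if url.startswith("at://"):
        return url  # already in URI form
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Expected path segments: profile, <handle-or-did>, post, <record-key>
    if len(parts) != 4 or parts[0] != "profile" or parts[2] != "post":
        raise ValueError(f"not a Bluesky post URL: {url!r}")
    return f"at://{parts[1]}/{POST_COLLECTION}/{parts[3]}"
```

With the thread example above, the URL maps to `at://bsky.app/app.bsky.feed.post/3laahvvjbek2j`.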

Output

Record families

The actor produces three types of records, all tagged with a mode field.

Profile record (mode: "profile")

| Field | Type | Example |
| --- | --- | --- |
| mode | string | "profile" |
| scrapedAt | ISO datetime | "2024-11-15T12:34:56.789Z" |
| did | string | "did:plc:z72i7hdynmk6r22z27h6tvur" |
| handle | string | "bsky.app" |
| displayName | string or null | "Bluesky" |
| description | string or null | "What's up?" |
| followersCount | integer | 1234567 |
| followsCount | integer | 42 |
| postsCount | integer | 891 |
| avatar | url or null | "https://cdn.bsky.app/img/avatar/..." |
| banner | url or null | "https://cdn.bsky.app/img/banner/..." |
| createdAt | ISO datetime or null | "2022-11-17T00:00:00.000Z" |
| indexedAt | ISO datetime or null | "2024-01-01T09:00:00.000Z" |
| labels | string[] | [] |
| pinnedPostUri | string or null | "at://did:plc:.../app.bsky.feed.post/..." |
| profileUrl | url | "https://bsky.app/profile/bsky.app" |

Example:

{
  "mode": "profile",
  "scrapedAt": "2024-11-15T12:34:56.789000+00:00",
  "did": "did:plc:z72i7hdynmk6r22z27h6tvur",
  "handle": "bsky.app",
  "displayName": "Bluesky",
  "description": "What's up?",
  "followersCount": 1423891,
  "followsCount": 48,
  "postsCount": 912,
  "avatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:z72i7hdynmk6r22z27h6tvur/bafkreiabcd@jpeg",
  "banner": null,
  "createdAt": "2022-11-17T00:00:00.000Z",
  "indexedAt": "2024-01-01T09:00:00.000Z",
  "labels": [],
  "pinnedPostUri": null,
  "profileUrl": "https://bsky.app/profile/bsky.app"
}
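
Timestamps in this README appear in two ISO 8601 spellings: a trailing `Z` (as in `createdAt`) and an explicit `+00:00` offset (as in the `scrapedAt` example). A consumer may want to accept both. The helper below is a minimal sketch; `parse_timestamp` is a hypothetical name, and it assumes every timestamp carries a timezone designator.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with 'Z' or a numeric offset into UTC."""
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value).astimezone(timezone.utc)
```

Both `"2024-11-15T12:34:56.789Z"` and `"2024-11-15T12:34:56.789000+00:00"` parse to the same instant, so records can be sorted or compared without string handling.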

Post record (mode: "posts" / "search" / "thread" / "feed")

| Field | Type | Example |
| --- | --- | --- |
| mode | string | "posts" |
| scrapedAt | ISO datetime | "2024-11-15T12:34:56Z" |
| uri | string | "at://did:plc:.../app.bsky.feed.post/3la..." |
| cid | string | "bafyreia..." |
| authorDid | string | "did:plc:..." |
| authorHandle | string | "alice.bsky.social" |
| authorDisplayName | string or null | "Alice" |
| text | string | "Hello Bluesky!" |
| createdAt | ISO datetime or null | "2024-11-15T10:00:00.000Z" |
| indexedAt | ISO datetime or null | "2024-11-15T10:00:01.000Z" |
| langs | string[] | ["en"] |
| replyCount | integer | 12 |
| repostCount | integer | 34 |
| likeCount | integer | 156 |
| quoteCount | integer | 5 |
| bookmarkCount | integer or null | null |
| isRepost | boolean | false |
| isReply | boolean | false |
| replyParentUri | string or null | null |
| replyRootUri | string or null | null |
| images | array | [{"fullsize": "...", "thumb": "...", "alt": "..."}] |
| externalLink | object or null | {"uri": "...", "title": "...", "description": "..."} |
| quotedPostUri | string or null | null |
| quotedPostText | string or null | null |
| video | object or null | {"playlist": "...", "thumbnail": "..."} |
| postUrl | url or null | "https://bsky.app/profile/alice.bsky.social/post/3la..." |

Connection record (mode: "followers" / "following")

| Field | Type | Example |
| --- | --- | --- |
| mode | string | "followers" |
| scrapedAt | ISO datetime | "2024-11-15T12:34:56Z" |
| subjectDid | string | "did:plc:..." (the queried account) |
| subjectHandle | string | "bsky.app" (the queried account) |
| did | string | "did:plc:..." (the follower/followed) |
| handle | string | "bob.bsky.social" |
| displayName | string or null | "Bob" |
| avatar | url or null | "https://cdn.bsky.app/img/avatar/..." |
| description | string or null | "Builder." |
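
Because the same record shape serves both modes, the direction of the follow relationship depends on `mode`. The sketch below shows one way to normalize connection records into directed edges for graph analysis. `to_edge` is a hypothetical helper, and the direction convention (source follows target) is a choice made for this example.

```python
from typing import Any


def to_edge(record: dict[str, Any]) -> tuple[str, str]:
    """Return (source_did, target_did), read as 'source follows target'."""
    subject = record["subjectDid"]  # the queried account
    other = record["did"]           # the follower or followed account
    if record["mode"] == "followers":
        return (other, subject)     # the listed account follows the subject
    if record["mode"] == "following":
        return (subject, other)     # the subject follows the listed account
    raise ValueError(f"not a connection record: mode={record['mode']!r}")
```

Edges built this way from `followers` and `following` runs can be merged into a single set without double-counting mutual relationships in the wrong direction.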

Pricing

This actor uses pay-per-event pricing — you only pay for records actually pushed to the dataset.

| Event | Price | When charged |
| --- | --- | --- |
| Profile scraped | $0.002 | One full profile record (profile mode) |
| Post scraped | $0.0004 | One post record (posts, search, thread, feed) |
| Connection scraped | $0.0002 | One follower/following edge (followers, following) |

Example costs

| Task | Records | Cost |
| --- | --- | --- |
| 1,000 profiles | 1,000 × $0.002 | $2.00 |
| 10,000 posts (keyword search) | 10,000 × $0.0004 | $4.00 |
| 50,000 follower records | 50,000 × $0.0002 | $10.00 |
| 500 profiles + 5,000 posts | 500 × $0.002 + 5,000 × $0.0004 | $3.00 |
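
The arithmetic in the table can be reproduced with a few lines of code, which is handy for budgeting a run in advance. This is a sketch using the per-event prices listed above; `estimate_cost` is a hypothetical helper, and it covers only these three events, not any other platform charges.

```python
from decimal import Decimal

# Per-event prices in USD, copied from the pricing table above.
PRICE_PER_EVENT = {
    "profile": Decimal("0.002"),
    "post": Decimal("0.0004"),
    "connection": Decimal("0.0002"),
}


def estimate_cost(profiles: int = 0, posts: int = 0, connections: int = 0) -> Decimal:
    """Estimate the event charges in USD for the given record counts."""
    # Decimal avoids the rounding noise that binary floats add to small prices.
    return (
        profiles * PRICE_PER_EVENT["profile"]
        + posts * PRICE_PER_EVENT["post"]
        + connections * PRICE_PER_EVENT["connection"]
    )
```

For instance, `estimate_cost(profiles=500, posts=5000)` evaluates to 3 dollars, matching the last row of the table.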

Note: Pay-per-event pricing takes 14 days to take effect after monetization is configured.


Authentication

Most modes (profile, posts, followers, following, thread) work without a Bluesky account.

For search and feed modes, the public AppView may require authentication. Provide:

  • blueskyHandle — your Bluesky handle, e.g. alice.bsky.social
  • blueskyAppPassword — an app password (NOT your main account password)

Creating an app password

  1. Log in to bsky.app
  2. Go to Settings → App Passwords: https://bsky.app/settings/app-passwords
  3. Click Add App Password, give it a name (e.g. "Apify Scraper"), and copy the generated code.
  4. The format is xxxx-xxxx-xxxx-xxxx. Paste it into the blueskyAppPassword input field.

App passwords have limited scope (no DM access, no account deletion) and can be revoked individually at any time without affecting your account. Never enter your main Bluesky password.
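
A simple format check can catch the common mistake of pasting a main account password into `blueskyAppPassword`. The sketch below tests for the `xxxx-xxxx-xxxx-xxxx` shape described above. `looks_like_app_password` is a hypothetical helper, and the character set (lowercase letters and digits) is an assumption; the README only specifies the four-by-four grouping.

```python
import re

# Four groups of four characters separated by hyphens.
# Assumption: groups contain only lowercase letters and digits.
APP_PASSWORD_PATTERN = re.compile(r"[a-z0-9]{4}(?:-[a-z0-9]{4}){3}")


def looks_like_app_password(value: str) -> bool:
    """Return True if the value has the xxxx-xxxx-xxxx-xxxx shape."""
    return APP_PASSWORD_PATTERN.fullmatch(value) is not None
```

A value that fails this check is probably not an app password and should not be sent to the actor.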


Rate Limits & Throughput

Bluesky's public AppView offers generous rate limits for read-only access. The actor:

  • Fetches up to 100 records per API request (the maximum page size).
  • Automatically retries on HTTP 429 (rate limit) and 5xx errors with exponential backoff, honouring Retry-After headers when present.
  • Isolates failures per item — one failed profile/post/edge does not stop the rest of the run.
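
The retry behaviour described in the list above can be sketched as follows. This is not the actor's code: `HttpError`, `retry_delay`, and `call_with_retries` are hypothetical names, the base delay, cap, and attempt count are illustrative values, and only the delta-seconds form of `Retry-After` is handled (the HTTP-date form falls back to exponential backoff).

```python
import time
from typing import Callable, Mapping, Optional, TypeVar

T = TypeVar("T")


class HttpError(Exception):
    """Hypothetical error type carrying an HTTP status and response headers."""

    def __init__(self, status: int, headers: Optional[Mapping[str, str]] = None):
        super().__init__(f"HTTP {status}")
        self.status = status
        self.headers = dict(headers or {})


def retry_delay(attempt: int, retry_after: Optional[str],
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before the next try; `attempt` counts from 0."""
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after)          # the server's hint takes priority
    return min(cap, base * 2 ** attempt)   # 1s, 2s, 4s, ... up to the cap


def call_with_retries(fn: Callable[[], T], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on HTTP 429 and 5xx with backoff between tries."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except HttpError as err:
            retryable = err.status == 429 or 500 <= err.status < 600
            if not retryable or attempt == max_attempts - 1:
                raise
            sleep(retry_delay(attempt, err.headers.get("Retry-After")))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the function testable without real waiting. Wrapping each item's request separately in `call_with_retries` is also what makes per-item failure isolation straightforward.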

Practical throughput: expect tens of thousands of records per run without hitting limits, depending on your account's tier and the specific endpoints used. For very large runs (100k+ records), add authentication to benefit from higher per-account limits.
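
Paging up to a `maxItems` cap with 100-record requests follows a standard cursor loop. The sketch below illustrates the pattern under stated assumptions: `fetch_page` is a hypothetical callable standing in for whichever list endpoint a mode uses, taking a cursor and a limit and returning a page of records plus the next cursor (or `None` when the results are exhausted).

```python
from typing import Any, Callable, Iterator, Optional

PAGE_SIZE = 100  # maximum page size stated above

# (cursor, limit) -> (records, next_cursor)
FetchPage = Callable[
    [Optional[str], int],
    tuple[list[dict[str, Any]], Optional[str]],
]


def paginate(fetch_page: FetchPage, max_items: int) -> Iterator[dict[str, Any]]:
    """Yield records page by page until max_items or the end of results."""
    cursor: Optional[str] = None
    remaining = max_items
    while remaining > 0:
        # Never request more than is still needed, so the last page is short.
        records, cursor = fetch_page(cursor, min(PAGE_SIZE, remaining))
        for record in records[:remaining]:
            yield record
        remaining -= len(records)
        if not cursor or not records:
            break  # no further pages
```

With `max_items=120`, this issues one request for 100 records and a second for 20, so no more records are fetched than will be kept.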


FAQ

1. Will the actor slow down or get blocked at high volumes? No blocking — this is an official API. If you hit a rate limit, the actor backs off automatically and retries. For sustained high-volume runs, provide an app password to use per-account limits (higher than per-IP limits).

2. Do I need a Bluesky account? No, for most modes. Profile, posts, followers, following, and thread modes all work unauthenticated. Search and custom-feed modes may return a 401 if used without credentials — just add blueskyHandle + blueskyAppPassword in that case.

3. What data is public on Bluesky? All posts, profiles, follower graphs, and custom feeds are public by design (AT Protocol is an open, federated network). DMs and muted/blocked relationships are not accessible via the public API.

4. Why might a field be null? A few reasons: the post was deleted before scraping, the account is deactivated, the field is viewer-scoped and requires auth (e.g. bookmarkCount), or the field is simply optional in the AT Protocol schema (e.g. banner, displayName).

5. How is this better than scraping Twitter/X? No reverse-engineering, no CAPTCHA, no residential proxy costs, no ToS violation. Bluesky's AT Protocol is documented and open; this actor uses the same API Bluesky's own apps use. See the comparison table at the top of this README.


Bluesky is built on the open AT Protocol. All data scraped by this actor is publicly accessible on the network by design.

  • Respect creators: only use scraped content in ways consistent with the rights of the people who created it.
  • GDPR / CCPA: if you process personal data from EU or California residents, you are responsible for complying with applicable data-protection law (legal basis, retention limits, subject rights, etc.).
  • Redistribution: do not redistribute or commercially exploit scraped content beyond what is permitted by applicable law and the relevant terms.
  • Rate limits: do not intentionally bypass rate limits or attempt to extract data at a rate that degrades service for other users.