Bluesky Scraper — Posts, Profiles & Search

Pricing

Pay per usage

Extracts Bluesky posts, profiles, and search results. Returns post text, like/repost counts, author details, and embedded media links.

Rating: 0.0 (0)

Developer: CryptoSignals Agent (maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 13
  • Monthly active users: 7
  • Last modified: 19 hours ago

Bluesky Scraper — Posts, Profiles, Followers & Feeds

Extract posts, profiles, threads, followers, following lists, and custom feeds from Bluesky at scale. Uses the public AT Protocol — no API key or login needed.


Free Trial Ending April 3

This actor is free while we collect feedback. Starting April 3, 2026, it moves to $4.99/month. Add a payment method at apify.com/billing to keep access.


What Can You Scrape?

| Data Type | Description |
| --- | --- |
| Posts | Search by keyword, hashtag, mention, author, domain, or date range |
| Profiles | Full user profiles with follower/following counts and bio |
| Followers | Complete follower list for any handle |
| Following | Complete following list for any handle |
| Threads | Full post threads with nested replies |
| Custom Feeds | Algorithmic and user-created feeds |

Features

  • Advanced search filters — date range, language, author, domain, hashtags, mentions
  • Sort by latest or top relevance
  • Up to 50,000 items per run
  • Full engagement metrics — likes, reposts, replies, quotes, bookmark count, engagement score
  • Author enrichment — optional follower/following/post counts per author
  • Media extraction — hasMedia flag, media count, flat media URL array
  • Reply detection — isReply flag with parent and root thread URIs
  • Language detection — per-post language field
  • No authentication — uses the public AT Protocol API
  • Robust rate limiting — 5 retries with exponential backoff, Retry-After header support
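
The retry behaviour in the last bullet can be illustrated with a short Python sketch. This is not the actor's actual implementation. The names `backoff_delay` and `fetch_with_retries`, the 1-second base delay, and the 60-second cap are assumptions chosen for illustration. Only the 5 retries, the exponential backoff, and the Retry-After support come from the feature list.

```python
import time
from typing import Callable, Mapping, Optional, Tuple

# (status code, response headers, body); a stand-in for a real HTTP response.
Response = Tuple[int, Mapping[str, str], str]


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    A numeric Retry-After value wins; otherwise the delay doubles each attempt.
    `base` and `cap` are assumed values, not documented actor settings.
    """
    if retry_after is not None:
        try:
            return max(0.0, min(float(retry_after), cap))
        except ValueError:
            pass  # Retry-After can also be an HTTP date; fall back to backoff.
    return min(base * (2 ** attempt), cap)


def fetch_with_retries(fetch: Callable[[], Response], max_retries: int = 5,
                       sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `fetch` until it stops returning HTTP 429 or the retries run out."""
    attempt = 0
    while True:
        response = fetch()
        status, headers, _body = response
        if status != 429 or attempt >= max_retries:
            return response
        # Header lookup is case-sensitive here because `headers` is a plain mapping.
        sleep(backoff_delay(attempt, headers.get("Retry-After")))
        attempt += 1
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.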

Input Parameters

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| searchQuery | string | | Keyword to search, e.g. "artificial intelligence", "#startup" |
| handles | string[] | [] | Bluesky handles to scrape, e.g. ["jay.bsky.team"] |
| scrapeType | string | "posts" | posts, profiles, both, followers, or following |
| maxItems | integer | 100 | Max results per query/handle (1–50,000) |
| sort | string | "latest" | latest or top |
| since | string | | Start date (ISO format, e.g. "2026-03-01T00:00:00Z") |
| until | string | | End date (ISO format) |
| lang | string | | Language code filter, e.g. "en", "fr" |
| tag | string[] | [] | Hashtag filters |
| mentions | string | | Filter posts mentioning this handle |
| author | string | | Filter posts by author handle |
| domain | string | | Filter posts linking to this domain |
| threadUris | string[] | [] | AT Protocol URIs of threads to scrape |
| feedUris | string[] | [] | Custom feed URIs to scrape |
| threadDepth | integer | 6 | Max depth when scraping threads |
| resolveAuthors | boolean | false | Fetch author profiles for follower/following/post counts |
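
As a rough illustration of the constraints in the table above, the following Python sketch builds an input object and checks it before a run is started. The helper names `build_input` and `to_iso_utc` are hypothetical, and the checks mirror only what the table states (allowed `scrapeType` and `sort` values, the 1–50,000 range for `maxItems`, ISO timestamps). The actor may validate input differently.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional, Sequence

SCRAPE_TYPES = {"posts", "profiles", "both", "followers", "following"}
SORT_ORDERS = {"latest", "top"}


def to_iso_utc(moment: datetime) -> str:
    """Format an aware datetime like the table's example, "2026-03-01T00:00:00Z"."""
    if moment.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_input(search_query: Optional[str] = None,
                handles: Sequence[str] = (),
                scrape_type: str = "posts",
                max_items: int = 100,
                sort: str = "latest",
                since: Optional[datetime] = None,
                until: Optional[datetime] = None) -> Dict[str, Any]:
    """Return a run input dict, raising ValueError on out-of-range values."""
    if scrape_type not in SCRAPE_TYPES:
        raise ValueError(f"scrapeType must be one of {sorted(SCRAPE_TYPES)}")
    if sort not in SORT_ORDERS:
        raise ValueError(f"sort must be one of {sorted(SORT_ORDERS)}")
    if not 1 <= max_items <= 50_000:
        raise ValueError("maxItems must be between 1 and 50,000")
    if since is not None and until is not None and since >= until:
        raise ValueError("since must be earlier than until")

    run_input: Dict[str, Any] = {
        "scrapeType": scrape_type,
        "maxItems": max_items,
        "sort": sort,
    }
    # Optional fields are only included when they carry a value.
    if search_query:
        run_input["searchQuery"] = search_query
    if handles:
        run_input["handles"] = list(handles)
    if since is not None:
        run_input["since"] = to_iso_utc(since)
    if until is not None:
        run_input["until"] = to_iso_utc(until)
    return run_input
```

The returned dict can be passed as `run_input` to the Apify client.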

Example Inputs

Search posts by keyword

{
  "searchQuery": "artificial intelligence",
  "scrapeType": "posts",
  "maxItems": 200,
  "sort": "top",
  "lang": "en"
}

Search with author enrichment

{
  "searchQuery": "web scraping",
  "scrapeType": "posts",
  "maxItems": 50,
  "resolveAuthors": true
}

Scrape a user's posts and profile

{
  "handles": ["jay.bsky.team"],
  "scrapeType": "both",
  "maxItems": 100
}

Get followers of an account

{
  "handles": ["bsky.app"],
  "scrapeType": "followers",
  "maxItems": 5000
}

Output Fields — Posts

| Field | Type | Description |
| --- | --- | --- |
| uri | string | AT Protocol URI |
| cid | string | Content identifier |
| authorHandle | string | Author's Bluesky handle |
| authorDisplayName | string | Author's display name |
| authorDid | string | Author's DID |
| authorAvatar | string | Author's avatar URL |
| authorFollowersCount | number/null | Author's follower count (requires resolveAuthors or handle scraping) |
| authorFollowingCount | number/null | Author's following count |
| authorPostsCount | number/null | Author's total post count |
| text | string | Post text content |
| createdAt | string | Post creation timestamp (ISO 8601) |
| indexedAt | string | Index timestamp |
| language | string/null | Detected post language (e.g. "en", "ja") |
| languages | string[] | All language tags on the post |
| likeCount | number | Number of likes |
| repostCount | number | Number of reposts |
| replyCount | number | Number of replies |
| quoteCount | number | Number of quotes |
| bookmarkCount | number | Number of bookmarks |
| engagementScore | number | Weighted engagement: likes + reposts×2 + replies×3 + quotes×4 |
| postUrl | string | Direct URL to the post on bsky.app |
| hashtags | string[] | Extracted hashtags |
| mentions | string[] | Mentioned DIDs |
| facetLinks | string[] | Links extracted from rich text facets |
| hasMedia | boolean | Whether post contains images or videos |
| mediaCount | number | Total number of images + videos |
| mediaUrls | string[] | Flat array of all media URLs |
| images | object[] | Detailed image data (url, thumb, alt, aspectRatio) |
| videos | object[] | Detailed video data (url, thumb, alt, aspectRatio) |
| links | object[] | Embedded link cards (url, title, description, thumb) |
| hasEmbeddedMedia | boolean | Has images, videos, or link cards |
| quotedPost | object/null | Quoted post details (uri, author, text) |
| isRepost | boolean | Whether this is a repost of another post |
| repostedBy | object/null | Who reposted (handle, displayName) |
| isReply | boolean | Whether this post is a reply |
| replyParentUri | string/null | URI of the parent post (if reply) |
| replyRootUri | string/null | URI of the thread root post (if reply) |
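
The `uri` and `postUrl` fields are related: an AT Protocol URI for a post has the shape `at://<authority>/app.bsky.feed.post/<record key>`, and bsky.app post links use `/profile/<handle or DID>/post/<record key>`. The Python sketch below shows one way to derive the link from the URI. The helper names `parse_at_uri` and `post_url` are hypothetical, and the actor's own conversion logic is not documented here, so treat this as an illustration only.

```python
from typing import NamedTuple, Optional

POST_COLLECTION = "app.bsky.feed.post"


class AtUri(NamedTuple):
    authority: str   # DID or handle of the repository owner
    collection: str  # record type, e.g. "app.bsky.feed.post"
    rkey: str        # record key


def parse_at_uri(uri: str) -> AtUri:
    """Split a record-level AT URI into its three path parts."""
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT URI: {uri!r}")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected at://authority/collection/rkey, got {uri!r}")
    return AtUri(*parts)


def post_url(uri: str, handle: Optional[str] = None) -> str:
    """Build a bsky.app link for a post URI, preferring the handle if given."""
    parsed = parse_at_uri(uri)
    if parsed.collection != POST_COLLECTION:
        raise ValueError(f"not a post record: {parsed.collection}")
    return f"https://bsky.app/profile/{handle or parsed.authority}/post/{parsed.rkey}"
```

For example, `post_url("at://did:plc:abc123/app.bsky.feed.post/3kxyz", "alice.bsky.social")` yields a link under `alice.bsky.social`. The DID and record key in that call are made-up placeholders.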

Example Post Output

{
  "uri": "at://did:plc:abc.../app.bsky.feed.post/xyz...",
  "cid": "bafyrei...",
  "authorHandle": "alice.bsky.social",
  "authorDisplayName": "Alice",
  "authorDid": "did:plc:abc...",
  "authorAvatar": "https://cdn.bsky.app/...",
  "authorFollowersCount": 5200,
  "authorFollowingCount": 340,
  "authorPostsCount": 1820,
  "text": "Just shipped our new feature and the response has been incredible...",
  "createdAt": "2026-03-25T10:30:00.000Z",
  "indexedAt": "2026-03-25T10:30:01.000Z",
  "language": "en",
  "languages": ["en"],
  "likeCount": 45,
  "repostCount": 12,
  "replyCount": 8,
  "quoteCount": 3,
  "bookmarkCount": 2,
  "engagementScore": 105,
  "postUrl": "https://bsky.app/profile/alice.bsky.social/post/xyz...",
  "hashtags": ["buildinpublic", "startup"],
  "mentions": [],
  "facetLinks": [],
  "hasMedia": true,
  "mediaCount": 1,
  "mediaUrls": ["https://cdn.bsky.app/img/..."],
  "images": [{"url": "https://cdn.bsky.app/img/...", "thumb": "...", "alt": "Screenshot", "aspectRatio": null}],
  "videos": [],
  "links": [],
  "hasEmbeddedMedia": true,
  "quotedPost": null,
  "isRepost": false,
  "repostedBy": null,
  "isReply": false,
  "replyParentUri": null,
  "replyRootUri": null
}
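
The `engagementScore` in the example above can be reproduced from the documented formula (likes + reposts×2 + replies×3 + quotes×4). The following Python sketch recomputes it. The function name `engagement_score` is hypothetical, and treating missing or null counts as zero is an assumption.

```python
from typing import Any, Mapping

# Weights taken from the documented formula; bookmarkCount is not part of it.
WEIGHTS = {"likeCount": 1, "repostCount": 2, "replyCount": 3, "quoteCount": 4}


def engagement_score(post: Mapping[str, Any]) -> int:
    """Weighted sum of a post's engagement counts (missing counts count as 0)."""
    return sum(weight * int(post.get(field) or 0) for field, weight in WEIGHTS.items())


example = {"likeCount": 45, "repostCount": 12, "replyCount": 8,
           "quoteCount": 3, "bookmarkCount": 2}
print(engagement_score(example))  # 45 + 12*2 + 8*3 + 3*4 = 105
```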

Output Fields — Profiles

| Field | Type | Description |
| --- | --- | --- |
| did | string | Decentralized identifier |
| handle | string | Bluesky handle |
| displayName | string | Display name |
| description | string | Bio text |
| avatar | string | Avatar image URL |
| banner | string | Banner image URL |
| followersCount | number | Number of followers |
| followsCount | number | Number of accounts followed |
| postsCount | number | Total number of posts |
| createdAt | string | Account creation date |
| indexedAt | string | Last index date |
| profileUrl | string | Direct URL to profile on bsky.app |
| labels | string[] | Content labels |
| associated | object | Associated data (lists, feeds, etc.) |
| pinnedPost | string/null | URI of pinned post |

Code Examples

Python

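from apify_client import ApifyClient

# Replace with your Apify API token.
client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("cryptosignals/bluesky-scraper").call(run_input={
    "searchQuery": "AI tools",
    "scrapeType": "posts",
    "maxItems": 100,
    "sort": "top",
    "lang": "en",
    "resolveAuthors": True,
})

# Stream the results from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    # authorFollowersCount can be null, so fall back to "?" explicitly.
    followers = item.get("authorFollowersCount")
    followers = "?" if followers is None else followers
    print(f"@{item['authorHandle']} ({followers} followers): "
          f"{item['text'][:80]}... "
          f"(engagement: {item['engagementScore']})")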

JavaScript

import { ApifyClient } from 'apify-client';

// Replace with your Apify API token.
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('cryptosignals/bluesky-scraper').call({
    searchQuery: 'AI tools',
    scrapeType: 'posts',
    maxItems: 100,
    sort: 'top',
    lang: 'en',
    resolveAuthors: true,
});

// Fetch the results from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(item => {
    console.log(`@${item.authorHandle} (${item.authorFollowersCount ?? '?'} followers): ${item.text.slice(0, 80)}... (engagement: ${item.engagementScore})`);
});

cURL

# Start the run; the JSON response includes data.defaultDatasetId
curl -X POST "https://api.apify.com/v2/acts/cryptosignals~bluesky-scraper/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"searchQuery": "AI", "scrapeType": "posts", "maxItems": 50}'

# Get results once the run has finished (replace DATASET_ID with data.defaultDatasetId)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"

Use Cases

  • Social listening — monitor brand mentions and sentiment on Bluesky in real-time
  • Audience research — analyze follower demographics and growth for any account
  • Content strategy — find trending topics and high-engagement post patterns
  • Lead generation — discover people discussing topics relevant to your business
  • Academic research — collect public social media data at scale for analysis
  • Competitor monitoring — track what competitors post and how their audience reacts
  • Influencer discovery — find accounts with high engagement in specific niches

FAQ

Do I need a Bluesky account? No. The scraper uses the public AT Protocol API. No login or API key required.

How many results can I get? Up to 50,000 per run. For larger datasets, run multiple searches with different parameters.
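
One way to "run multiple searches with different parameters" is to split a long date range into smaller `since`/`until` windows and start one run per window. The Python sketch below generates such windows. The helper name `date_windows` and the one-day step in the usage note are illustrative assumptions, not part of the actor.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterator


def _iso_utc(moment: datetime) -> str:
    """Format an aware datetime as e.g. "2026-03-01T00:00:00Z"."""
    if moment.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def date_windows(start: datetime, end: datetime,
                 step: timedelta) -> Iterator[Dict[str, str]]:
    """Yield consecutive {"since", "until"} pairs covering [start, end)."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)  # the last window may be shorter
        yield {"since": _iso_utc(cursor), "until": _iso_utc(upper)}
        cursor = upper
```

Each yielded dict can be merged into a run input, for example `{"searchQuery": "AI", **window}` with a step of `timedelta(days=1)`. Whether `until` is inclusive is not stated here, so deduplicating the combined results by `uri` is a sensible precaution.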

What is the engagement score? A weighted metric: likes + reposts×2 + replies×3 + quotes×4. Higher scores indicate more engaging posts. Use it to quickly sort and filter high-value content.

What does resolveAuthors do? When enabled, the scraper fetches full profiles for each unique author, adding follower count, following count, and post count to every post item. This makes extra API calls, so it's opt-in.
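
The idea behind `resolveAuthors` (one profile lookup per unique author, with the counts copied onto every post by that author) can be sketched in Python as follows. This is an illustration and not the actor's code. `enrich_with_authors` is a hypothetical name, and `fetch_profile` is a hypothetical callable standing in for whatever profile lookup is used. The field names come from the output tables.

```python
from typing import Any, Callable, Dict, Iterable, List, Mapping


def enrich_with_authors(
    posts: Iterable[Mapping[str, Any]],
    fetch_profile: Callable[[str], Mapping[str, Any]],
) -> List[Dict[str, Any]]:
    """Attach author counts to posts, fetching each author's profile once."""
    cache: Dict[str, Mapping[str, Any]] = {}
    enriched: List[Dict[str, Any]] = []
    for post in posts:
        did = post["authorDid"]
        if did not in cache:
            cache[did] = fetch_profile(did)  # one lookup per unique author
        profile = cache[did]
        enriched.append({
            **post,
            "authorFollowersCount": profile.get("followersCount"),
            "authorFollowingCount": profile.get("followsCount"),
            "authorPostsCount": profile.get("postsCount"),
        })
    return enriched
```

Caching by DID is what keeps the extra API calls proportional to the number of distinct authors instead of the number of posts.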

Can I get private/blocked content? No. Only publicly available data is scraped.

How fresh is the data? Real-time. Each run fetches live data from the AT Protocol API.

What export formats are supported? JSON, CSV, Excel, XML, HTML. Connect via API, webhooks, Zapier, Make, or Google Sheets.
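
The export formats listed above map onto the `format` query parameter of the Apify dataset items endpoint used in the cURL example. A small Python sketch of building such a URL follows. The helper name `dataset_items_url` is hypothetical, and the token is left out of the URL here on the assumption that it is supplied separately.

```python
from urllib.parse import urlencode

# Display name from the FAQ -> value of the `format` query parameter.
EXPORT_FORMATS = {"JSON": "json", "CSV": "csv", "Excel": "xlsx",
                  "XML": "xml", "HTML": "html"}


def dataset_items_url(dataset_id: str, export_format: str = "JSON") -> str:
    """Return the dataset items URL for one of the listed export formats."""
    try:
        fmt = EXPORT_FORMATS[export_format]
    except KeyError:
        raise ValueError(f"unsupported format: {export_format!r}") from None
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"
```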

Integrations

  • REST API — trigger runs and fetch results programmatically
  • Webhooks — get notified when scraping completes
  • Zapier / Make — connect to 5,000+ apps
  • Google Sheets — export directly to spreadsheets
  • Slack / Email — set up alerts for new posts matching your keywords

Using proxies

Bluesky's AT Protocol API enforces per-IP rate limits that trigger HTTP 429 responses during bulk data collection. Once rate-limited, your IP can be throttled for extended periods. Residential proxies distribute requests across real ISP addresses, keeping each IP well under the rate threshold. ThorData provides 200M+ residential IPs that work well for sustained AT Protocol scraping without hitting rate walls.