Bluesky Scraper — Posts, Profiles & Search

Pricing

$3.00 / 1,000 results scraped

Scrape Bluesky profiles and posts — extract handle, bio, followers, following, post text, likes, reposts, and timestamps. CSV/JSON output. No API key.

Rating

0.0

(0)

Developer

Web Data Labs

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 16
Monthly active users: 3
Last modified: 3 days ago

Bluesky Scraper — Posts, Profiles, Threads & Followers

The most comprehensive Bluesky scraper on Apify. Extract posts, profiles, threads, followers, following lists, and custom feeds from Bluesky at scale. No login required. Uses the official AT Protocol for fast, reliable data collection.

Why Use This Scraper?

Bluesky has exploded to over 30 million users and is now a critical platform for tech, politics, journalism, and culture. Unlike X/Twitter, Bluesky is built on the open AT Protocol — but navigating the raw protocol is complex and time-consuming.

This actor gives you a simple, powerful interface to extract any Bluesky data. Search by keyword, scrape specific users, pull entire conversation threads, or tap into custom algorithm feeds — all with clean, structured output ready for analysis.

Key Features

  • 6 scrape modes: Posts, profiles, both, followers, following, and threads
  • Keyword search: Find posts across all of Bluesky with sort by latest or top
  • Handle-based scraping: Target specific users for posts, profiles, or social graphs
  • Thread extraction: Pull full conversation threads with configurable depth
  • Custom feed support: Scrape any Bluesky algorithm/custom feed
  • Advanced filters: Date range, language, author, domain, mentions, hashtags
  • Author enrichment: Optionally resolve full author profiles with follower/post counts
  • No login required: Uses public AT Protocol endpoints
  • Scales to 50,000 items: Handle large-scale data collection
  • Multiple export formats: JSON, CSV, Excel, XML, HTML

Use Cases

1. Social Media Monitoring & Brand Tracking

Track mentions of your brand, product, or competitors across Bluesky. Use keyword search with date filters to detect spikes in conversation and sentiment shifts in real time.

2. Influencer Discovery & Analysis

Search for posts about your industry, then use the resolveAuthors option to get follower counts and posting frequency. Identify high-engagement accounts for partnerships or outreach.
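
As a rough illustration of that workflow, the sketch below groups already-scraped posts by author and ranks authors by average engagement. It assumes the items carry the authorHandle, engagementScore, and authorFollowersCount fields shown in the sample output (the last one only when resolveAuthors is enabled). The function name rank_authors is hypothetical and not part of the actor.

```python
from collections import defaultdict


def rank_authors(posts):
    """Aggregate posts per author and sort by average engagement (hypothetical helper)."""
    stats = defaultdict(lambda: {"posts": 0, "engagement": 0, "followers": 0})
    for post in posts:
        entry = stats[post["authorHandle"]]
        entry["posts"] += 1
        entry["engagement"] += post.get("engagementScore") or 0
        # authorFollowersCount is only present when resolveAuthors was enabled.
        entry["followers"] = max(entry["followers"], post.get("authorFollowersCount") or 0)

    rows = [
        {"handle": handle, **entry, "avgEngagement": entry["engagement"] / entry["posts"]}
        for handle, entry in stats.items()
    ]
    return sorted(rows, key=lambda row: row["avgEngagement"], reverse=True)
```

Passing the dataset items from a keyword search into rank_authors gives one row per author, which can then be filtered by follower count or posting volume.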

3. Political & Journalism Research

Bluesky has become a hub for political discourse and journalism. Scrape posts by topic, track specific journalists or politicians, and analyze conversation threads around breaking news.

4. Competitive Intelligence

Monitor what users say about competitor products. Use domain filters to find posts linking to competitor websites. Track sentiment and feature requests mentioned in public posts.

5. Trend Analysis & Content Strategy

Analyze what topics, hashtags, and content formats perform best on Bluesky. Use the top sort to find high-engagement posts and study what makes them successful.

6. Academic & Social Science Research

Collect large datasets of public discourse for academic studies. Filter by language, date range, and topic. Export structured data for quantitative analysis in R, Python, or SPSS.

7. Community Mapping

Scrape followers and following lists to map social graphs. Understand community structures, identify key connectors, and analyze how information flows through networks.
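
One simple way to look for connectors is to count how many of your target accounts each follower appears under. The sketch below assumes you have already run the actor in followers mode once per target and collected the items yourself into a dict keyed by target handle. That dict layout and the shared_followers name are illustrative assumptions, not actor output.

```python
from collections import Counter


def shared_followers(followers_by_target, min_targets=2):
    """Return handles that follow at least min_targets of the scraped accounts."""
    counts = Counter()
    for followers in followers_by_target.values():
        # De-duplicate within one target so a repeated item is counted once.
        for handle in {item["handle"] for item in followers}:
            counts[handle] += 1
    return {handle: n for handle, n in counts.items() if n >= min_targets}
```

Accounts that show up under many targets are candidates for the "key connectors" mentioned above; a graph library can take the same data as an edge list if you need more than counts.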

8. Content Aggregation & Curation

Pull posts from custom feeds or specific hashtags to power content aggregation tools, newsletters, or dashboards. Schedule recurring runs for automated content pipelines.

Input Parameters

Parameter | Type | Required | Default | Description
searchQuery | String | No | - | Keyword to search for in posts. Example: artificial intelligence, crypto
handles | Array of strings | No | - | Bluesky handles to scrape (e.g., jay.bsky.team). Required for posts/profiles/followers/following modes
scrapeType | String | No | posts | What to scrape: posts, profiles, both, followers, following
maxItems | Integer | No | 100 | Maximum results per search/handle/feed (1–50,000)
sort | String | No | latest | Sort order: latest (newest first) or top (most relevant)
since | String | No | - | Only posts after this date. Format: YYYY-MM-DD
until | String | No | - | Only posts before this date. Format: YYYY-MM-DD
lang | String | No | - | Filter by language code (e.g., en, ja, pt)
author | String | No | - | Filter search results to posts by this handle only
domain | String | No | - | Filter posts containing links to this domain (e.g., github.com)
mentions | String | No | - | Filter posts mentioning this handle
tag | Array of strings | No | - | Filter by hashtags (without #). Example: ["ai", "tech"]
threadUris | Array of strings | No | - | AT-URI(s) of posts to scrape full threads
threadDepth | Integer | No | 6 | Reply depth for thread scraping (1–1,000)
feedUris | Array of strings | No | - | Custom feed/algorithm URIs to scrape
feedFilter | String | No | All | Author feed filter: posts_with_replies, posts_no_replies, posts_with_media, posts_and_author_threads
resolveAuthors | Boolean | No | false | Fetch full author profiles (adds follower/following/post counts to each post)
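
Before starting a run, it can help to check an input object against the ranges and formats listed above. The following is a minimal client-side sketch: the allowed values are copied from the table, but validate_input itself is a hypothetical helper, and the actor may apply its own, different validation.

```python
import re
from datetime import date

# Allowed values and ranges copied from the parameter table.
SCRAPE_TYPES = {"posts", "profiles", "both", "followers", "following"}
SORT_ORDERS = {"latest", "top"}
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def validate_input(run_input):
    """Return a list of problems found in run_input (empty if none)."""
    problems = []

    if run_input.get("scrapeType", "posts") not in SCRAPE_TYPES:
        problems.append(f"scrapeType must be one of {sorted(SCRAPE_TYPES)}")

    if run_input.get("sort", "latest") not in SORT_ORDERS:
        problems.append("sort must be 'latest' or 'top'")

    if not 1 <= run_input.get("maxItems", 100) <= 50_000:
        problems.append("maxItems must be between 1 and 50,000")

    if not 1 <= run_input.get("threadDepth", 6) <= 1_000:
        problems.append("threadDepth must be between 1 and 1,000")

    for key in ("since", "until"):
        value = run_input.get(key)
        if value is None:
            continue
        try:
            if not DATE_PATTERN.fullmatch(str(value)):
                raise ValueError(value)
            date.fromisoformat(str(value))  # rejects impossible dates such as 2026-02-30
        except ValueError:
            problems.append(f"{key} must use the YYYY-MM-DD format")

    return problems
```

An empty list means the input matches the documented constraints; anything else can be shown to the user before spending credits on a run.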

Sample Output

Posts

{
  "authorHandle": "jay.bsky.team",
  "authorDisplayName": "Jay Graber",
  "text": "Excited to announce Bluesky's new custom feeds feature...",
  "createdAt": "2026-03-08T14:22:00.000Z",
  "language": "en",
  "likeCount": 1247,
  "repostCount": 389,
  "replyCount": 156,
  "quoteCount": 87,
  "engagementScore": 1879,
  "hasMedia": true,
  "mediaCount": 2,
  "mediaUrls": ["https://cdn.bsky.app/img/feed/..."],
  "isReply": false,
  "hashtags": ["bluesky", "atprotocol"],
  "postUrl": "https://bsky.app/profile/jay.bsky.team/post/3k...",
  "uri": "at://did:plc:.../app.bsky.feed.post/3k...",
  "authorFollowersCount": 485000,
  "authorPostsCount": 3200
}

Profiles

{
  "handle": "jay.bsky.team",
  "displayName": "Jay Graber",
  "description": "CEO @bluesky. Building the AT Protocol.",
  "avatar": "https://cdn.bsky.app/img/avatar/...",
  "banner": "https://cdn.bsky.app/img/banner/...",
  "followersCount": 485000,
  "followingCount": 1200,
  "postsCount": 3200,
  "createdAt": "2023-02-15T00:00:00.000Z",
  "did": "did:plc:...",
  "profileUrl": "https://bsky.app/profile/jay.bsky.team"
}

Followers / Following

{
  "handle": "user.bsky.social",
  "displayName": "Active User",
  "description": "Tech enthusiast and developer",
  "avatar": "https://cdn.bsky.app/img/avatar/...",
  "followersCount": 1200,
  "followingCount": 800,
  "postsCount": 450,
  "did": "did:plc:..."
}

Integration Examples

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Search for posts about AI
run_input = {
    "searchQuery": "artificial intelligence",
    "scrapeType": "posts",
    "maxItems": 200,
    "sort": "top",
    "lang": "en",
    "since": "2026-03-01",
}
run = client.actor("cryptosignals/bluesky-scraper").call(run_input=run_input)

# Print the author, a text preview, and the like count of each post
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"@{item['authorHandle']}: {item['text'][:80]}... ({item['likeCount']} likes)")

# Get followers of a specific account
run_input = {
    "handles": ["jay.bsky.team"],
    "scrapeType": "followers",
    "maxItems": 1000,
}
run = client.actor("cryptosignals/bluesky-scraper").call(run_input=run_input)

# Print each follower's handle and follower count
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"@{item['handle']}: {item.get('followersCount', 0)} followers")

Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Search Bluesky posts
const input = {
    searchQuery: "startup funding",
    scrapeType: "posts",
    maxItems: 100,
    sort: "latest",
};
const run = await client.actor("cryptosignals/bluesky-scraper").call(input);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(item => {
    console.log(`@${item.authorHandle}: ${item.text.substring(0, 80)}...`);
    console.log(` Likes: ${item.likeCount} | Reposts: ${item.repostCount}`);
});

// Scrape a conversation thread
const threadInput = {
    threadUris: ["at://did:plc:abc123/app.bsky.feed.post/xyz789"],
    threadDepth: 10,
};
const threadRun = await client.actor("cryptosignals/bluesky-scraper").call(threadInput);
const { items: threadItems } = await client.dataset(threadRun.defaultDatasetId).listItems();
console.log(`Thread has ${threadItems.length} posts`);

Using the Apify API Directly

curl -X POST "https://api.apify.com/v2/acts/cryptosignals~bluesky-scraper/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "searchQuery": "AI agents",
    "scrapeType": "posts",
    "maxItems": 50,
    "sort": "top",
    "lang": "en"
  }'

Pricing & Costs

This actor runs on the Apify platform using your account's compute units (CUs).

Scenario | Estimated Cost
100 posts (keyword search) | ~$0.01–$0.02
500 posts with author enrichment | ~$0.05–$0.10
1,000 followers of an account | ~$0.02–$0.05
5,000 posts (large dataset) | ~$0.10–$0.25

This actor uses the public AT Protocol API and does not require residential proxies, keeping costs low. The free Apify plan ($5/month in credits) is enough for most use cases.

Tips for Best Results

  1. Combine search with filters: Use searchQuery with lang, since, and sort for precise results.
  2. Enable author enrichment selectively: resolveAuthors: true adds follower counts but increases API calls. Use it when you need engagement-to-audience ratios.
  3. Use handles for targeted scraping: When you know which accounts to monitor, use handles instead of search for more complete data.
  4. Thread scraping: Take the AT-URI from the uri field of a scraped post, then pass it to threadUris to pull the full conversation.
  5. Custom feeds: Find feed URIs from the Bluesky app (share a feed > copy link) and pass them to feedUris.
  6. Schedule for monitoring: Set up hourly or daily scrapes with Apify scheduling for continuous brand monitoring or trend tracking.
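
Tip 4 can be turned into a two-step workflow: scrape posts first, then feed the uri values of the most-discussed ones into a thread run. The sketch below only builds the second input object from items you already have. build_thread_input is a hypothetical helper, and picking posts by replyCount is one possible choice, not something the actor prescribes.

```python
def build_thread_input(posts, top_n=5, depth=10):
    """Build a thread-scraping input from the posts with the most replies."""
    ranked = sorted(posts, key=lambda post: post.get("replyCount") or 0, reverse=True)
    uris = [post["uri"] for post in ranked[:top_n] if post.get("uri")]
    # threadUris and threadDepth are the documented input parameters.
    return {"threadUris": uris, "threadDepth": depth}
```

The returned dict can be passed as run_input to the same actor call used in the integration examples.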

Frequently Asked Questions

Do I need a Bluesky account to use this scraper?

No. This scraper uses the public AT Protocol API, which doesn't require authentication. All publicly available data can be scraped without a login.

Can I scrape private/blocked accounts?

No. The scraper only accesses publicly available data through the AT Protocol. Private accounts and blocked content are not accessible.

What's the maximum number of items I can scrape?

Up to 50,000 items per run. For larger datasets, split across multiple runs with date filters or different search queries.
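
One way to split a large collection is to generate one input per date window using the since and until parameters. The sketch below is a minimal example under stated assumptions: date_windows and windowed_inputs are hypothetical helpers, and because the listing does not say whether since/until boundaries are inclusive, adjacent windows share a boundary date and results should be de-duplicated by uri afterwards.

```python
from datetime import date, timedelta


def date_windows(start, end, days=7):
    """Yield (since, until) strings in YYYY-MM-DD format covering start..end."""
    current = start
    while current < end:
        window_end = min(current + timedelta(days=days), end)
        yield current.isoformat(), window_end.isoformat()
        current = window_end


def windowed_inputs(query, start, end, days=7, max_items=50_000):
    """Build one actor input per date window for the same search query."""
    return [
        {
            "searchQuery": query,
            "scrapeType": "posts",
            "since": since,
            "until": until,
            "maxItems": max_items,
        }
        for since, until in date_windows(start, end, days)
    ]
```

Each dict in the returned list can be submitted as a separate run; shrinking days keeps busy topics under the per-run cap.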

How do I find a user's Bluesky handle?

Go to their profile on bsky.app — the handle is in the URL (e.g., bsky.app/profile/jay.bsky.team). Custom domains work too (e.g., jay.bsky.team instead of jay.bsky.social).

What is an AT-URI and how do I get one?

AT-URIs are the internal identifiers for Bluesky content (format: at://did:plc:.../app.bsky.feed.post/...). You'll find them in the uri field of scraped posts. Use them for thread scraping.
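
For post URIs of the shape shown above, the three parts are the account identifier (a DID), the record collection, and the record key. The sketch below splits such a URI. parse_at_uri and the AtUri tuple are hypothetical names, and the parser deliberately handles only this three-part form rather than the full AT-URI syntax.

```python
from typing import NamedTuple


class AtUri(NamedTuple):
    authority: str   # e.g. a DID such as did:plc:abc123
    collection: str  # e.g. app.bsky.feed.post
    rkey: str        # record key of the post


def parse_at_uri(uri):
    """Split an at://authority/collection/rkey URI into its three parts."""
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT-URI: {uri!r}")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected at://authority/collection/rkey, got {uri!r}")
    return AtUri(*parts)
```

This is handy for sanity-checking values before passing them to threadUris, or for grouping scraped posts by author DID.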

Can I filter posts by engagement (e.g., only posts with 100+ likes)?

The scraper returns all posts matching your criteria. Filter by engagement after export — in a spreadsheet, Python/pandas, or any data tool.
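
A minimal post-processing sketch in plain Python, assuming items is a list of post dicts with the likeCount field shown in the sample output (filter_by_likes is a hypothetical helper):

```python
def filter_by_likes(items, min_likes=100):
    """Keep only posts with at least min_likes likes, most-liked first."""
    kept = [item for item in items if (item.get("likeCount") or 0) >= min_likes]
    return sorted(kept, key=lambda item: item["likeCount"], reverse=True)
```

The same pattern works for repostCount, replyCount, or engagementScore by swapping the field name.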

How does the engagementScore work?

The engagement score is calculated as likes + reposts + replies + quotes. It gives you a single number to quickly identify high-performing posts.
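
The formula can be reproduced from the individual counters, which is useful for recomputing or checking the score on exported data. The sketch below applies it to the counts from the sample post output; engagement_score is a hypothetical helper name.

```python
def engagement_score(post):
    """Sum likes, reposts, replies, and quotes, treating missing counts as 0."""
    fields = ("likeCount", "repostCount", "replyCount", "quoteCount")
    return sum(post.get(field) or 0 for field in fields)


sample_post = {"likeCount": 1247, "repostCount": 389, "replyCount": 156, "quoteCount": 87}
print(engagement_score(sample_post))  # 1879, matching engagementScore in the sample output
```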

Is the data real-time?

Yes. Every run fetches live data from the Bluesky network. There's no caching or delay — you get the latest available data.

Can I use this for sentiment analysis?

Absolutely. Scrape posts about a topic, export the text field, and run it through any NLP or sentiment analysis tool. The structured output makes it easy to pipe into Python, R, or cloud AI services.

What's the difference between searchQuery and handles?

searchQuery searches across all of Bluesky for matching posts. handles targets specific user accounts. You can use one or both depending on your needs.