Bluesky Scraper
Pricing
Pay per usage
Search posts, get profiles, and extract feeds from Bluesky. Uses AT Protocol API. No login required.
Developer
Tugelbay Konabayev
Bluesky Scraper — Extract Posts, Profiles & Feeds from Bluesky
Search posts by keyword, extract user profiles, and scrape feeds from Bluesky social network using the AT Protocol API. No browser needed — pure API calls make this fast, lightweight, and affordable. Get up to 10,000 results per run in clean, structured JSON. Works with or without authentication.
What It Does
Bluesky Scraper is a 4-in-1 actor for extracting data from Bluesky, the decentralized social network built on AT Protocol. It supports:
- Search Posts — Find posts by keyword, hashtag, or user mention (optional auth)
- Search Users — Find user profiles by name or handle (optional auth)
- Get Profiles — Extract detailed profile data for one or more users (no auth needed)
- Get User Feed — Scrape a user's posts, threads, and media (no auth needed)
Each mode returns clean, structured JSON with post metadata (text, likes, reposts, replies, images, embeds), profile data (handle, followers, bio, joined date), and direct post URLs. Optional authentication enables richer search results, but profiles and feeds work completely unauthenticated — perfect for public data extraction.
Bluesky has 30M+ registered users and is one of the fastest-growing social networks. The AT Protocol is open and federated: accounts and their data can be hosted on different servers rather than on a single company's platform.
How It Compares to Competitors
| Feature | Our Actor | george.the.developer | automation-lab | botflowtech |
|---|---|---|---|---|
| Search posts | ✓ | ✓ | ✓ | ✓ |
| Search users | ✓ | ✗ | ✗ | ✓ |
| Get profiles | ✓ | ✓ | ✓ | ✓ |
| Get user feed | ✓ | ✓ | ✓ | ✓ |
| Single actor, all modes | ✓ | ✗ (separate) | ✗ (separate) | ✓ |
| Auth optional | ✓ | ✓ | ✗ (required) | ✓ |
| Language filtering | ✓ | ✗ | ✗ | ✗ |
| Feed filtering | ✓ | ✗ | ✗ | ✗ |
| Up to 10K results | ✓ | ✓ | ✓ | ✓ |
| Price per 1K | PPE | $1.50 | Free tier | ~$1.20 |
| Users | New | ~120 | ~90 | ~70 |
| Rating | — | 4.1 ⭐ | 4.3 ⭐ | 3.9 ⭐ |
Why choose our actor:
- 4 modes in 1 actor — search + profiles + feeds without switching between tools
- Optional auth — get started immediately without a Bluesky account
- Language filtering — extract posts in specific languages (en, es, ja, etc.)
- Feed filtering — choose posts only, posts + threads, or media-only feeds
- Clean, structured output — all fields documented with examples
- PPE pricing — pay only for results you use, with first 100 results free
Key Features
- Search Posts by Keyword: Find posts, hashtags, discussions. Sort by latest or top engagement.
- Search Users: Discover profiles by name or handle.
- Get User Profiles: Extract handle, display name, bio, followers, posts count, avatar, banner, joined date.
- Get User Feeds: Scrape up to 10,000 posts from any user's feed.
- Optional Authentication: Log in to unlock richer search results (recommended for production).
- No Auth Needed for Profiles/Feeds: Public data extraction works without login.
- Pagination Support: Cursor-based pagination handles large result sets.
- Sort & Filter: Sort by latest or top engagement, filter by language, filter feeds by type.
- Rich Post Data: Text, author, likes, replies, reposts, quotes, images, embeds, direct URLs, language detection.
- Structured JSON Output: All fields documented with clear examples.
- Up to 10,000 Results: Scale from small extractions to large datasets.
- Error Handling: Graceful fallbacks, rate limit handling, clear error messages.
- Fast & Lightweight: Pure API calls, no browser overhead, runs in seconds.
Input Examples
Example 1: Search Posts by Keyword (Latest)
Find 50 recent posts about "web scraping" without authentication.

    {
      "mode": "searchPosts",
      "query": "web scraping",
      "maxItems": 50,
      "sort": "latest"
    }

Example 2: Search Posts with Authentication & Language Filter
Find 100 top-engagement posts about "AI" in English, using auth for better results.

    {
      "mode": "searchPosts",
      "query": "AI",
      "maxItems": 100,
      "sort": "top",
      "language": "en",
      "blueskyHandle": "yourname.bsky.social",
      "blueskyAppPassword": "xxxx-xxxx-xxxx-xxxx"
    }

Example 3: Search Users
Find 30 user profiles matching "data scientist".

    {
      "mode": "searchUsers",
      "query": "data scientist",
      "maxItems": 30
    }

Example 4: Get Multiple User Profiles (No Auth Needed)
Extract detailed profile data for 5 specific users.

    {
      "mode": "getProfiles",
      "handles": [
        "jay.bsky.team",
        "jack.bsky.social",
        "paulmozilla.com",
        "darnelle.bsky.social",
        "pfrazee.com"
      ]
    }

Example 5: Get User Feed with Media Filter
Extract all media posts from a user's feed (max 10,000).

    {
      "mode": "getUserFeed",
      "handles": ["jack.bsky.social"],
      "maxItems": 10000,
      "feedFilter": "posts_with_media"
    }

Input Parameters
| Parameter | Type | Description | Default | Required |
|---|---|---|---|---|
| mode | string | Scraping mode: searchPosts, searchUsers, getProfiles, or getUserFeed | searchPosts | No |
| query | string | Search query for posts or users. Examples: "web scraping", "#tech", "from:user". Only used in search modes. | — | Conditional* |
| handles | string array | Bluesky handles for profile/feed modes. Examples: ["jay.bsky.team", "jack.bsky.social"]. Only used in getProfiles and getUserFeed. | — | Conditional* |
| maxItems | integer | Maximum number of results to return. Min: 1, Max: 10,000 | 100 | No |
| sort | string | Sort order for search results: latest or top (most engagement). Only applies to searchPosts mode. | latest | No |
| language | string | Filter posts by language code (e.g., en, es, ja, de, fr). Leave empty for all languages. Only applies to searchPosts mode. | — | No |
| blueskyHandle | string | Your Bluesky handle for authentication (e.g., yourname.bsky.social). Optional: providing it in search modes unlocks richer results. Not needed for profiles/feeds. Requires an app password, created at bsky.app/settings/app-passwords. | — | No |
| blueskyAppPassword | string | App password for authentication (NOT your main password). Create at bsky.app/settings/app-passwords. Required if blueskyHandle is provided. Secret field — not logged. | — | No |
| feedFilter | string | Feed type filter for getUserFeed mode only. Options: posts_and_author_threads (posts + threads), posts_no_replies (posts only), posts_with_media (media only). | posts_and_author_threads | No |
*Conditional: query is required for searchPosts and searchUsers. handles is required for getProfiles and getUserFeed.
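
The conditional rules in the table above can be checked before starting a run. The following is a minimal sketch, assuming the parameter names, defaults, and limits are exactly as documented in the table; `validate_input` is a hypothetical helper, not part of the actor or the Apify client.

```python
SEARCH_MODES = {"searchPosts", "searchUsers"}
HANDLE_MODES = {"getProfiles", "getUserFeed"}


def validate_input(actor_input: dict) -> list:
    """Return a list of problems found in an actor input (empty if valid)."""
    problems = []
    mode = actor_input.get("mode", "searchPosts")  # documented default

    if mode not in SEARCH_MODES | HANDLE_MODES:
        problems.append(f"unknown mode: {mode!r}")
    if mode in SEARCH_MODES and not actor_input.get("query"):
        problems.append("query is required for search modes")
    if mode in HANDLE_MODES and not actor_input.get("handles"):
        problems.append("handles is required for getProfiles and getUserFeed")

    max_items = actor_input.get("maxItems", 100)  # documented default
    if not isinstance(max_items, int) or not 1 <= max_items <= 10_000:
        problems.append("maxItems must be an integer between 1 and 10,000")

    # The table states the app password is required once a handle is given
    if actor_input.get("blueskyHandle") and not actor_input.get("blueskyAppPassword"):
        problems.append("blueskyAppPassword is required when blueskyHandle is set")

    return problems
```

Calling `validate_input({"mode": "getProfiles"})` returns a single problem about the missing `handles`, which is cheaper to catch locally than after a run has started.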
Output Format
The actor returns two views: Posts and Profiles. Choose the view relevant to your mode.
Posts View (searchPosts, getUserFeed modes)

    {
      "mode": "posts",
      "data": [
        {
          "uri": "at://did:plc:abc123/app.bsky.feed.post/abc123",
          "cid": "bafy...",
          "authorHandle": "jack.bsky.social",
          "authorDid": "did:plc:abc123",
          "authorDisplayName": "Jack Dorsey",
          "authorAvatar": "https://cdn.bsky.app/img/...",
          "text": "Bluesky is an open social network built on an open protocol. It's now open to everyone.",
          "likeCount": 5234,
          "replyCount": 890,
          "repostCount": 2103,
          "quoteCount": 456,
          "createdAt": "2024-03-15T14:22:33.000Z",
          "indexedAt": "2024-03-15T14:25:10.000Z",
          "language": "en",
          "images": [
            {"url": "https://cdn.bsky.app/img/...", "alt": "Screenshot of Bluesky"}
          ],
          "embeds": [
            {
              "type": "link",
              "title": "Bluesky Homepage",
              "description": "The open social network",
              "url": "https://bsky.app"
            }
          ],
          "postUrl": "https://bsky.app/profile/jack.bsky.social/post/abc123",
          "parentPostUri": null,
          "rootPostUri": null,
          "isReply": false,
          "isRepost": false
        }
      ]
    }

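
In the sample above, `postUrl` appears to combine `authorHandle` with the last path segment of `uri` (the record key). The sketch below rebuilds a post URL on that assumption; `post_url_from_uri` is a hypothetical helper, and the actor itself may derive the URL differently.

```python
def post_url_from_uri(uri: str, handle: str) -> str:
    """Build a bsky.app post URL from an AT-URI and the author's handle."""
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT-URI: {uri!r}")
    # AT-URI layout: at://<authority>/<collection>/<record key>
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or parts[1] != "app.bsky.feed.post":
        raise ValueError(f"not a post record URI: {uri!r}")
    record_key = parts[2]
    return f"https://bsky.app/profile/{handle}/post/{record_key}"
```

This can help when a downstream system stores only `uri` and `authorHandle` and needs a clickable link later.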
Profiles View (getProfiles, searchUsers modes)

    {
      "mode": "profiles",
      "data": [
        {
          "did": "did:plc:abc123",
          "handle": "jack.bsky.social",
          "displayName": "Jack Dorsey",
          "bio": "Founder of Twitter and Square. Now building Bluesky.",
          "avatar": "https://cdn.bsky.app/img/...",
          "banner": "https://cdn.bsky.app/img/...",
          "followersCount": 125000,
          "followsCount": 340,
          "postsCount": 2100,
          "createdAt": "2023-04-15T10:20:15.000Z",
          "indexedAt": "2024-03-20T09:15:22.000Z",
          "viewer": {"isMuted": false, "isBlocked": false},
          "profileUrl": "https://bsky.app/profile/jack.bsky.social"
        }
      ]
    }

Example Output
Search Posts Result

    {
      "mode": "searchPosts",
      "query": "web scraping",
      "resultsCount": 3,
      "data": [
        {
          "uri": "at://did:plc:xyz789/app.bsky.feed.post/xyz789",
          "cid": "bafy...",
          "authorHandle": "scraper_dev.bsky.social",
          "authorDid": "did:plc:xyz789",
          "authorDisplayName": "Scraper Dev",
          "authorAvatar": "https://cdn.bsky.app/img/...",
          "text": "Just launched my new web scraping library. Check it out! #development #python",
          "likeCount": 245,
          "replyCount": 18,
          "repostCount": 67,
          "quoteCount": 12,
          "createdAt": "2024-03-20T16:30:00.000Z",
          "indexedAt": "2024-03-20T16:31:05.000Z",
          "language": "en",
          "images": [],
          "embeds": [
            {
              "type": "link",
              "title": "SuperScraper - Python Web Scraping",
              "description": "Fast and efficient web scraping library",
              "url": "https://github.com/scraper-dev/superscraper"
            }
          ],
          "postUrl": "https://bsky.app/profile/scraper_dev.bsky.social/post/xyz789",
          "parentPostUri": null,
          "rootPostUri": null,
          "isReply": false,
          "isRepost": false
        }
      ]
    }

Get Profiles Result

    {
      "mode": "getProfiles",
      "handles": ["jack.bsky.social"],
      "resultsCount": 1,
      "data": [
        {
          "did": "did:plc:eauuyk...",
          "handle": "jack.bsky.social",
          "displayName": "Jack Dorsey",
          "bio": "Former CEO of Twitter, founder of Bluesky",
          "avatar": "https://cdn.bsky.app/img/eauuyk.../avatar_32x32.jpg",
          "banner": "https://cdn.bsky.app/img/eauuyk.../banner_1200x400.png",
          "followersCount": 125432,
          "followsCount": 340,
          "postsCount": 2089,
          "createdAt": "2023-04-16T08:30:21.000Z",
          "indexedAt": "2024-03-20T14:05:18.000Z",
          "viewer": {"isMuted": false, "isBlocked": false},
          "profileUrl": "https://bsky.app/profile/jack.bsky.social"
        }
      ]
    }

Code Examples
Python

    import asyncio

    from apify_client import ApifyClientAsync


    async def scrape_bluesky():
        # ApifyClientAsync is the awaitable client; the plain ApifyClient is synchronous
        client = ApifyClientAsync("YOUR_APIFY_TOKEN")

        # Search posts about web scraping (the actor input is passed as run_input)
        run = await client.actor("tugelbay/bluesky-scraper").call(
            run_input={
                "mode": "searchPosts",
                "query": "web scraping",
                "maxItems": 50,
                "sort": "latest",
            }
        )

        # Fetch results; list_items() returns a page object with an .items list
        dataset = await client.dataset(run["defaultDatasetId"]).list_items()
        for post in dataset.items:
            print(f"@{post['authorHandle']}: {post['text'][:50]}...")
            print(f"  Likes: {post['likeCount']}, Replies: {post['replyCount']}\n")


    # Run
    asyncio.run(scrape_bluesky())

JavaScript

    import { ApifyClient } from "apify-client";

    const client = new ApifyClient({
      token: "YOUR_APIFY_TOKEN",
    });

    (async () => {
      // Get profiles for multiple users
      const run = await client.actor("tugelbay/bluesky-scraper").call({
        mode: "getProfiles",
        handles: ["jack.bsky.social", "pfrazee.com", "paulmozilla.com"],
      });

      // Process results
      const dataset = await client.dataset(run.defaultDatasetId).listItems();
      dataset.items.forEach((profile) => {
        console.log(`${profile.displayName} (@${profile.handle})`);
        console.log(`Followers: ${profile.followersCount}`);
        console.log(`Posts: ${profile.postsCount}\n`);
      });
    })();

LangChain Integration

    from langchain.tools import tool
    from apify_client import ApifyClient


    @tool
    def search_bluesky_posts(query: str, max_items: int = 100) -> list:
        """Search Bluesky posts by keyword and return results."""
        client = ApifyClient("YOUR_APIFY_TOKEN")
        run = client.actor("tugelbay/bluesky-scraper").call(
            run_input={
                "mode": "searchPosts",
                "query": query,
                "maxItems": max_items,
                "sort": "top",
            }
        )
        # list_items() returns a page object; the records are in .items
        dataset = client.dataset(run["defaultDatasetId"]).list_items()
        return [
            {
                "author": item["authorHandle"],
                "text": item["text"],
                "engagement": item["likeCount"] + item["replyCount"] + item["repostCount"],
            }
            for item in dataset.items
        ]


    # Use in an agent, or invoke directly. @tool wraps the function in a tool
    # object, so it is called through .invoke() with a dict of arguments.
    results = search_bluesky_posts.invoke({"query": "AI trends 2024", "max_items": 50})
    for post in results:
        print(f"{post['author']}: {post['engagement']} engagement")

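MCP Server Integration

    from mcp.server.fastmcp import FastMCP
    from apify_client import ApifyClientAsync

    # Uses FastMCP from the MCP Python SDK, which builds the tool schema from
    # the function signature (the low-level Server.call_tool() handler takes
    # a tool name and an arguments dict instead of typed parameters)
    app = FastMCP("bluesky-mcp")
    client = ApifyClientAsync("YOUR_APIFY_TOKEN")


    @app.tool()
    async def bluesky_search(query: str, mode: str = "searchPosts") -> dict:
        """MCP tool: Search Bluesky posts and profiles."""
        run = await client.actor("tugelbay/bluesky-scraper").call(
            run_input={"mode": mode, "query": query, "maxItems": 100, "sort": "top"}
        )
        dataset = await client.dataset(run["defaultDatasetId"]).list_items()
        return {"results": dataset.items, "count": len(dataset.items)}


    if __name__ == "__main__":
        app.run()
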
Use Cases
1. Social Media Monitoring — Track mentions of your brand, product, or competitors on Bluesky. Extract posts in real-time and analyze sentiment or engagement.
2. Lead Generation — Search for users interested in specific topics (e.g., "SaaS founders", "data engineers") and extract their profiles to build prospect lists.
3. Content Research & Curation — Find trending posts and discussions in your niche. Identify popular topics, hashtags, and influencers to inform your content strategy.
4. Influencer Identification — Search for high-follower accounts in your industry. Extract profile data and engagement metrics to identify potential brand ambassadors.
5. Competitive Analysis — Monitor competitor posts, engagement, and audience response. Track keyword mentions and trending discussions in your market.
6. Audience Insights — Extract profiles of followers for a user or set of users. Analyze follower demographics, interests, and engagement patterns.
7. Bot Development & Automation — Use Bluesky feed data to train chatbots or feed recommendation engines. Build automated responses or content suggestion tools.
8. Academic Research & Linguistics — Collect Bluesky posts in specific languages for linguistic analysis, sentiment research, or social network studies.
9. Crisis Monitoring — Track discussions around a crisis or incident in real-time. Extract posts, sentiment, and spread patterns for rapid response.
10. Newsletter & Report Generation — Extract top posts from your niche weekly to feed a newsletter or report. Highlight trending discussions and key opinions.
Cost Estimation
Bluesky Scraper uses PPE (Pay-Per-Event) pricing, where you pay based on actual results extracted.
Pricing Breakdown
| Action | Cost | Notes |
|---|---|---|
| First 100 results per month | FREE | Free tier — always free, no catch |
| Post extracted (PPE) | $0.002–$0.010 per post | Depends on data richness (text, images, embeds) |
| Profile extracted (PPE) | $0.001–$0.005 per profile | Basic profiles (handle, followers) cost less |
| User feed retrieval | $0.001 per post | Public feed, lightweight operation |
Example Scenarios
Scenario 1: Monthly brand monitoring (search + extract)
- 500 posts/month via search: first 100 free, then 400 × $0.005 = $2.00/month
Scenario 2: Lead generation (search users + get profiles)
- 200 profiles extracted: first 100 free, then 100 × $0.003 = $0.30/month
Scenario 3: Content research (get feeds)
- 1,000 posts from 5 user feeds: first 100 free, then 900 × $0.001 = $0.90/month
Scenario 4: Large-scale analysis (10K results)
- 10,000 posts extracted: first 100 free, then 9,900 × $0.005 = $49.50/month
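
The scenario arithmetic can be reproduced with a short estimator. This is a rough sketch, assuming the 100 free results are subtracted before billing and that a single average per-result price applies to the whole month; `estimate_monthly_cost` is a hypothetical helper, and actual charges depend on the actor's billing events.

```python
FREE_RESULTS_PER_MONTH = 100  # free tier from the pricing table


def estimate_monthly_cost(results: int, price_per_result: float) -> float:
    """Estimate monthly cost in USD after subtracting the free tier."""
    billable = max(0, results - FREE_RESULTS_PER_MONTH)
    return round(billable * price_per_result, 2)


# 500 searched posts at an assumed average of $0.005 per post
brand_monitoring = estimate_monthly_cost(500, 0.005)
# 10,000 posts at the same assumed average price
large_scale = estimate_monthly_cost(10_000, 0.005)
```

Because per-result prices are quoted as ranges, running the estimate at both ends of the range gives a more honest budget than a single figure.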
Comparison to Competitors
| Actor | Price per 1K | Cost for 10K results |
|---|---|---|
| Our Bluesky Scraper (PPE) | $2–$10 for posts (variable) | $20–$100 |
| george.the.developer | $1.50 (flat) | $15 |
| automation-lab | Free tier | Free → $0 |
| botflowtech | ~$1.20 (flat) | ~$12 |
Our advantage: With the free 100 results tier, small-scale monitoring costs almost nothing. Large-scale extractions cost more but pay for data quality (richer fields, faster execution).
FAQ
Q: Do I need a Bluesky account to use this actor?
A: For getProfiles and getUserFeed modes, no — public data works without login. For searchPosts and searchUsers, authentication is optional but recommended for richer, faster results. Without auth, search may return fewer results or require more retries.
Q: How do I create an app password for authentication?
A: Log into your Bluesky account, go to Settings > App passwords, click "Create App Password", give it a name (e.g., "Apify"), and copy the generated 16-character password. Important: This is NOT your main account password. It is a separate credential for API access.
Q: What's the difference between "latest" and "top" sort?
A: "Latest" returns posts in reverse chronological order (newest first). "Top" ranks posts by engagement (likes + replies + reposts), so you get the most-discussed posts first.
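
If results were fetched with the latest sort, a similar ordering can be approximated locally from the documented count fields. This is a sketch only: it uses the likes + replies + reposts sum described above, and Bluesky's own "top" ranking may weigh signals differently. The function names are hypothetical.

```python
def engagement(post: dict) -> int:
    """Sum likes, replies, and reposts; missing counts are treated as 0."""
    return (
        post.get("likeCount", 0)
        + post.get("replyCount", 0)
        + post.get("repostCount", 0)
    )


def rank_by_engagement(posts: list) -> list:
    """Return posts sorted from most to least engagement."""
    return sorted(posts, key=engagement, reverse=True)
```

Sorting locally keeps a single "latest" dataset usable for both chronological and engagement-based reports.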
Q: Can I filter posts by language?
A: Yes! Use the language parameter with ISO 639-1 codes: en (English), es (Spanish), ja (Japanese), de (German), fr (French), etc. Leave blank to include all languages.
Q: What does "feed filter" do in getUserFeed mode?
A: The feedFilter controls what you extract from a user's feed: (1) posts_and_author_threads — everything, (2) posts_no_replies — only posts, excluding replies, (3) posts_with_media — only posts with images or videos.
Q: Why does search sometimes return fewer results than I ask for?
A: Bluesky's search API limits results based on query specificity and feed size. Very specific queries (e.g., rare phrases) may return fewer matches. Also, if you don't authenticate, the API may rate-limit you after a few requests.
Q: Can I extract private/protected posts?
A: No. The actor only accesses public posts visible on Bluesky's public network. Private/protected posts require explicit follow/permission from the post author.
Q: Is there a way to filter by post date range?
A: Not directly through actor parameters, but you can use query syntax: e.g., query: "web scraping since:2024-01-01" to limit to posts after a date. Check Bluesky's search syntax documentation for advanced options.
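
Another option is to filter the extracted posts by their `createdAt` field after the run. The sketch below assumes timestamps are formatted as in the output examples (ISO 8601 with a trailing `Z`); `parse_created_at` and `filter_by_date` are hypothetical helpers.

```python
from datetime import datetime, timezone


def parse_created_at(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2024-03-15T14:22:33.000Z."""
    # Older Python versions do not accept a trailing "Z", so swap in an offset
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def filter_by_date(posts: list, start: datetime, end: datetime) -> list:
    """Keep posts whose createdAt falls within [start, end)."""
    return [p for p in posts if start <= parse_created_at(p["createdAt"]) < end]


# Example bounds: the first quarter of 2024, in UTC
q1_start = datetime(2024, 1, 1, tzinfo=timezone.utc)
q1_end = datetime(2024, 4, 1, tzinfo=timezone.utc)
```

Client-side filtering still pays for every extracted post, so it works best combined with a query that already narrows the results.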
Q: What's the maximum result set I can extract?
A: 10,000 results per run. For larger extractions, run the actor multiple times with different queries or pagination cursors, or schedule nightly runs.
Troubleshooting
Problem: "Authentication failed" error
Cause: Invalid Bluesky handle or app password. Fix:
- Verify your Bluesky handle (e.g., `yourname.bsky.social`, not just `yourname`)
- Check that you've created an app password at `bsky.app/settings/app-passwords` (not your main password)
- Regenerate the app password and try again
- For search modes, try running without auth first to verify the actor works
Problem: Search returns very few results
Cause: Query is too specific, or you're hitting rate limits without authentication. Fix:
- Simplify your query (e.g., use single keywords instead of long phrases)
- Add authentication (`blueskyHandle` + `blueskyAppPassword`) to unlock richer results
- Use hashtags (e.g., `#tech`) or user handles (e.g., `from:jack.bsky.social`) for better targeting
- Check Bluesky's search syntax: supports AND/OR, quoted phrases, hashtags, user mentions
Problem: Actor times out or returns incomplete results
Cause: Network latency or API rate limiting. Fix:
- Reduce `maxItems` to a smaller batch (e.g., 100 instead of 10,000)
- Run the actor again with pagination (if supported) to fetch the next batch
- Check if Bluesky's API is experiencing issues (check their status page)
- Use authentication to raise your rate limits
Problem: Images or embeds are missing from results
Cause: Some posts may not have images/embeds, or the API returns limited data for certain embed types. Fix:
- Check the raw JSON output: if `images: []` or `embeds: []`, the post genuinely has no media
- Use `feedFilter: "posts_with_media"` to extract only posts with media
- Some embed types (like video) may not be fully supported; check the actor logs for warnings
Limitations
- Public data only: Cannot extract private posts, direct messages, or protected accounts. The actor respects Bluesky's access control.
- Rate limiting: Without authentication, you may hit Bluesky's public API rate limits after 50–100 requests. Add authentication to increase your quota significantly.
- No real-time firehose: The actor uses Bluesky's search API, not the real-time event stream (Jetstream). For live monitoring, consider using AT Protocol's WebSocket APIs directly.
- Historical data limits: Bluesky's search is optimized for recent posts (last 30–90 days). Older posts may not be fully indexed or searchable.
- Character encoding: Some emojis and non-ASCII characters may not render correctly in all export formats (JSON is safe, but CSV may have issues). Export to JSON for full fidelity.
- Embed types: Some complex embeds (videos, custom feeds, bridge posts) may return limited metadata. Text posts, links, and images are fully supported.
- Search syntax: Bluesky's search supports basic queries (keywords, hashtags, user mentions). Date limits are available only through query operators such as since:2024-01-01, not as actor parameters, and geolocation filters are not supported.
- No follower list: The actor extracts profile data but not follower lists. To get a user's followers, you'd need separate follower-scraping logic.
Changelog
v1.0 (April 2024)
Initial release
- 4 scraping modes: `searchPosts`, `searchUsers`, `getProfiles`, `getUserFeed`
- Post data: text, author, likes, replies, reposts, quotes, images, embeds, language, post URL
- Profile data: handle, display name, bio, followers, follows, posts count, avatar, banner, joined date
- Optional authentication for search modes
- No authentication required for profiles and feeds
- Pagination with cursor support
- Sort by latest or top engagement (search only)
- Language filtering (search only)
- Feed filtering: posts + threads, posts only, media only (feed mode only)
- Up to 10,000 results per run
- Error handling and rate limit management
- Clean, structured JSON output
- Python, JavaScript, and LangChain examples
- PPE pricing with first 100 results free
Known issues: None reported.
Future roadmap: Follower list extraction, advanced query operators, WebSocket real-time streaming, batch scheduling.
Questions? Visit the Bluesky Scraper on Apify or check the AT Protocol documentation.
Attribution: Built with the AT Protocol SDK and Bluesky API.
License: MIT — Free to use, modify, and distribute under the MIT License.