Bluesky & Mastodon Scraper - Decentralized Social Media

Pricing

Pay per event

Extract and monitor posts from Bluesky (AT Protocol) and Mastodon (Fediverse). The most comprehensive social media scraper for decentralized networks - perfect for social listening, brand monitoring, market research, sentiment analysis, and AI training data collection.

Rating: 5.0 (2)

Developer: BarriereFix (maintained by community)

Actor stats: 0 bookmarked, 4 total users, 1 monthly active user

Last modified: 14 days ago

Bluesky & Mastodon Scraper API - Decentralized Social Media Data Aggregator

Extract and monitor posts from Bluesky (AT Protocol) and Mastodon (Fediverse) with a unified, normalized JSON API. The most comprehensive social media scraper for decentralized networks - perfect for social listening, brand monitoring, market research, sentiment analysis, and AI training data collection.

🔍 Search by keywords • 👥 Track specific users • 📊 Unified data format • 🪝 Real-time webhooks • 💰 Pay-per-post pricing

🚀 Features

  • Multi-Platform Support: Scrape Bluesky and Mastodon simultaneously
  • Keyword Search: Find posts mentioning specific terms or phrases
  • Handle Tracking: Monitor specific users across platforms
  • Date Range Filtering: Historical and real-time post collection
  • Unified Schema: Normalized output format across all platforms
  • Intelligent Deduplication: Automatic duplicate detection and removal
  • Real-Time Webhooks: Send posts to your endpoints as they're discovered
  • Language Filtering: Filter posts by language (BCP-47 codes)
  • Pay-Per-Event Pricing: Only pay for posts collected, not compute time
  • No Authentication Required: Works with public data (optional auth for higher limits)

📊 Supported Platforms

Bluesky (AT Protocol)

  • Full keyword search via searchActors workaround
  • User feed tracking
  • Quote posts, replies, reposts, likes
  • Media attachments (images, videos, GIFs)
  • Rich metadata (DIDs, handles, timestamps)

Mastodon (Fediverse)

  • Multi-instance support (mastodon.social, mas.to, fosstodon.org, etc.)
  • Full keyword search across instances
  • User timeline tracking
  • Boosts, replies, favorites
  • Media attachments with alt text
  • Instance-specific data

💡 Use Cases

  • Social Listening: Track brand mentions and industry keywords
  • Market Research: Analyze trends and conversations in your niche
  • Sentiment Analysis: Collect data for AI/ML sentiment models
  • Brand Monitoring: Monitor your company and competitors
  • Academic Research: Study social media behavior and network effects
  • Content Discovery: Find engaging content for curation
  • Influencer Tracking: Monitor key voices in your industry

🎯 Quick Start

Example 1: Search by keywords

```json
{
  "platforms": ["bluesky", "mastodon"],
  "query": "artificial intelligence",
  "maxItems": 100,
  "languages": ["en"]
}
```

Example 2: Track specific users

```json
{
  "platforms": ["bluesky", "mastodon"],
  "handles": ["jay.bsky.social", "@gargron@mastodon.social"],
  "maxItems": 500
}
```

Example 3: Historical search with date range

```json
{
  "platforms": ["bluesky"],
  "query": "climate change",
  "since": "2025-09-01T00:00:00Z",
  "until": "2025-10-01T00:00:00Z",
  "maxItems": 1000
}
```

Example 4: Real-time monitoring with webhooks

```json
{
  "platforms": ["bluesky", "mastodon"],
  "query": "crypto",
  "emitWebhooks": true,
  "webhooks": [
    {
      "url": "https://your-api.com/webhook",
      "headers": {"Authorization": "Bearer YOUR_TOKEN"},
      "mode": "per_item",
      "platforms": ["bluesky"]
    }
  ]
}
```
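Any of the inputs above can also be submitted programmatically. The sketch below uses only the Python standard library and Apify's `run-sync-get-dataset-items` endpoint. The actor ID is a hypothetical placeholder (this page does not state it), and `build_run_request` and `run_actor` are illustrative names, not part of the actor.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the actor's real ID from its Store page.
ACTOR_ID = "username~actor-name"


def build_run_request(run_input: dict, token: str, actor_id: str = ACTOR_ID) -> urllib.request.Request:
    """Build a POST request that runs the actor and returns its dataset items."""
    url = f"https://api.apify.com/v2/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(run_input: dict, token: str) -> list:
    """Send the request and parse the JSON array of posts (performs a network call)."""
    with urllib.request.urlopen(build_run_request(run_input, token)) as response:
        return json.load(response)
```

Calling `run_actor({"platforms": ["bluesky"], "query": "artificial intelligence", "maxItems": 100}, token="YOUR_APIFY_TOKEN")` should return the collected posts as a list. Synchronous runs suit short jobs; for long collections, starting the run asynchronously and reading the dataset afterwards is likely the safer choice.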

📥 Input Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| `platforms` | Array | ✅ | Platforms to scrape: `["bluesky", "mastodon"]` |
| `query` | String | ❌ | Keywords to search for |
| `handles` | Array | ❌ | Specific user handles to track |
| `since` | String | ❌ | Start date (ISO 8601) |
| `until` | String | ❌ | End date (ISO 8601) |
| `maxItems` | Integer | ❌ | Max posts to collect (default: 1000) |
| `languages` | Array | ❌ | Language codes (e.g., `["en", "de"]`) |
| `includeReplies` | Boolean | ❌ | Include reply posts (default: false) |
| `emitWebhooks` | Boolean | ❌ | Enable webhook delivery |
| `webhooks` | Array | ❌ | Webhook endpoint configurations |
| `blueskyCredentials` | Object | ❌ | Optional auth for higher rate limits |
| `mastodonInstances` | Array | ❌ | Specific Mastodon instances to search |
| `maxConcurrency` | Integer | ❌ | Concurrent requests (default: 5) |
| `dryRun` | Boolean | ❌ | Test mode without storing data |

Note: You must provide either `query` OR `handles` (or both).
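The rules in the table and the note can be checked before a run is started. This is a minimal sketch of such a pre-flight check; `validate_input` is a hypothetical helper, and the actor's own validation may be stricter or report errors differently.

```python
SUPPORTED_PLATFORMS = {"bluesky", "mastodon"}


def validate_input(run_input: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    platforms = run_input.get("platforms") or []
    # `platforms` is the only parameter marked as required.
    if not platforms:
        problems.append("'platforms' is required")
    unknown = set(platforms) - SUPPORTED_PLATFORMS
    if unknown:
        problems.append(f"unsupported platforms: {sorted(unknown)}")
    # At least one of `query` or `handles` must be present.
    if not run_input.get("query") and not run_input.get("handles"):
        problems.append("provide 'query', 'handles', or both")
    return problems
```

Returning a list instead of raising on the first error lets a caller show every problem at once.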

📤 Output Schema

Each post is normalized to a unified format:

```json
{
  "platform": "bluesky",
  "postId": "at://did:plc:xyz/app.bsky.feed.post/3kff...",
  "url": "https://bsky.app/profile/jay.bsky.social/post/3kff...",
  "text": "Building the future of social media...",
  "language": "en",
  "author": {
    "handle": "jay.bsky.social",
    "did": "did:plc:xyz",
    "displayName": "Jay Graber",
    "profileUrl": "https://bsky.app/profile/jay.bsky.social"
  },
  "createdAt": "2025-10-08T10:30:00Z",
  "metrics": {
    "replies": 42,
    "reposts": 128,
    "likes": 567,
    "quotes": 23
  },
  "entities": {
    "hashtags": ["decentralization", "atproto"],
    "mentions": ["@handle1.bsky.social"]
  },
  "media": [
    {
      "type": "image",
      "url": "https://cdn.bsky.app/...",
      "alt": "Screenshot of the app"
    }
  ],
  "source": {
    "instance": null
  },
  "references": {
    "replyTo": null,
    "quotedPost": "at://did:plc:..."
  },
  "ingest_meta": {
    "first_seen_at": "2025-10-08T11:00:00Z",
    "adapter_version": "1.0.0"
  }
}
```

🔐 Authentication

Bluesky (Optional)

Works without authentication for public data. For higher rate limits:

```json
{
  "blueskyCredentials": {
    "identifier": "your-handle.bsky.social",
    "password": "your-app-password"
  }
}
```

Get app password: Settings → App Passwords → Add App Password

Mastodon

No authentication required for public posts.

🌐 Mastodon Instance Support

Auto-Detection

The actor automatically detects Mastodon instances from handles:

```json
{
  "handles": ["@user@mastodon.social", "@dev@fosstodon.org"]
}
```

Manual Configuration

Specify instances explicitly:

```json
{
  "mastodonInstances": ["mastodon.social", "mas.to", "fosstodon.org"]
}
```

🪝 Webhooks

Send posts to your endpoints in real-time:

```json
{
  "emitWebhooks": true,
  "webhooks": [
    {
      "url": "https://api.example.com/posts",
      "headers": {
        "Authorization": "Bearer YOUR_TOKEN",
        "Content-Type": "application/json"
      },
      "secret": "shared-secret-key",
      "mode": "per_item",
      "platforms": ["bluesky", "mastodon"]
    }
  ]
}
```

Webhook Modes:

  • `per_item`: Send each post individually
  • `batch`: Send posts in batches (coming soon)
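On the receiving side, the configured `Authorization` header gives the endpoint a simple way to reject unknown senders. The sketch below assumes the actor forwards the configured headers verbatim and that a `per_item` delivery body is one JSON post in the unified schema. Neither is confirmed on this page, and how the `secret` field is applied is not documented here, so it is left out. `is_authorized` and `handle_delivery` are hypothetical names.

```python
import hmac
import json
from typing import Optional


def is_authorized(headers: dict, expected_token: str) -> bool:
    """Compare the Authorization header with the token set in `webhooks[].headers`."""
    # Assumes the caller passes headers keyed exactly as "Authorization".
    received = headers.get("Authorization", "")
    expected = f"Bearer {expected_token}"
    # compare_digest avoids leaking the token through timing differences.
    return hmac.compare_digest(received.encode(), expected.encode())


def handle_delivery(headers: dict, body: bytes, expected_token: str) -> Optional[dict]:
    """Return the parsed post for an authorized delivery, or None if rejected."""
    if not is_authorized(headers, expected_token):
        return None
    return json.loads(body)
```

In a real endpoint, a `None` result would typically map to an HTTP 401 response.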

💰 Pricing

Pay-Per-Event Model: Only pay for posts you collect

  • $0.002 per post ($2 per 1,000 posts)
  • No compute time charges
  • No setup fees
  • Cancel anytime

Examples:

  • 100 posts = $0.20
  • 1,000 posts = $2.00
  • 10,000 posts = $20.00
  • 100,000 posts = $200.00

Simple, transparent pricing - you only pay for what you use.
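The price list follows directly from the per-post rate, so a budget can be turned into a `maxItems` value before a run. This is a small sketch using `Decimal` to avoid floating-point rounding; `estimate_cost` and `max_posts_for_budget` are illustrative helper names.

```python
from decimal import Decimal

PRICE_PER_POST = Decimal("0.002")  # USD per post, from the pricing list above


def estimate_cost(post_count: int) -> Decimal:
    """Cost in USD of collecting `post_count` posts."""
    if post_count < 0:
        raise ValueError("post_count must be non-negative")
    return PRICE_PER_POST * post_count


def max_posts_for_budget(budget_usd: Decimal) -> int:
    """Largest number of posts that fits in the budget, usable as `maxItems`."""
    return int(budget_usd // PRICE_PER_POST)
```

For example, a 5 USD budget corresponds to `maxItems` of 2500 at this rate.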

📅 Scheduling

Run every hour

```
0 * * * *
```

Run daily at midnight

```
0 0 * * *
```

Run every 15 minutes

```
*/15 * * * *
```

🔄 Deduplication

The actor automatically:

  • Tracks seen posts with state management
  • Skips duplicates across runs
  • Cleans up old state entries (30+ days)
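The three behaviours above can be illustrated with a small in-memory model. This is only a sketch of the idea: the actor's real state store and key format are not documented here, so the `(platform, postId)` key and the `filter_new_posts` helper are assumptions.

```python
from datetime import datetime, timedelta, timezone

STATE_TTL = timedelta(days=30)  # entries older than this are forgotten


def filter_new_posts(posts, seen, now=None):
    """Return unseen posts and record them in `seen` (key -> first-seen time)."""
    now = now or datetime.now(timezone.utc)
    # Clean up state entries that are past the retention window.
    for key in [k for k, first_seen in seen.items() if now - first_seen > STATE_TTL]:
        del seen[key]
    fresh = []
    for post in posts:
        # Assumed key: post IDs are only unique within a platform.
        key = (post["platform"], post["postId"])
        if key not in seen:
            seen[key] = now
            fresh.append(post)
    return fresh
```

Persisting `seen` between runs is what makes duplicates skippable across runs rather than only within one.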

⚡ Performance

  • Speed: ~100-200 posts/minute per platform
  • Rate Limits: Respects platform rate limits automatically
  • Concurrency: Configurable (1-20 concurrent requests)
  • Memory: ~256MB typical, ~512MB for large runs

🛠️ Advanced Configuration

Language Filtering

```json
{
  "languages": ["en", "de", "ja", "es"]
}
```

Date Range

```json
{
  "since": "2025-09-01T00:00:00Z",
  "until": "2025-10-01T00:00:00Z"
}
```

Include Replies

```json
{
  "includeReplies": true
}
```

Dry Run (Testing)

```json
{
  "dryRun": true
}
```

📊 Dataset Views

The actor provides three pre-configured views in Apify Console:

  1. Overview: All posts with key metrics
  2. By Platform: Posts grouped by source
  3. Top Engagement: Sorted by likes/reposts
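The same ranking can be reproduced on exported data. The sketch below assumes engagement is the sum of `likes` and `reposts` from the `metrics` object; the exact sort key behind the Console view is not stated, and `top_engagement` is a hypothetical helper.

```python
def top_engagement(posts, limit=10):
    """Return up to `limit` posts, highest likes + reposts first."""

    def score(post):
        # Missing metrics count as zero so incomplete posts sort last.
        metrics = post.get("metrics", {})
        return metrics.get("likes", 0) + metrics.get("reposts", 0)

    return sorted(posts, key=score, reverse=True)[:limit]
```

Because `sorted` is stable, posts with equal scores keep their original order.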

🔍 Search Tips

  • Use specific terms: prefer "machine learning" over "AI"
  • Combine keywords: "climate change policy"
  • Use quotes for exact phrases (Bluesky only)

Handle Formats

  • Bluesky: `jay.bsky.social` or `handle.domain.com`
  • Mastodon: `@username@instance.social` or `instance.social/@username`
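The formats above are distinct enough to classify a handle before submitting it. This is a rough sketch of one way to do that with regular expressions. `parse_handle` is a hypothetical helper, and the actor's own auto-detection may use different rules.

```python
import re

# @username@instance or username@instance
MASTODON_AT = re.compile(r"^@?([^@/\s]+)@([^@/\s]+)$")
# instance/@username, with an optional URL scheme
MASTODON_URL = re.compile(r"^(?:https?://)?([^@/\s]+)/@([^@/\s]+)$")


def parse_handle(handle: str) -> dict:
    """Classify a handle as Mastodon (with instance) or Bluesky."""
    handle = handle.strip()
    match = MASTODON_AT.match(handle)
    if match:
        return {"platform": "mastodon", "username": match.group(1), "instance": match.group(2)}
    match = MASTODON_URL.match(handle)
    if match:
        return {"platform": "mastodon", "username": match.group(2), "instance": match.group(1)}
    # Anything else is treated as a Bluesky handle (a domain name).
    return {"platform": "bluesky", "handle": handle.lstrip("@")}
```

The instances collected this way could feed `mastodonInstances` when explicit configuration is preferred.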

Date Ranges

  • Use ISO 8601 format: `2025-10-08T10:30:00Z`
  • Timezone: Always UTC (Z suffix)
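Producing `since` and `until` values in this format is easy to get wrong when local times are involved. Here is a minimal sketch that converts any timezone-aware `datetime` to UTC and adds the `Z` suffix; `to_utc_iso` and `last_days` are illustrative helper names.

```python
from datetime import datetime, timedelta, timezone


def to_utc_iso(moment: datetime) -> str:
    """Format an aware datetime as ISO 8601 in UTC with a Z suffix."""
    if moment.tzinfo is None:
        raise ValueError("naive datetime; attach a timezone first")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def last_days(days: int, now: datetime) -> dict:
    """Build `since`/`until` values covering the `days` before `now`."""
    return {"since": to_utc_iso(now - timedelta(days=days)), "until": to_utc_iso(now)}
```

Rejecting naive datetimes avoids silently treating a local time as UTC.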

⚠️ Limitations

  • Bluesky: Keyword search uses searchActors workaround (may be slower than native search)
  • Mastodon: Search quality depends on instance search capabilities
  • Rate Limits: Public APIs have rate limits (authentication increases limits)
  • Historical Data: Availability depends on platform retention policies

🆘 Support

  • Email: kontakt@barrierefix.de
  • Issues: Report bugs or request features
  • Documentation: Full API docs in source code

📜 License

MIT License - Free to use commercially and privately

🏷️ Tags

`bluesky` `mastodon` `at-protocol` `fediverse` `social-media` `scraper` `aggregator` `decentralized` `web3` `social-listening` `brand-monitoring` `sentiment-analysis` `market-research` `data-collection` `apify`


🔗 Explore More of Our Actors

💬 Social Media & Community

| Actor | Description |
|---|---|
| Reddit Scraper Pro | Monitor subreddits and track keywords with sentiment analysis |
| Discord Scraper Pro | Extract Discord messages and chat history for community insights |
| YouTube Comments Harvester | Comprehensive YouTube comments scraper with channel-wide enumeration |
| YouTube Contact Scraper | Extract YouTube channel contact information for outreach |
| YouTube Shorts Scraper | Scrape YouTube Shorts for viral content research |

🏢 Business Intelligence

| Actor | Description |
|---|---|
| Indeed Salary Analyzer | Get salary data for compensation benchmarking and HR analytics |
| Crunchbase Scraper | Extract company data and funding information for business intelligence |
| Northdata Scraper | Extract German company data from Northdata for business research |
| Shopify Store Intelligence | Analyze Shopify stores for competitive intelligence and market research |
| Apify Store Radar | Monitor Apify Store actors for market intelligence |


Built by Barrierefix | Powered by Apify