Bluesky Feed Posts Scraper

Export posts from any public Bluesky custom or algorithm feed. No login needed. Returns feed metadata and engagement counts via the unauthenticated AT Protocol public API.

Developer: DevilScrapes (maintained by Community)

We do the dirty work so your dataset stays clean. 😈

$2.05 / 1,000 posts — Export posts from any public Bluesky custom feed or algorithm feed, including curated feeds like "Discover" and "What's Hot" and community-built feed generators, via the unauthenticated AT Protocol public AppView API. No Bluesky account. No API key. No browser automation.

This Actor calls Bluesky's public app.bsky.feed.getFeed endpoint, denormalises feed metadata into every row, and emits a flat dataset ready for direct analysis in spreadsheets, BI tools, or SQL — no joins required.

🎯 What this scrapes

Two operating modes, controlled by which input field you set:

  1. Single-feed mode — provide a feed URI (AT URI or bsky.app/profile/.../feed/... web URL) and the Actor exports every post in that feed up to the per-feed cap.
  2. Creator-discovery mode — provide a creator's Bluesky handle (e.g. bsky.app) and the Actor calls app.bsky.feed.getActorFeeds to enumerate every feed that creator publishes, then scrapes each one in turn.

For each post you receive the post body, engagement counts (likes, reposts, replies, quotes), author handle and DID, post CID, and indexing timestamp — plus the parent feed's display name, creator handle, and description denormalised onto the row so a CSV export is entirely self-contained.

Field                 Type            Description
feed_uri              string          AT URI of the feed generator
feed_display_name     string          Human-readable feed name (e.g. Discover)
feed_creator_handle   string          Bluesky handle of the feed creator
feed_description      string | null   Feed description text set by the creator
post_uri              string          AT URI of the post
post_cid              string          Content identifier (CID) of the post record
post_indexed_at       string          ISO 8601 datetime the post was indexed by the AppView
post_text             string          Body text of the post
post_lang             string | null   Primary language code (e.g. en), if present
post_reply_count      integer         Number of replies
post_repost_count     integer         Number of reposts
post_like_count       integer         Number of likes
post_quote_count      integer         Number of quote posts
author_did            string          Decentralized identifier of the post author
author_handle         string          Bluesky handle of the post author
author_display_name   string | null   Display name of the post author
scraped_at            string          ISO 8601 UTC datetime this row was written

🔥 Features

  • No Bluesky account required — uses the public unauthenticated AppView API at public.api.bsky.app.
  • Two operating modes: single feed URI or discover-all-feeds-by-creator via getActorFeeds.
  • Accepts either AT URIs (at://did:plc:.../app.bsky.feed.generator/whats-hot) or bsky.app web URLs — the Actor rewrites web URLs to AT URI form automatically.
  • Denormalised output — feed metadata (name, description, creator handle) on every post row, no joins needed for downstream analytics or CSV exports.
  • Cursor-based pagination with a client-side per-feed cap so you only pay for what you need.
  • Exponential backoff with Retry-After honoured for 408 / 429 / 503 responses; max 5 attempts.
  • Pure HTTP client (curl-cffi with browser fingerprint impersonation) — no browser automation, low compute footprint.
  • Pydantic v2 input validation with XOR guard: exactly one of feedUri or creatorHandle must be set.
  • Pairs with the companion bluesky-starter-pack Actor as the Bluesky Intel Suite.
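The cursor-based pagination with a client-side cap mentioned above can be sketched as follows. This is a minimal illustration, not the Actor's actual code: `fetch_page` is a hypothetical callable standing in for one `app.bsky.feed.getFeed` request, and the page size of 100 is an assumption based on the endpoint's documented per-request limit.

```python
from typing import Callable, Iterator, Optional

# Hypothetical page fetcher: takes (cursor, limit) and returns the decoded
# JSON body of one app.bsky.feed.getFeed response.
FetchPage = Callable[[Optional[str], int], dict]


def iter_feed_posts(
    fetch_page: FetchPage, max_posts: int, page_size: int = 100
) -> Iterator[dict]:
    """Yield feed items until max_posts is reached or the cursor runs out."""
    emitted = 0
    cursor: Optional[str] = None
    while emitted < max_posts:
        # Never ask for more items than the remaining budget.
        limit = min(page_size, max_posts - emitted)
        page = fetch_page(cursor, limit)
        items = page.get("feed", [])
        # Enforce the cap client-side even if the server returns extra items.
        for item in items[: max_posts - emitted]:
            yield item
            emitted += 1
        cursor = page.get("cursor")
        if not cursor or not items:
            break  # feed exhausted before the cap was reached
```

Because the cap is applied while iterating, a feed with fewer posts than the cap simply ends early, which matches the behaviour described under Limitations.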

💡 Use cases

  • Algorithm research — sample what posts the "Discover" / "What's Hot" algorithmic feeds actually surface across days or weeks; analyse topic drift and amplification patterns.
  • Newsroom monitoring — subscribe to curated topic feeds for breaking-news posts on specific beats, then pipe to Slack or a Google Sheet via Apify integrations.
  • Marketing intelligence — see which posts are amplified by community feeds in your niche; measure which content formats dominate each feed's engagement distribution.
  • Creator analytics — pull every post a niche feed generator surfaces and rank by like / repost / quote ratios to benchmark your own posts against feed peers.
  • Dataset bootstrap — collect labelled training data from topic-curated feeds for downstream NLP or sentiment models without manual tagging of raw timelines.
  • Competitive monitoring — track community-curated feeds that aggregate competitor announcements, support complaints, or product mentions.
  • Academic social-media research — Bluesky's public AT Protocol data is significantly more accessible than Twitter/X's API; this Actor is a low-cost entry point for longitudinal feed studies.

⚙️ How to use it

  1. Open the Actor input form.
  2. Either paste a feed AT URI or bsky.app web URL into Feed URI or URL (single-feed mode) or type a Bluesky handle into Creator handle (discovery mode). Setting both is an error; setting neither is also an error — the Actor fails fast before making any network call.
  3. Adjust Max posts per feed (default 100, maximum 5000).
  4. In discovery mode, adjust Max feeds to cap how many of the creator's feeds are scraped (default 5, maximum 50).
  5. Leave Use Apify Proxy off unless you are behind a restrictive ISP — the AT Protocol public API does not block datacenter IPs, so direct routing is faster and free.
  6. Click Start and watch the run log. Results stream into the default dataset in real time and can be downloaded as JSON, CSV, Excel, or XML via the Export button.

Finding a feed URI

Every Bluesky feed has a bsky.app URL in the form https://bsky.app/profile/<creator>/feed/<rkey>. Examples:

  • https://bsky.app/profile/bsky.app/feed/whats-hot — Bluesky's "Discover" feed
  • https://bsky.app/profile/bsky.app/feed/with-friends — "With Friends"

Paste the full URL into the Feed URI or URL field and this Actor converts it to AT URI form internally. You can also paste the raw AT URI directly if you have it.
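As a rough sketch of that conversion (not the Actor's own implementation), the function below parses the web URL and builds the AT URI. The `resolve_did` callable is a hypothetical stand-in for whatever handle-to-DID lookup is used; it is injected so the parsing logic can be shown without a network call.

```python
from typing import Callable
from urllib.parse import urlparse

FEED_COLLECTION = "app.bsky.feed.generator"


def feed_url_to_at_uri(value: str, resolve_did: Callable[[str], str]) -> str:
    """Return a feed AT URI from either an AT URI or a bsky.app feed URL."""
    value = value.strip()
    if value.startswith("at://"):
        return value  # already in AT URI form
    # Accept URLs pasted without a scheme, e.g. "bsky.app/profile/...".
    parsed = urlparse(value if "://" in value else f"https://{value}")
    parts = [p for p in parsed.path.split("/") if p]
    # Expected path shape: /profile/<creator>/feed/<rkey>
    if (
        parsed.netloc != "bsky.app"
        or len(parts) != 4
        or parts[0] != "profile"
        or parts[2] != "feed"
    ):
        raise ValueError(f"not a bsky.app feed URL: {value!r}")
    creator, rkey = parts[1], parts[3]
    # The creator segment may already be a DID; otherwise resolve the handle.
    did = creator if creator.startswith("did:") else resolve_did(creator)
    return f"at://{did}/{FEED_COLLECTION}/{rkey}"
```

With a resolver that maps `bsky.app` to `did:plc:z72i7hdynmk6r22z27h6tvur`, the "Discover" URL above becomes the AT URI used in the input examples.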

📥 Input

Field             Type      Required   Default   Description
feedUri           string    one-of     (none)    AT URI or bsky.app/profile/<handle>/feed/<rkey> URL of one feed
creatorHandle     string    one-of     (none)    Bluesky handle or DID; drives getActorFeeds discovery
maxPostsPerFeed   integer   no         100       Max post rows emitted per feed (1–5000)
maxFeeds          integer   no         5         Max feeds processed in discovery mode (1–50)
useProxy          boolean   no         false     Route requests through Apify Proxy (BUYPROXIES94952)

Exactly one of feedUri and creatorHandle must be set. Setting both, or neither, causes the Actor to exit immediately with a clear error message.
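The exactly-one-of rule and the numeric ranges can be illustrated with a small validator. The Actor is described as using Pydantic v2 for this; the sketch below deliberately uses only the standard library so it runs without dependencies, and the `ActorInput` and `parse_input` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActorInput:
    feed_uri: Optional[str]
    creator_handle: Optional[str]
    max_posts_per_feed: int = 100
    max_feeds: int = 5
    use_proxy: bool = False


def parse_input(raw: dict) -> ActorInput:
    """Validate raw Actor input before any network call is made."""
    # Treat missing, null, and blank strings as "not set".
    feed_uri = (raw.get("feedUri") or "").strip() or None
    creator = (raw.get("creatorHandle") or "").strip() or None
    # XOR guard: both set or both unset is an error.
    if (feed_uri is None) == (creator is None):
        raise ValueError("Set exactly one of feedUri or creatorHandle")
    max_posts = int(raw.get("maxPostsPerFeed", 100))
    max_feeds = int(raw.get("maxFeeds", 5))
    if not 1 <= max_posts <= 5000:
        raise ValueError("maxPostsPerFeed must be between 1 and 5000")
    if not 1 <= max_feeds <= 50:
        raise ValueError("maxFeeds must be between 1 and 50")
    return ActorInput(
        feed_uri, creator, max_posts, max_feeds, bool(raw.get("useProxy", False))
    )
```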

Single-feed mode example

{
  "feedUri": "at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.feed.generator/whats-hot",
  "maxPostsPerFeed": 100,
  "useProxy": false
}

Creator-discovery mode example

{
  "creatorHandle": "bsky.app",
  "maxPostsPerFeed": 50,
  "maxFeeds": 10,
  "useProxy": false
}

📤 Output

One row per post. Feed metadata is denormalised onto every row so a flat CSV is self-contained.

{
  "feed_uri": "at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.feed.generator/whats-hot",
  "feed_display_name": "Discover",
  "feed_creator_handle": "bsky.app",
  "feed_description": "Trending content from your personal network",
  "post_uri": "at://did:plc:sj5wj7libgr7omqiotenxadx/app.bsky.feed.post/3mlxmr4jyfs2s",
  "post_cid": "bafyreidgimgd7v3g3pazsp5oq7ur6bvedpnwohul26mss7cbffg6bdqjkm",
  "post_indexed_at": "2026-05-16T10:20:40.467Z",
  "post_text": "If you never read the book or saw the movie, you missed one of the greatest Pulitzer Prize winning sagas ever written.",
  "post_lang": "en",
  "post_reply_count": 89,
  "post_repost_count": 414,
  "post_like_count": 1288,
  "post_quote_count": 27,
  "author_did": "did:plc:sj5wj7libgr7omqiotenxadx",
  "author_handle": "louiseplease.bsky.social",
  "author_display_name": "Louise",
  "scraped_at": "2026-05-16T12:00:00+00:00"
}

Optional fields (feed_description, post_lang, author_display_name) are emitted as null when the API does not return them. Rows are never dropped for missing optional fields.
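To make the denormalisation and null handling concrete, here is a hedged sketch of how one feed item could be flattened into the row shown above. The mapping from API fields (`displayName`, `indexedAt`, `record.langs`, and so on) to output columns is an assumption based on the public `app.bsky.feed` response shapes, not the Actor's confirmed code; in particular, taking the first entry of `langs` as the primary language is a guess.

```python
from datetime import datetime, timezone
from typing import Optional


def flatten_post(feed_view: dict, item: dict, now: Optional[datetime] = None) -> dict:
    """Merge feed metadata and one feed item into a flat output row."""
    post = item["post"]
    author = post.get("author", {})
    record = post.get("record", {})
    langs = record.get("langs") or []
    scraped = (now or datetime.now(timezone.utc)).isoformat()
    return {
        # Feed metadata, repeated on every row.
        "feed_uri": feed_view["uri"],
        "feed_display_name": feed_view["displayName"],
        "feed_creator_handle": feed_view["creator"]["handle"],
        "feed_description": feed_view.get("description"),  # None if absent
        # Post fields.
        "post_uri": post["uri"],
        "post_cid": post["cid"],
        "post_indexed_at": post["indexedAt"],
        "post_text": record.get("text", ""),
        "post_lang": langs[0] if langs else None,
        "post_reply_count": post.get("replyCount", 0),
        "post_repost_count": post.get("repostCount", 0),
        "post_like_count": post.get("likeCount", 0),
        "post_quote_count": post.get("quoteCount", 0),
        # Author fields.
        "author_did": author.get("did"),
        "author_handle": author.get("handle"),
        "author_display_name": author.get("displayName"),  # None if absent
        "scraped_at": scraped,
    }
```

Using `dict.get` for the optional fields means a missing value becomes `None` (JSON `null`) instead of raising, so the row is still emitted.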

Export formats

After a run completes, click Export in the Apify Console to download:

  • JSON — full fidelity, all fields, newline-delimited
  • CSV — flat, one row per post, all columns including denormalised feed metadata
  • Excel (.xlsx) via the Apify dataset converter
  • XML — structured per-item

All formats are available via the Apify API: GET /datasets/{id}/items?format=csv&clean=true.
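For scripted exports, the request URL can be assembled like this. It is a small sketch that assumes the standard Apify API base `https://api.apify.com/v2`; the `dataset_items_url` helper name is hypothetical, and the optional `token` query parameter is only needed for datasets that are not publicly readable.

```python
from typing import Optional
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(
    dataset_id: str, fmt: str = "csv", clean: bool = True, token: Optional[str] = None
) -> str:
    """Build the dataset-items export URL for a given format."""
    params = {"format": fmt, "clean": "true" if clean else "false"}
    if token:
        params["token"] = token  # omit for public datasets
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"
```

The resulting URL can be fetched with any HTTP client, or passed to a spreadsheet import function.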

💰 Pricing

Pay-Per-Event (PPE) — you pay only for what you use:

Event         Price (USD)   When
actor-start   $0.05         Once per run, at boot
result-row    $0.002        Per post row written to the dataset

Example costs

Posts scraped   Actor starts   Total cost
100             1              $0.25
500             1              $1.05
1,000           1              $2.05
5,000           1              $10.05

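A discovery run over 50 feeds at the default cap of 100 posts each (5,000 rows) costs $10.05. Note that this is not the input maximum: 50 feeds × 5,000 posts would be 250,000 rows, or $500.05 at the same rates, so set both caps deliberately.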
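The pricing above reduces to simple arithmetic, sketched here with `Decimal` to avoid floating-point rounding in currency amounts. The `run_cost` helper and constant names are illustrative only; the two prices come from the event table.

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.05")   # charged once per run
RESULT_ROW_USD = Decimal("0.002")   # charged per post row written


def run_cost(rows: int, starts: int = 1) -> Decimal:
    """Estimated cost in USD for a number of emitted rows and Actor starts."""
    if rows < 0 or starts < 0:
        raise ValueError("rows and starts must be non-negative")
    return ACTOR_START_USD * starts + RESULT_ROW_USD * rows
```

For example, `run_cost(1000)` evaluates to 2.05 USD, matching the table.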

This rate is consistent with the companion Actor bluesky-starter-pack so the Bluesky Intel Suite has uniform pricing across both tools.

🚧 Limitations

  • Private or access-restricted feeds are not exposed by the public AppView API — only feeds whose data is visible at public.api.bsky.app can be scraped.
  • Global feed discovery by keyword is not supported — Bluesky's getPopularFeedGenerators endpoint returns MethodNotImplemented on the public AppView. Use creator-discovery mode (creatorHandle) to enumerate one creator's feeds at a time.
  • Post images, embeds, and quoted-post bodies are not extracted — only the plain-text body (post_text) is captured. Image ALT text, external link cards, and quoted-post content are outside the current schema.
  • Reply thread expansion is out of scope: each feed item produces a single post-level row. Threaded context (parent/root posts) would require additional getPostThread calls and is not implemented in this version.
  • The maxPostsPerFeed cap is client-side — the Actor paginates until it has collected the cap or the API cursor is exhausted. If a feed has fewer posts than the cap, fewer rows are returned. This is expected behaviour, not a failure.
  • The Apify FREE tier retains run-scoped storage for 7 days only. For longer retention, export your dataset immediately after the run completes or upgrade to a paid Apify plan.
  • Rate limiting — the public AppView may rate-limit high-frequency requests. The Actor retries on 429 with exponential backoff, but very large scrapes (tens of thousands of rows) may require splitting into multiple runs.
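The retry behaviour described in this document (retry on 408 / 429 / 503, honour Retry-After, at most 5 attempts) could look roughly like the sketch below. The base delay, the 60-second ceiling, and the `send` callable returning `(status, headers, body)` are assumptions made for illustration; only the numeric-seconds form of Retry-After is handled.

```python
import time
from typing import Callable, Optional, Tuple

RETRYABLE = {408, 429, 503}
MAX_ATTEMPTS = 5


def backoff_delay(
    attempt: int, retry_after: Optional[str], base: float = 1.0, cap: float = 60.0
) -> float:
    """Seconds to wait after failed attempt number `attempt` (1-based)."""
    if retry_after is not None:
        try:
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # HTTP-date form of Retry-After is not handled in this sketch
    return min(cap, base * 2 ** (attempt - 1))  # 1s, 2s, 4s, 8s, ...


def call_with_retries(
    send: Callable[[], Tuple[int, dict, bytes]],
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call `send` until it returns a non-retryable status or attempts run out."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        status, headers, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt < MAX_ATTEMPTS:
            # Header lookup is case-sensitive here; normalise keys in real code.
            sleep(backoff_delay(attempt, headers.get("Retry-After")))
    return status, body  # last retryable response after exhausting attempts
```

Passing `sleep` as a parameter keeps the function testable without real delays.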

Tips for best results

  • Use AT URIs when possible. The Actor resolves bsky.app web URLs on the fly (one extra getProfile API call), which adds latency. Pasting the AT URI directly skips this step.
  • Cap maxPostsPerFeed to what you actually need. Feeds like "Discover" can have hundreds of posts; setting a lower cap keeps cost and runtime predictable.
  • Prefer creator-discovery mode for bulk collection. If you want posts from all feeds by a creator like bsky.app, use creatorHandle: "bsky.app" rather than multiple single-feed runs — the Actor handles pagination for each feed sequentially.
  • Schedule recurring runs to track feed evolution. Set up an Apify Schedule to run this Actor daily or weekly on a specific feed. Use a named dataset (via Apify API datasetName parameter at run time) to accumulate rows across runs.
  • Use the CSV export for spreadsheet workflows. Because feed metadata is denormalised onto every row, no pivot or VLOOKUP is needed — the CSV is immediately usable in Google Sheets or Excel.
  • Combine with bluesky-starter-pack. If you want both the posts from a community feed and the member list of the Starter Pack that drives that community, run both Actors and join on author_handle.

Integrations

This Actor works natively with the Apify platform's built-in connectors:

  • Apify API — trigger runs programmatically, poll for status, and fetch dataset items via REST. Full OpenAPI spec at https://docs.apify.com/api/v2.
  • Webhooks — configure a webhook to POST the run result to your endpoint as soon as the Actor finishes.
  • Apify Schedules — run this Actor on a cron schedule (e.g. daily at 08:00 UTC) to keep a feed dataset fresh.
  • Make (formerly Integromat) — use the Apify Make module to trigger runs and route results to Google Sheets, Airtable, Slack, or anywhere Make connects.
  • Zapier — Apify's Zapier integration triggers on run completion and passes dataset items downstream.
  • n8n — use the HTTP Request node with the Apify REST API for fully self-hosted automation pipelines.
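
As an illustration of the Apify API route, the sketch below prepares (but does not send) a run request using only the standard library. It assumes the standard `POST /v2/acts/{actorId}/runs` endpoint with a bearer token; the `actor_id` value is a placeholder you would replace with this Actor's real ID from its Store page, and `build_run_request` is a hypothetical helper name.

```python
import json
from urllib.request import Request


def build_run_request(actor_id: str, token: str, actor_input: dict) -> Request:
    """Prepare a POST request that starts an Actor run with the given input."""
    url = f"https://api.apify.com/v2/acts/{actor_id}/runs"
    return Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),  # input goes in the JSON body
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Sending the prepared request with `urllib.request.urlopen` returns a JSON document describing the new run, including the ID of its default dataset.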

❓ FAQ

Do I need a Bluesky account?

No. The AT Protocol public AppView at public.api.bsky.app/xrpc/ is unauthenticated by design — every endpoint this Actor calls is open to anyone without a login or API key.

What is a feed URI?

An AT URI like at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.feed.generator/whats-hot. The segment between at:// and the next slash is a DID (a decentralized identifier). For feeds, the collection segment is always app.bsky.feed.generator, and the final segment is the rkey (record key) that identifies the specific feed. You can also paste a bsky.app web URL and the Actor converts it automatically.
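
The three parts of a feed URI can be pulled apart with a few lines of string handling. This is a simplified sketch that only handles the three-segment record form shown here; the `AtUri` and `parse_at_uri` names are illustrative.

```python
from typing import NamedTuple


class AtUri(NamedTuple):
    authority: str   # the DID (or handle) that owns the record
    collection: str  # e.g. app.bsky.feed.generator
    rkey: str        # record key identifying the specific feed


def parse_at_uri(uri: str) -> AtUri:
    """Split an at://<authority>/<collection>/<rkey> URI into its parts."""
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT URI: {uri!r}")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected authority/collection/rkey: {uri!r}")
    return AtUri(*parts)
```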

How do I scrape all feeds published by a single creator?

Set the creatorHandle input to the creator's Bluesky handle (e.g. bsky.app) and leave feedUri blank. The Actor calls app.bsky.feed.getActorFeeds and scrapes each feed in turn, up to the maxFeeds cap.
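
For readers who want to see the underlying requests, the sketch below builds the XRPC URLs for the two endpoints involved. It assumes the `public.api.bsky.app/xrpc/` host named in this document and the `actor`, `feed`, `limit`, and `cursor` query parameters from the public lexicons; `xrpc_url` is a hypothetical helper.

```python
from typing import Union
from urllib.parse import urlencode

APPVIEW = "https://public.api.bsky.app/xrpc"


def xrpc_url(method: str, **params: Union[str, int, None]) -> str:
    """Build an unauthenticated XRPC GET URL, skipping parameters set to None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{APPVIEW}/{method}?{query}" if query else f"{APPVIEW}/{method}"


# Step 1: enumerate the creator's feeds (discovery mode).
feeds_url = xrpc_url("app.bsky.feed.getActorFeeds", actor="bsky.app", limit=10)

# Step 2: page through one of the returned feed URIs.
posts_url = xrpc_url(
    "app.bsky.feed.getFeed",
    feed="at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.feed.generator/whats-hot",
    limit=100,
    cursor=None,  # first page; pass the cursor from the previous response afterwards
)
```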

Can I scrape Bluesky's built-in feeds like "Discover" or "What's Hot"?

Yes. Those are published by the bsky.app account. Use the feed URI at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.feed.generator/whats-hot for "Discover" / "What's Hot", or paste the bsky.app/profile/bsky.app/feed/whats-hot URL. You can also use creator-discovery mode with creatorHandle: "bsky.app" to get all feeds that account publishes.

Why is useProxy off by default?

The AT Protocol public API does not block datacenter IPs, so direct routing is faster and free. Enable proxy only if you are behind a restrictive ISP or a firewall that blocks outbound connections to public.api.bsky.app.

Is scraping public Bluesky feeds legal?

The AT Protocol is an open, federated protocol. public.api.bsky.app is explicitly unauthenticated and publicly accessible without login. The Bluesky Terms of Service permit accessing public data programmatically as long as you do not impersonate users or violate the AT Protocol's data-portability principles. Always verify the current Terms of Service at bsky.social/about/support/tos and your local jurisdiction's data-protection rules before using scraped data for commercial purposes.

How do I export to Google Sheets?

After the run finishes, click Export → CSV in the Apify Console and import the file into Google Sheets. Alternatively, use the Apify API URL shown in the run's Output tab to import data directly via =IMPORTDATA("...") in Sheets.

What happens if a feed is empty?

The Actor exits with a non-zero status code and a clear status message: "No posts emitted — feed may be empty, private, or the URI invalid." The dataset will have zero rows. Check that the feed URI is correct and the feed is publicly visible on bsky.app.

Related Actors

  • Bluesky Starter Pack Scraper — companion Actor in the Bluesky Intel Suite; exports full member lists from any public Bluesky Starter Pack. Pair with this Actor to cross-reference feed posts with community membership data.

💬 Your feedback

Found a bug, hit a rate limit, or need a new field on the output row? Open an issue on the Actor's Apify Store page or contact the Devil Scrapes team at apify.com/DevilScrapes. We ship updates within days of validated reports.