Bluesky Feed Posts Scraper
Export posts from any public Bluesky custom or algorithm feed. No login needed. Feed metadata and engagement counts via the unauthenticated AT Protocol public API.
Pricing
Pay per event
Developer
DevilScrapes
We do the dirty work so your dataset stays clean. 😈
$2.05 / 1,000 posts — Export posts from any public Bluesky custom feed or algorithm feed, including curated feeds like "Discover" and "What's Hot" and community-built feed generators, via the unauthenticated AT Protocol public AppView API. No Bluesky account. No API key. No browser automation.
This Actor calls Bluesky's public app.bsky.feed.getFeed endpoint, denormalises feed metadata into every row, and emits a flat dataset ready for direct analysis in spreadsheets, BI tools, or SQL — no joins required.
🎯 What this scrapes
Two operating modes, controlled by which input field you set:
- Single-feed mode: provide a feed URI (AT URI or `bsky.app/profile/.../feed/...` web URL) and the Actor exports every post in that feed up to the per-feed cap.
- Creator-discovery mode: provide a creator's Bluesky handle (e.g. `bsky.app`) and the Actor calls `app.bsky.feed.getActorFeeds` to enumerate every feed that creator publishes, then scrapes each one in turn.
For each post you receive the post body, engagement counts (likes, reposts, replies, quotes), author handle and DID, post CID, and indexing timestamp — plus the parent feed's display name, creator handle, and description denormalised onto the row so a CSV export is entirely self-contained.
| Field | Type | Description |
|---|---|---|
| `feed_uri` | string | AT URI of the feed generator |
| `feed_display_name` | string | Human-readable feed name (e.g. Discover) |
| `feed_creator_handle` | string | Bluesky handle of the feed creator |
| `feed_description` | string \| null | Feed description text set by the creator |
| `post_uri` | string | AT URI of the post |
| `post_cid` | string | Content identifier (CID) of the post record |
| `post_indexed_at` | string | ISO 8601 datetime the post was indexed by the AppView |
| `post_text` | string | Body text of the post |
| `post_lang` | string \| null | Primary language code (e.g. en), if present |
| `post_reply_count` | integer | Number of replies |
| `post_repost_count` | integer | Number of reposts |
| `post_like_count` | integer | Number of likes |
| `post_quote_count` | integer | Number of quote posts |
| `author_did` | string | Decentralized identifier of the post author |
| `author_handle` | string | Bluesky handle of the post author |
| `author_display_name` | string \| null | Display name of the post author |
| `scraped_at` | string | ISO 8601 UTC datetime this row was written |
🔥 Features
- No Bluesky account required: uses the public unauthenticated AppView API at `public.api.bsky.app`.
- Two operating modes: single feed URI or discover-all-feeds-by-creator via `getActorFeeds`.
- Accepts either AT URIs (`at://did:plc:.../app.bsky.feed.generator/whats-hot`) or `bsky.app` web URLs; the Actor rewrites web URLs to AT URI form automatically.
- Denormalised output: feed metadata (name, description, creator handle) on every post row, no joins needed for downstream analytics or CSV exports.
- Cursor-based pagination with a client-side per-feed cap so you only pay for what you need.
- Exponential backoff with `Retry-After` honoured for `408 / 429 / 503` responses; max 5 attempts.
- Pure HTTP client (`curl-cffi` with browser fingerprint impersonation): no browser automation, low compute footprint.
- Pydantic v2 input validation with XOR guard: exactly one of `feedUri` or `creatorHandle` must be set.
- Pairs with the companion `bluesky-starter-pack` Actor as the Bluesky Intel Suite.
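The cursor-based pagination with a client-side cap listed above can be sketched as follows. This is a minimal illustration, not the Actor's actual code: `paginate`, `fake_fetch`, and `FAKE_FEED` are hypothetical names, and the page fetcher is an in-memory stand-in for an HTTP call to `getFeed`.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# A page fetcher takes a cursor (None for the first page) and returns
# (items, next_cursor). In a real run it would wrap an HTTP request.
FetchPage = Callable[[Optional[str]], Tuple[List[Dict], Optional[str]]]


def paginate(fetch_page: FetchPage, max_posts: int) -> Iterator[Dict]:
    """Yield items page by page until the cap or the end of the cursor chain."""
    emitted = 0
    cursor: Optional[str] = None
    while emitted < max_posts:
        items, cursor = fetch_page(cursor)
        for item in items:
            if emitted >= max_posts:
                return  # cap reached mid-page: discard the remainder
            yield item
            emitted += 1
        # No cursor (or an empty page) means there is nothing further to read.
        if not cursor or not items:
            return


# Hypothetical in-memory feed of 7 posts, served 3 per page, for illustration.
FAKE_FEED = [{"post_uri": f"post-{i}"} for i in range(7)]


def fake_fetch(cursor: Optional[str]) -> Tuple[List[Dict], Optional[str]]:
    start = int(cursor or 0)
    page = FAKE_FEED[start:start + 3]
    end = start + len(page)
    return page, (str(end) if end < len(FAKE_FEED) else None)
```

Because the cap is enforced on the client, a feed with fewer posts than the cap simply ends early when the cursor runs out.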
💡 Use cases
- Algorithm research — sample what posts the "Discover" / "What's Hot" algorithmic feeds actually surface across days or weeks; analyse topic drift and amplification patterns.
- Newsroom monitoring — subscribe to curated topic feeds for breaking-news posts on specific beats, then pipe to Slack or a Google Sheet via Apify integrations.
- Marketing intelligence — see which posts are amplified by community feeds in your niche; measure which content formats dominate each feed's engagement distribution.
- Creator analytics — pull every post a niche feed generator surfaces and rank by like / repost / quote ratios to benchmark your own posts against feed peers.
- Dataset bootstrap — collect labelled training data from topic-curated feeds for downstream NLP or sentiment models without manual tagging of raw timelines.
- Competitive monitoring — track community-curated feeds that aggregate competitor announcements, support complaints, or product mentions.
- Academic social-media research — Bluesky's public AT Protocol data is significantly more accessible than Twitter/X's API; this Actor is a low-cost entry point for longitudinal feed studies.
⚙️ How to use it
- Open the Actor input form.
- Either paste a feed AT URI or `bsky.app` web URL into Feed URI or URL (single-feed mode) or type a Bluesky handle into Creator handle (discovery mode). Setting both is an error; setting neither is also an error. The Actor fails fast before making any network call.
- Adjust Max posts per feed (default 100, maximum 5000).
- In discovery mode, adjust Max feeds to cap how many of the creator's feeds are scraped (default 5, maximum 50).
- Leave Use Apify Proxy off unless you are behind a restrictive ISP — the AT Protocol public API does not block datacenter IPs, so direct routing is faster and free.
- Click Start and watch the run log. Results stream into the default dataset in real time and can be downloaded as JSON, CSV, Excel, or XML via the Export button.
Finding a feed URI
Every Bluesky feed has a bsky.app URL in the form https://bsky.app/profile/<creator>/feed/<rkey>. Examples:
- `https://bsky.app/profile/bsky.app/feed/whats-hot` (Bluesky's "Discover" feed)
- `https://bsky.app/profile/bsky.app/feed/with-friends` ("With Friends")
Paste the full URL into the Feed URI or URL field and this Actor converts it to AT URI form internally. You can also paste the raw AT URI directly if you have it.
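The conversion from web URL to AT URI might look roughly like the sketch below. It is an assumption about the shape of the logic, not the Actor's source: `parse_feed_web_url` and `to_at_uri` are hypothetical helpers, and the step that resolves a handle to a DID (a network call) is left out, so the DID is passed in directly.

```python
from urllib.parse import urlparse

FEED_COLLECTION = "app.bsky.feed.generator"


def parse_feed_web_url(url: str) -> tuple:
    """Return (creator, rkey) from a bsky.app/profile/<creator>/feed/<rkey> URL."""
    # Accept URLs pasted without a scheme, e.g. "bsky.app/profile/...".
    parsed = urlparse(url if "://" in url else f"https://{url}")
    parts = [p for p in parsed.path.split("/") if p]
    is_feed_path = len(parts) == 4 and parts[0] == "profile" and parts[2] == "feed"
    if parsed.netloc != "bsky.app" or not is_feed_path:
        raise ValueError(f"not a bsky.app feed URL: {url!r}")
    return parts[1], parts[3]


def to_at_uri(creator_did: str, rkey: str) -> str:
    """Build the feed generator AT URI once the creator is known as a DID."""
    if not creator_did.startswith("did:"):
        raise ValueError("creator must be resolved to a DID first")
    return f"at://{creator_did}/{FEED_COLLECTION}/{rkey}"


creator, rkey = parse_feed_web_url("https://bsky.app/profile/bsky.app/feed/whats-hot")
# The DID below is the one used for bsky.app in this README's examples.
print(to_at_uri("did:plc:z72i7hdynmk6r22z27h6tvur", rkey))
```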
📥 Input
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `feedUri` | string | one-of | — | AT URI or `bsky.app/profile/<handle>/feed/<rkey>` URL of one feed |
| `creatorHandle` | string | one-of | — | Bluesky handle or DID; drives `getActorFeeds` discovery |
| `maxPostsPerFeed` | integer | no | 100 | Max post rows emitted per feed (1–5000) |
| `maxFeeds` | integer | no | 5 | Max feeds processed in discovery mode (1–50) |
| `useProxy` | boolean | no | false | Route requests through Apify Proxy (BUYPROXIES94952) |
Exactly one of feedUri and creatorHandle must be set. Setting both, or neither, causes the Actor to exit immediately with a clear error message.
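The exactly-one-of rule and the numeric ranges from the table can be illustrated with a standard-library sketch. The Actor itself is described as using Pydantic v2; the dataclass below is a stand-in chosen to keep the example dependency-free, and the class name `ActorInput` and the error messages are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActorInput:
    feedUri: Optional[str] = None
    creatorHandle: Optional[str] = None
    maxPostsPerFeed: int = 100
    maxFeeds: int = 5
    useProxy: bool = False

    def __post_init__(self) -> None:
        # Treat empty or whitespace-only strings as "not set".
        has_feed = bool(self.feedUri and self.feedUri.strip())
        has_creator = bool(self.creatorHandle and self.creatorHandle.strip())
        # XOR guard: both set and neither set are equally invalid.
        if has_feed == has_creator:
            raise ValueError("Set exactly one of feedUri or creatorHandle")
        if not 1 <= self.maxPostsPerFeed <= 5000:
            raise ValueError("maxPostsPerFeed must be between 1 and 5000")
        if not 1 <= self.maxFeeds <= 50:
            raise ValueError("maxFeeds must be between 1 and 50")
```

Validating in the constructor means a bad input fails before any network call is attempted, which matches the fail-fast behaviour described above.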
Single-feed mode example
{"feedUri": "at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.feed.generator/whats-hot","maxPostsPerFeed": 100,"useProxy": false}
Creator-discovery mode example
{"creatorHandle": "bsky.app","maxPostsPerFeed": 50,"maxFeeds": 10,"useProxy": false}
📤 Output
One row per post. Feed metadata is denormalised onto every row so a flat CSV is self-contained.
{"feed_uri": "at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.feed.generator/whats-hot","feed_display_name": "Discover","feed_creator_handle": "bsky.app","feed_description": "Trending content from your personal network","post_uri": "at://did:plc:sj5wj7libgr7omqiotenxadx/app.bsky.feed.post/3mlxmr4jyfs2s","post_cid": "bafyreidgimgd7v3g3pazsp5oq7ur6bvedpnwohul26mss7cbffg6bdqjkm","post_indexed_at": "2026-05-16T10:20:40.467Z","post_text": "If you never read the book or saw the movie, you missed one of the greatest Pulitzer Prize winning sagas ever written.","post_lang": "en","post_reply_count": 89,"post_repost_count": 414,"post_like_count": 1288,"post_quote_count": 27,"author_did": "did:plc:sj5wj7libgr7omqiotenxadx","author_handle": "louiseplease.bsky.social","author_display_name": "Louise","scraped_at": "2026-05-16T12:00:00+00:00"}
Optional fields (feed_description, post_lang, author_display_name) are emitted as null when the API does not return them. Rows are never dropped for missing optional fields.
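One way the denormalised row could be assembled is sketched below. The input field names (`post`, `record`, `langs`, `replyCount`, and so on) reflect the shape of a `getFeed` feed item as assumed here, and `flatten_post` is a hypothetical helper rather than the Actor's own function; the output keys follow the field table in this README.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def flatten_post(feed_meta: Dict[str, Any], item: Dict[str, Any]) -> Dict[str, Any]:
    """Merge feed metadata and one feed item into a single flat row."""
    post = item["post"]
    author = post.get("author", {})
    record = post.get("record", {})
    langs = record.get("langs") or []
    return {
        "feed_uri": feed_meta["uri"],
        "feed_display_name": feed_meta["displayName"],
        "feed_creator_handle": feed_meta["creator"]["handle"],
        "feed_description": feed_meta.get("description"),  # None if absent
        "post_uri": post["uri"],
        "post_cid": post["cid"],
        "post_indexed_at": post["indexedAt"],
        "post_text": record.get("text", ""),
        "post_lang": langs[0] if langs else None,  # None if absent
        "post_reply_count": post.get("replyCount", 0),
        "post_repost_count": post.get("repostCount", 0),
        "post_like_count": post.get("likeCount", 0),
        "post_quote_count": post.get("quoteCount", 0),
        "author_did": author.get("did"),
        "author_handle": author.get("handle"),
        "author_display_name": author.get("displayName"),  # None if absent
        "scraped_at": datetime.now(timezone.utc).replace(microsecond=0).isoformat(),
    }
```

Using `dict.get` for the optional fields is what keeps a row from being dropped when the API omits them: the key is still present, with a `None` value that serialises to `null`.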
Export formats
After a run completes, click Export in the Apify Console to download:
- JSON — full fidelity, all fields, newline-delimited
- CSV — flat, one row per post, all columns including denormalised feed metadata
- Excel: `.xlsx` via the Apify dataset converter
- XML: structured per-item
All formats are available via the Apify API: GET /datasets/{id}/items?format=csv&clean=true.
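For scripted exports, the request URL can be built as in the sketch below. The base URL `https://api.apify.com/v2` and the helper name `dataset_items_url` are assumptions for illustration; depending on how the dataset is shared, an API token may also be required, which this sketch does not handle.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"  # assumed Apify API v2 base URL


def dataset_items_url(dataset_id: str, fmt: str = "csv", clean: bool = True) -> str:
    """Build the dataset items URL for a given export format."""
    query = urlencode({"format": fmt, "clean": "true" if clean else "false"})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


# "abc123" is a placeholder dataset ID.
print(dataset_items_url("abc123"))
```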
💰 Pricing
Pay-Per-Event (PPE) — you pay only for what you use:
| Event | Price (USD) | When |
|---|---|---|
| `actor-start` | $0.05 | Once per run, at boot |
| `result-row` | $0.002 | Per post row written to the dataset |
Example costs
| Posts scraped | Actor starts | Total cost |
|---|---|---|
| 100 | 1 | $0.25 |
| 500 | 1 | $1.05 |
| 1,000 | 1 | $2.05 |
| 5,000 | 1 | $10.05 |
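With the maximum of 50 feeds at the default cap of 100 posts each (50 × 100 = 5,000 rows), a single run costs $10.05. The input limits allow much larger runs: 50 feeds × 5,000 posts = 250,000 rows, which would cost $500.05, so set both caps deliberately.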
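The example costs follow directly from the two event prices, and a small calculator makes it easy to estimate other run sizes. `run_cost` is a hypothetical helper; `Decimal` is used so that the cents stay exact.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.05")   # charged once per run
RESULT_ROW = Decimal("0.002")   # charged per post row written


def run_cost(rows: int, starts: int = 1) -> Decimal:
    """Total pay-per-event cost in USD for a given number of rows."""
    return ACTOR_START * starts + RESULT_ROW * rows


for rows in (100, 500, 1000, 5000):
    print(rows, run_cost(rows))
```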
This rate is consistent with the companion Actor bluesky-starter-pack so the Bluesky Intel Suite has uniform pricing across both tools.
🚧 Limitations
- Private or access-restricted feeds are not exposed by the public AppView API: only feeds whose data is visible at `public.api.bsky.app` can be scraped.
- Global feed discovery by keyword is not supported: Bluesky's `getPopularFeedGenerators` endpoint returns `MethodNotImplemented` on the public AppView. Use creator-discovery mode (`creatorHandle`) to enumerate one creator's feeds at a time.
- Post images, embeds, and quoted-post bodies are not extracted: only the plain-text body (`post_text`) is captured. Image ALT text, external link cards, and quoted-post content are outside the current schema.
- Reply thread expansion is out of scope: only the post that appears in the feed is emitted as a row. Threaded context (parent/root posts) would require additional `getPostThread` calls and is not wired in this version.
- The `maxPostsPerFeed` cap is client-side: the Actor paginates until it has collected the cap or the API cursor is exhausted. If a feed has fewer posts than the cap, fewer rows are returned. This is expected behaviour, not a failure.
- The Apify FREE tier retains run-scoped storage for 7 days only. For longer retention, export your dataset immediately after the run completes or upgrade to a paid Apify plan.
- Rate limiting: the public AppView may rate-limit high-frequency requests. The Actor retries on `429` with exponential backoff, but very large scrapes (tens of thousands of rows) may require splitting into multiple runs.
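The retry policy described in this README (exponential backoff, `Retry-After` honoured, retries on 408 / 429 / 503, at most 5 attempts) could be expressed along these lines. The base delay, the 60-second ceiling, and the jitter parameter are illustrative assumptions, as are the function names; only the status codes and the attempt limit come from the text.

```python
import random
from typing import Mapping

RETRYABLE_STATUSES = {408, 429, 503}
MAX_ATTEMPTS = 5


def should_retry(status: int, attempt: int) -> bool:
    """attempt is 1-based; no retry once the attempt budget is spent."""
    return status in RETRYABLE_STATUSES and attempt < MAX_ATTEMPTS


def retry_delay(attempt: int, headers: Mapping[str, str],
                base: float = 1.0, cap: float = 60.0, jitter: float = 0.0) -> float:
    """Seconds to wait before the next attempt.

    A numeric Retry-After header wins; otherwise the delay doubles per attempt.
    The header lookup is case-sensitive here, so pass normalised header names.
    """
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # HTTP-date form is not handled in this sketch
    delay = min(cap, base * 2 ** (attempt - 1))
    return delay + random.uniform(0, jitter)
```

With the defaults above, successive waits without a `Retry-After` header would be 1, 2, 4, and 8 seconds before the fifth and final attempt.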
Tips for best results
- Use AT URIs when possible. The Actor resolves `bsky.app` web URLs on the fly (one extra `getProfile` API call), which adds latency. Pasting the AT URI directly skips this step.
- Cap `maxPostsPerFeed` to what you actually need. Feeds like "Discover" can have hundreds of posts; setting a lower cap keeps cost and runtime predictable.
- Prefer creator-discovery mode for bulk collection. If you want posts from all feeds by a creator like `bsky.app`, use `creatorHandle: "bsky.app"` rather than multiple single-feed runs; the Actor handles pagination for each feed sequentially.
- Schedule recurring runs to track feed evolution. Set up an Apify Schedule to run this Actor daily or weekly on a specific feed. Use a named dataset (via Apify API `datasetName` parameter at run time) to accumulate rows across runs.
- Use the CSV export for spreadsheet workflows. Because feed metadata is denormalised onto every row, no pivot or VLOOKUP is needed; the CSV is immediately usable in Google Sheets or Excel.
- Combine with `bluesky-starter-pack`. If you want both the posts from a community feed and the member list of the Starter Pack that drives that community, run both Actors and join on `author_handle`.
Integrations
This Actor works natively with the Apify platform's built-in connectors:
- Apify API: trigger runs programmatically, poll for status, and fetch dataset items via REST. Full OpenAPI spec at `https://docs.apify.com/api/v2`.
- Webhooks: configure a webhook to POST the run result to your endpoint as soon as the Actor finishes.
- Apify Schedules — run this Actor on a cron schedule (e.g. daily at 08:00 UTC) to keep a feed dataset fresh.
- Make (formerly Integromat) — use the Apify Make module to trigger runs and route results to Google Sheets, Airtable, Slack, or anywhere Make connects.
- Zapier — Apify's Zapier integration triggers on run completion and passes dataset items downstream.
- n8n — use the HTTP Request node with the Apify REST API for fully self-hosted automation pipelines.
❓ FAQ
Do I need a Bluesky account?
No. The AT Protocol public AppView at public.api.bsky.app/xrpc/ is unauthenticated by design — every endpoint this Actor calls is open to anyone without a login or API key.
What is a feed URI?
An AT URI like `at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.feed.generator/whats-hot`. The part between `at://` and the next slash is a DID, a decentralized identifier. The middle segment is the collection, which for feeds is always `app.bsky.feed.generator`. The final segment is the rkey (record key) that identifies the specific feed. You can also just paste a `bsky.app` web URL and the Actor converts it automatically.
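To make the three parts concrete, here is a small parsing sketch. `AtUri` and `parse_at_uri` are hypothetical names, and the parser only handles the three-segment record form shown in this README, not every variant of the AT URI scheme.

```python
from typing import NamedTuple


class AtUri(NamedTuple):
    authority: str   # the DID (or handle) that owns the record
    collection: str  # e.g. app.bsky.feed.generator
    rkey: str        # record key identifying the specific feed


def parse_at_uri(uri: str) -> AtUri:
    """Split at://<authority>/<collection>/<rkey> into its three segments."""
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT URI: {uri!r}")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected at://<authority>/<collection>/<rkey>: {uri!r}")
    return AtUri(*parts)


parsed = parse_at_uri("at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.feed.generator/whats-hot")
print(parsed.authority, parsed.collection, parsed.rkey)
```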
How do I scrape all feeds published by a single creator?
Set the creatorHandle input to the creator's Bluesky handle (e.g. bsky.app) and leave feedUri blank. The Actor calls app.bsky.feed.getActorFeeds and scrapes each feed in turn, up to the maxFeeds cap.
Can I scrape Bluesky's built-in feeds like "Discover" or "What's Hot"?
Yes. Those are published by the bsky.app account. Use the feed URI at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.feed.generator/whats-hot for "Discover" / "What's Hot", or paste the bsky.app/profile/bsky.app/feed/whats-hot URL. You can also use creator-discovery mode with creatorHandle: "bsky.app" to get all feeds that account publishes.
Why is useProxy off by default?
The AT Protocol public API does not block datacenter IPs, so direct routing is faster and free. Enable proxy only if you are behind a restrictive ISP or a firewall that blocks outbound connections to public.api.bsky.app.
Is scraping public Bluesky feeds legal?
The AT Protocol is an open, federated protocol. public.api.bsky.app is explicitly unauthenticated and publicly accessible without login. The Bluesky Terms of Service permit accessing public data programmatically as long as you do not impersonate users or violate the AT Protocol's data-portability principles. Always verify the current Terms of Service at bsky.social/about/support/tos and your local jurisdiction's data-protection rules before using scraped data for commercial purposes.
How do I export to Google Sheets?
After the run finishes, click Export → CSV in the Apify Console and import the file into Google Sheets. Alternatively, use the Apify API URL shown in the run's Output tab to import data directly via =IMPORTDATA("...") in Sheets.
What happens if a feed is empty?
The Actor exits with a non-zero status code and a clear status message: "No posts emitted — feed may be empty, private, or the URI invalid." The dataset will have zero rows. Check that the feed URI is correct and the feed is publicly visible on bsky.app.
Related Actors
- Bluesky Starter Pack Scraper — companion Actor in the Bluesky Intel Suite; exports full member lists from any public Bluesky Starter Pack. Pair with this Actor to cross-reference feed posts with community membership data.
💬 Your feedback
Found a bug, hit a rate limit, or need a new field on the output row? Open an issue on the Actor's Apify Store page or contact the Devil Scrapes team at apify.com/DevilScrapes. We ship updates within days of validated reports.