Pricing

from $2.91 / 1,000 listings

Substack Scraper — Publication Posts | $1.50/1K

Scrape any Substack newsletter's post list via the official Substack public API. No auth, no proxy. Title, subtitle, date, free/paid audience, type, reactions, restacks, podcast_url. Podcast posts billed at premium rate ($2.50/1K). Pay per post.

Developer: Vitalii Bondarev

Last modified: 23 days ago

Substack Scraper — Publication Posts & Metadata | $1.50/1K | No Auth, Official API

For newsletter researchers, content agencies, competitive intelligence teams, and AI pipelines that need Substack content at scale.

Pricing: $1.50 per 1,000 post records · $2.50 per 1,000 podcast posts (posts where type=podcast and podcast_url is present — audio file URL included). No monthly fees. No authentication required.

Scrape any Substack publication's post listing via the official public REST API, with no authentication, proxy, or browser required. Returns structured metadata for every post: title, subtitle, publish date, audience (free vs paid), post type, reactions, comments, restacks, cover image, word count, and canonical URL.

Pay per post returned (pay-per-event, or PPE, pricing).

What you get

Field	Description
`title`	Post title
`subtitle`	Deck / tagline
`canonical_url`	Full URL to the post
`slug`	URL slug
`post_date`	Published timestamp (ISO 8601 UTC)
`audience`	`everyone` (free) or `only_paid` (paywalled)
`type`	`newsletter`, `podcast`, `video`, etc.
`podcast_url`	Audio file URL (podcast posts only)
`reactions_count`	Total hearts
`comment_count`	Number of comments
`restacks`	Number of Substack reposts
`cover_image`	Cover image URL
`wordcount`	Approximate word count
`publication_slug`	Short publication identifier
`parse_confidence`	Data quality score 0–1
`warnings`	List of missing-field codes
`scraped_at`	Timestamp of the scrape (ISO 8601 UTC)

Note: Post bodies (full text / HTML) are not returned by the listing API. Paywalled posts return metadata only — body content requires a paid subscription and is not scraped.

Pricing example

$1.50 per 1,000 newsletter posts · $2.50 per 1,000 podcast posts (posts where type=podcast and podcast_url is present). A 500-post archive = $0.75. Scraping 5 newsletters × 200 posts = $1.50. No per-run fee.
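The arithmetic above can be checked with a few lines of Python. This is only a sketch: the two rates are the ones quoted in this listing, while `estimate_cost` is a hypothetical helper name and not part of the actor.

```python
from decimal import Decimal

# Rates quoted in this listing, in USD per 1,000 records.
POST_RATE_PER_1K = Decimal("1.50")
PODCAST_RATE_PER_1K = Decimal("2.50")


def estimate_cost(newsletter_posts: int, podcast_posts: int = 0) -> Decimal:
    """Estimate the cost of a run in USD (hypothetical helper)."""
    cost = (
        POST_RATE_PER_1K * newsletter_posts
        + PODCAST_RATE_PER_1K * podcast_posts
    ) / 1000
    return cost.quantize(Decimal("0.01"))


print(estimate_cost(500))      # 500-post archive -> 0.75
print(estimate_cost(5 * 200))  # 5 newsletters x 200 posts -> 1.50
```

Using `Decimal` rather than floats keeps the cents exact when many small runs are summed.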

Sample output

{
  "title": "The Collapse of Web Scraping",
  "subtitle": "Why every major site now requires a browser — and what to do about it",
  "canonical_url": "https://on.substack.com/p/the-collapse-of-web-scraping",
  "post_date": "2026-05-18T14:00:00Z",
  "audience": "everyone",
  "type": "newsletter",
  "podcast_url": null,
  "reactions_count": 312,
  "comment_count": 47,
  "restacks": 89,
  "wordcount": 2100,
  "publication_slug": "on",
  "parse_confidence": 1.0,
  "scraped_at": "2026-06-05T09:00:00Z"
}

Frequently asked questions

Do I need a Substack account or API key? No. The actor uses the official Substack public listing API — no authentication required for public publications.

Do I need a proxy? No. The Substack API is open. Zero proxy cost to you.

What formats does the output come in? JSON, CSV, and Excel via the Apify dataset. Native integration with n8n, Make, Zapier.

What if the publication returns a 403 or empty results? Some publications restrict their API access (e.g. bankless.substack.com). The actor logs the error and exits cleanly — it pushes nothing and charges nothing. Switch to a different slug and try again.

Input

Parameter	Type	Default	Description
`publication`	string	`on`	Slug (e.g. `on`), full URL (e.g. `https://on.substack.com`), or custom domain
`maxPosts`	integer	`100`	Max posts to return. `0` = no limit (fetch entire archive)

Publication examples

on                          →  on.substack.com
bankless                    →  bankless.substack.com
https://on.substack.com     →  on.substack.com (same)
https://platformer.news     →  custom domain (supported)
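A minimal sketch of how the three accepted input forms could be mapped to a base URL. The function name `resolve_base_url` is hypothetical, and treating a bare value that contains a dot as a custom domain is an assumption; the actor's real normalization logic is not documented here.

```python
from urllib.parse import urlparse


def resolve_base_url(publication: str) -> str:
    """Map a slug, Substack URL, or custom-domain URL to a base URL (hypothetical)."""
    value = publication.strip().rstrip("/")
    if "://" in value:
        # Full URL: keep only scheme and host, drop any path.
        parsed = urlparse(value)
        return f"{parsed.scheme}://{parsed.netloc}"
    if "." in value:
        # Assumption: a bare value with a dot is a custom domain.
        return f"https://{value}"
    # Plain slug such as "on" or "bankless".
    return f"https://{value}.substack.com"


for example in ["on", "bankless", "https://on.substack.com", "https://platformer.news"]:
    print(example, "->", resolve_base_url(example))
```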

How it works

Uses the Substack per-publication public REST endpoint:

GET https://<publication>.substack.com/api/v1/posts?offset=0&limit=50

Paginates via offset until all posts are retrieved or maxPosts is reached. No auth headers needed. No proxy required for public publications.
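The offset loop described above might look roughly like the sketch below. It assumes the endpoint returns a JSON array of post objects and that an empty array marks the end of the archive; `fetch_page` and `iter_posts` are illustrative names, not the actor's source code. The fetch function is injectable so the loop can be exercised without network access.

```python
import json
from typing import Callable, Dict, Iterator, List
from urllib.request import urlopen

PAGE_SIZE = 50  # matches limit=50 in the endpoint above


def fetch_page(base_url: str, offset: int, limit: int) -> List[Dict]:
    """Fetch one page over HTTP; assumes the response body is a JSON array."""
    url = f"{base_url}/api/v1/posts?offset={offset}&limit={limit}"
    with urlopen(url, timeout=30) as response:  # raises HTTPError on a 403
        return json.load(response)


def iter_posts(
    base_url: str,
    max_posts: int = 100,
    fetch: Callable[[str, int, int], List[Dict]] = fetch_page,
) -> Iterator[Dict]:
    """Yield posts until the archive ends or max_posts is reached (0 = no limit)."""
    offset = 0
    yielded = 0
    while True:
        page = fetch(base_url, offset, PAGE_SIZE)
        if not page:
            return  # empty page: archive exhausted
        for post in page:
            yield post
            yielded += 1
            if max_posts and yielded >= max_posts:
                return
        offset += len(page)
```

Advancing `offset` by the number of records actually received, rather than by a fixed page number, keeps the loop correct even if a page comes back shorter than requested.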

Our edge over incumbents

Reliable pagination — offset-based, not page-based; survives large archives.
Reactions normalized — raw reactions dict summed to reactions_count (compatible with publications adding new reaction types).
parse_confidence score — every record includes a data-quality score and warnings list so you can detect schema drift without re-running.
restacks field — Substack's repost count, absent from most competitor actors.
Custom domain support — not just *.substack.com slugs.
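The reactions normalization mentioned in the list above can be illustrated with a short sketch. It assumes the raw API exposes reactions as a mapping from reaction type to count; the helper name `total_reactions` is hypothetical.

```python
from typing import Dict, Optional


def total_reactions(reactions: Optional[Dict[str, int]]) -> int:
    """Sum all reaction types so newly added types are counted automatically."""
    if not reactions:
        return 0
    # Ignore any non-integer values defensively instead of failing the record.
    return sum(count for count in reactions.values() if isinstance(count, int))


print(total_reactions({"heart": 300, "clap": 12}))
```

Summing over whatever keys are present, instead of reading one fixed key, is what keeps the count stable if a publication gains a new reaction type.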

Limitations

Post bodies not included — listing API returns metadata only. Full HTML/Markdown bodies require the individual post endpoint (not in this actor's scope).
Paywalled posts — metadata is returned for all posts, but body content is not accessible without a paid subscription.
Publications blocking API access — some publications (e.g. bankless.substack.com) return 403; this is a publication-level restriction, not a Substack platform restriction.

Competitor comparison

	This scraper	Other Substack actors
Official Substack API	✓	partial
`restacks` field	✓	✗
Custom domain support	✓	✗
`podcast_url` field	✓	✗
parse_confidence on every record	✓	✗
No auth required	✓	✓

Podcast use case

The podcast_url field makes this actor useful for extracting Substack podcast episode lists without an RSS parser. Filter records where type == "podcast" and podcast_url is non-null.

Podcast posts are billed at the podcast-post premium event ($2.50/1K, set in the Apify console) because they include a direct audio file URL — useful for media monitoring tools, podcast discovery platforms, and content aggregators that need the actual audio stream. Regular newsletter posts are billed at the standard post-item rate ($1.50/1K).
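The filter described in this section is a one-liner over the dataset records. A sketch, assuming each record is a dict shaped like the sample output; `podcast_episodes` is an illustrative name.

```python
from typing import Dict, Iterable, List


def podcast_episodes(records: Iterable[Dict]) -> List[Dict]:
    """Keep records where type == "podcast" and podcast_url is non-null."""
    return [
        record
        for record in records
        if record.get("type") == "podcast" and record.get("podcast_url") is not None
    ]


sample = [
    {"title": "Essay", "type": "newsletter", "podcast_url": None},
    {"title": "Episode 1", "type": "podcast", "podcast_url": "https://example.com/ep1.mp3"},
]
for episode in podcast_episodes(sample):
    print(episode["title"], episode["podcast_url"])
```

The `example.com` URL is placeholder data, not real actor output.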

Monitoring use case

Track a competitor's newsletter for new posts — run daily and filter by post_date to see only new content. Set maxPosts=20 for fast incremental runs.
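One way to do the `post_date` filtering after each daily run is sketched below. It assumes `post_date` is an ISO 8601 UTC string as in the sample output and that the caller stores the timestamp of the previous run; `new_posts` and `parse_post_date` are hypothetical helper names.

```python
from datetime import datetime, timezone
from typing import Dict, Iterable, List


def parse_post_date(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; older Python versions reject a trailing "Z"."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def new_posts(records: Iterable[Dict], since: datetime) -> List[Dict]:
    """Return records published strictly after `since` (a timezone-aware datetime)."""
    return [r for r in records if parse_post_date(r["post_date"]) > since]


last_run = datetime(2026, 5, 17, tzinfo=timezone.utc)
records = [
    {"title": "Old post", "post_date": "2026-05-10T14:00:00Z"},
    {"title": "New post", "post_date": "2026-05-18T14:00:00Z"},
]
print([r["title"] for r in new_posts(records, last_run)])
```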

Use with AI Agents (MCP)

This Substack scraper is callable as a tool by AI agents (Claude Desktop, Cursor, VS Code, n8n, or any MCP-compatible client) via Apify's hosted Model Context Protocol server.

{
  "mcpServers": {
    "apify": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "https://mcp.apify.com/?tools=bovi/substack-publication",
        "--header",
        "Authorization: Bearer <YOUR_APIFY_TOKEN>"
      ]
    }
  }
}

Integrations

Built for newsletter researchers, content agencies, and AI-pipeline teams ingesting Substack post metadata at scale — the JSON/dataset output drops into the tools you already run, no glue code:

n8n / Make / Zapier: trigger a run or pipe every new dataset item into 500+ apps (Google Sheets, Airtable, Slack, HubSpot, your database) with no code.
Webhooks: call your own endpoint the moment a run finishes to push results straight into your pipeline.
MCP server: expose this actor as a tool to Claude, Cursor, or any MCP client so an AI agent can pull this data mid-conversation.
API & SDKs: fetch the dataset as JSON, CSV, or Excel through the Apify REST API or the Python / JS SDKs.
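For the Python SDK route, a run could be started and its dataset read roughly as follows. This is a sketch under assumptions: the actor ID is taken from the MCP URL earlier on this page, the input keys are the two parameters from the Input table, and the client object is expected to behave like `ApifyClient` from the `apify-client` package (`actor(...).call(run_input=...)` returning a run dict with `defaultDatasetId`, and `dataset(...).iterate_items()`). `run_and_fetch` is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Assumption: same actor ID as in the MCP server URL on this page.
ACTOR_ID = "bovi/substack-publication"


def run_and_fetch(client: Any, publication: str, max_posts: int = 100) -> List[Dict]:
    """Start the actor, wait for it to finish, and return the dataset items."""
    run = client.actor(ACTOR_ID).call(
        run_input={"publication": publication, "maxPosts": max_posts}
    )
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (requires `pip install apify-client` and a valid token):
# from apify_client import ApifyClient
# client = ApifyClient("<YOUR_APIFY_TOKEN>")
# posts = run_and_fetch(client, "on", max_posts=20)
```

Passing the client in as an argument keeps the helper easy to test with a stub and avoids hard-coding the token.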

