Substack Scraper: Newsletter Posts, Archives & Subscribers
Pricing
$1.00 / 1,000 result items
Scrape any Substack publication in seconds. Get the full post archive, single-post detail with body and comments, reactions, paid/free audience tier, podcast metadata - everything that Substack itself shows publicly. No browser, no proxies, no cookies, no Substack account needed. The actor calls Substack's official JSON API directly, so the data you get is the same data the web app gets, with full fidelity.
Works with every Substack publication: subdomains like stratechery.substack.com, custom domains like lennysnewsletter.com or astralcodexten.com, and paid newsletters (teaser content shown for paid posts).
Why use this Substack scraper?
- No auth, no setup - point at any publication URL and start scraping. Substack's archive API is fully public.
- Real data, full fidelity - 25+ fields per post: title, subtitle, body HTML, reaction counts, comment counts, audience tier (free/paid), podcast duration, author bylines, cover image, canonical URL.
- Custom-domain support - we follow redirects, so `lennysnewsletter.com` works exactly like `*.substack.com`.
- Date-range filters - `since` and `until` keep your runs scoped and cheap when monitoring on a schedule.
- Pay-per-result pricing - you only pay for posts you actually receive. Stop a run any time and the meter stops.
- API + scheduler + integrations - run from the Apify API, schedule with cron, and fire webhooks into Slack/Make/Zapier on every new post.
Use cases
- Competitor / market research - track top newsletters in your niche, see what topics drive reactions, monitor publication cadence.
- Content audits - export an entire newsletter's archive to spreadsheet, sort by reaction count, find your best-performing posts.
- Lead generation - identify active subscribers / commenters on industry newsletters (comment counts are public).
- NLP / research datasets - bulk-export posts with body text and metadata for sentiment, topic modeling, embedding indexes.
- News monitoring - schedule daily scrapes of trade newsletters, alert on new posts via Apify webhook.
- Migration backups - if you run a Substack, this is the easiest way to back up your full archive as JSON.
How to use the Substack scraper
- Pick an action from the What do you want to scrape? dropdown: `getArchive` for a publication's post list, `getPost` for one post's full body.
- Fill Substack publication URLs - subdomain (`https://stratechery.substack.com`) or custom domain (`https://www.lennysnewsletter.com`).
- Set Max posts per publication (default 100, 0 = unlimited).
- Optional: `since`/`until` ISO dates, `audience` filter (everyone / only_paid / only_free), `includeBody` to fetch the full HTML body for every post.
- Click Save & Start. Results stream into the Dataset tab in real time. Export as JSON, CSV, Excel, HTML, or XML.
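Put together, the steps above might translate into an input like this (the URL and date are examples; the exact shape of `publications` - plain strings vs. objects - follows the actor's input schema):

```json
{
  "action": "getArchive",
  "publications": ["https://www.lennysnewsletter.com"],
  "maxItems": 100,
  "since": "2026-01-01",
  "audience": "everyone",
  "includeBody": false
}
```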
Input
| Field | Required | What it does |
|---|---|---|
| `action` | yes | `getArchive` or `getPost` |
| `publications` | yes | Publication URLs (`getArchive`) or post URLs (`getPost`), one per line |
| `maxItems` | no | Max posts per publication for `getArchive` (0 = unlimited) |
| `since` / `until` | no | ISO date filter for `post_date` |
| `audience` | no | `everyone` (default), `only_paid`, or `only_free` |
| `includeBody` | no | Default `false`. Set `true` to fetch each post's `body_html` in the archive run (one extra API call per post). |
Output
Every run produces one Dataset with one item per result. Real example - getArchive on lennysnewsletter.com, 3 most recent posts:
```json
[
  {
    "_type": "post",
    "_action": "getArchive",
    "_publication": "https://www.lennysnewsletter.com",
    "title": "Why SaaS freemium playbooks don't work in AI, and what to do instead",
    "post_date": "2026-05-05T...",
    "audience": "only_paid",
    "canonical_url": "https://www.lennysnewsletter.com/p/...",
    "reaction_count": 292,
    "comment_count": 187,
    "word_count": 4521,
    "author_name": "Lenny Rachitsky"
  },
  {
    "_type": "post",
    "title": "Your Couch-to-5K for AI",
    "audience": "only_paid",
    "reaction_count": 363,
    "comment_count": 142
  },
  {
    "_type": "post",
    "title": "New: A free year of Cursor, Google AI Pro...",
    "audience": "everyone",
    "reaction_count": 297,
    "comment_count": 88
  }
]
```
You can download the dataset in various formats such as JSON, JSON-Lines, CSV, Excel, HTML or XML, or fetch programmatically via the Apify Dataset API.
Data fields
| Field | Type | Description |
|---|---|---|
| `_type` | string | `post` or `error` |
| `_action` | string | The action that produced this row |
| `_publication` | string | Publication base URL |
| `id` / `slug` | string | Substack internal id and URL slug |
| `title` / `subtitle` / `description` | string | Headlines |
| `post_date` | ISO 8601 | When the post was published |
| `type` | string | `newsletter`, `podcast`, `thread`, etc. |
| `audience` | string | `everyone` (free), `only_paid`, `only_free` |
| `canonical_url` | string | Web URL of the post |
| `cover_image` | string | Hero image URL |
| `word_count` | int | Word count (may be null for some posts) |
| `reaction_count` / `reactions` | int / object | Heart count + per-emoji breakdown |
| `comment_count` / `child_comment_count` | int | Top-level and total comments |
| `podcast_duration` | int | Seconds (podcast posts only) |
| `author_name` / `author_handle` / `author_id` | string | Primary author |
| `body_html` | string | Full HTML body. Empty in archive mode unless `includeBody=true`; always populated in `getPost` mode. |
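For a quick content audit, the fields above are enough to rank downloaded posts without any extra API calls. A sketch over a pre-fetched list of dataset items (the sample data here is ours):

```python
def top_posts(items, n=3):
    """Rank post items by reactions + comments, skipping error rows."""
    posts = [it for it in items if it.get("_type") == "post"]
    return sorted(
        posts,
        key=lambda p: (p.get("reaction_count") or 0) + (p.get("comment_count") or 0),
        reverse=True,
    )[:n]

items = [
    {"_type": "post", "title": "A", "reaction_count": 292, "comment_count": 187},
    {"_type": "post", "title": "B", "reaction_count": 363, "comment_count": 142},
    {"_type": "error"},
]
print([p["title"] for p in top_posts(items)])  # → ['B', 'A']
```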
Pricing - what does scraping Substack cost?
Pricing is pay-per-result - you pay only for posts you receive. A budget cap on each run means you never spend more than you allow.
Sample budgets at the published per-item price:
| Use case | Items | Approx. cost |
|---|---|---|
| Monitor a newsletter for 100 latest posts | 100 | ~$0.10 |
| Full archive of a 500-post publication | 500 | ~$0.50 |
| Daily scheduled scrape, 5 new posts/day, 30 days | 150 | ~$0.15 |
| 50-publication competitive scan, 20 posts each | 1 000 | ~$1.00 |
See the Pricing section on this page for the exact per-item rate.
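The sample budgets follow directly from the per-item rate. A one-liner to estimate a run, with the rate hardcoded to the published $1.00 / 1,000 items (check the Pricing section for the current figure):

```python
PRICE_PER_ITEM = 1.00 / 1000  # $1.00 per 1,000 result items

def estimated_cost(items: int) -> float:
    """Approximate run cost in USD at the published per-item rate."""
    return round(items * PRICE_PER_ITEM, 2)

print(estimated_cost(500))   # → 0.5
print(estimated_cost(1000))  # → 1.0
```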
Tips & advanced options
- Schedule it. Set `since` to the last-run timestamp in your scheduled task so each run only ingests new posts. Cost stays flat regardless of publication age.
- Skip bodies by default. `getArchive` returns metadata only by default - that's enough to know what's new. Use `getPost` separately when you actually need a full post's body.
- Custom domains work transparently. `https://www.lennysnewsletter.com` and `https://lenny.substack.com` both work; we follow Substack's redirects automatically.
- Audience filter is useful for "free-only" datasets. Paid posts return teaser content only - if you don't want them, set `audience: only_free`.
Integrations
- REST API - `POST /v2/acts/perconey~substack-scraper/runs`
- Scheduler - cron-style scheduling in the Apify console
- Webhooks - Slack, Discord, or custom endpoints on `RUN_SUCCEEDED`
- Sheets / Notion / Airtable / Google Drive via Apify Integrations
- Make / Zapier / n8n via the same catalog
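Starting a run over the REST API is a single POST whose body is the actor input JSON. A sketch that builds the request without sending it (the token is a placeholder; only the documented run endpoint is used):

```python
import json

ACTOR_ID = "perconey~substack-scraper"

def run_request(token: str, run_input: dict) -> tuple[str, bytes]:
    """Build URL and JSON body for POST /v2/acts/{actorId}/runs."""
    url = f"https://api.apify.com/v2/acts/{ACTOR_ID}/runs?token={token}"
    return url, json.dumps(run_input).encode()

# Send with any HTTP client, e.g.:
#   url, body = run_request("YOUR_APIFY_TOKEN", {"action": "getArchive", ...})
#   requests.post(url, data=body, headers={"Content-Type": "application/json"})
```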
FAQ
Do I need a Substack account? No. Every action works fully anonymously.
What about paid posts? Paid posts return teaser content (title, subtitle, preview, audience flag). The full body stays behind the paywall - the actor surfaces exactly what Substack itself exposes to non-subscribers.
Is this allowed by Substack's terms of service? The actor uses Substack's public JSON API the same way the web app does. Public data is publicly readable. Use the results responsibly - respect privacy, attribute creators, don't redistribute paid-only content.
Can I scrape comments too?
The comment_count field is included on every post. Full comment threads are a separate Substack endpoint - planned for a future release.
Rate limits?
Substack is generally lenient on the public archive endpoint. The actor paces requests at roughly 6 req/s and retries on 429, honoring Retry-After. Heavy parallel runs may still hit limits - start with a single run at a time.
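If you call Substack or the Apify API yourself in post-processing, the pacing described above amounts to a standard Retry-After loop. A generic sketch of that pattern (ours, not the actor's internal code); `fetch` is any callable returning `(status, retry_after_seconds, result)`:

```python
import time

def with_retry(fetch, max_attempts=3, default_wait=1.0):
    """Call fetch(); on a 429 response, wait per Retry-After and try again."""
    for attempt in range(max_attempts):
        status, retry_after, result = fetch()
        if status != 429:
            return result
        if attempt < max_attempts - 1:
            time.sleep(retry_after if retry_after is not None else default_wait)
    raise RuntimeError("still rate-limited after retries")
```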
Support & feedback
Bug, feature, or custom version? Open an issue from the Issues tab on the Apify page, or message @perconey.bsky.social on Bluesky.
Disclaimer: this scraper reads public Substack data only. Don't use it to harass writers, scrape paid content for redistribution, or violate Substack's Terms of Use.