Substack Scraper

Pricing: Pay per event
Scrape Substack newsletters — posts with full content, comments with nested replies, and publication metadata. Unlimited archive depth, no proxy needed. Export to JSON, CSV, Excel.

Rating: 0.0 (0 reviews)

Developer: Stas Persiianenko (Maintained by Community)

Actor stats: 0 bookmarks · 4 total users · 3 monthly active users · last modified 10 days ago


What does Substack Scraper do?

Substack Scraper extracts data from any Substack newsletter — posts with full HTML content, comments with nested replies, and publication metadata including subscriber counts. It supports unlimited archive depth (no 12-post cap), works with both *.substack.com and custom domain newsletters, and exports to JSON, CSV, Excel, or connects via API.

Unlike other scrapers, this actor uses Substack's public JSON API directly — no browser, no proxy, 100% success rate.
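
The offset-based paging behind this approach can be sketched as follows. This is an illustration only, not the actor's internal code; the /api/v1/archive endpoint path and its sort/offset/limit query parameters are an assumption based on Substack's publicly observable archive behavior:

```python
def archive_urls(base_url: str, total_posts: int, page_size: int = 12):
    """Yield paginated archive API URLs for a publication.

    Assumed endpoint shape: <base_url>/api/v1/archive?sort=new&offset=N&limit=M
    """
    for offset in range(0, total_posts, page_size):
        yield f"{base_url}/api/v1/archive?sort=new&offset={offset}&limit={page_size}"

# Three pages for a 30-post archive: offsets 0, 12, 24
urls = list(archive_urls("https://www.lennysnewsletter.com", 30))
```

Because each page is plain JSON, there is no browser rendering step, which is where the speed and reliability come from.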

Use cases

  • Content analysis — Download full newsletter archives for content audits, topic analysis, or AI training datasets
  • Market research — Track subscriber counts, posting frequency, and engagement metrics across multiple newsletters
  • Lead generation — Extract author profiles, social links, and publication metadata for B2B outreach
  • Competitor monitoring — Monitor competing newsletters for new posts, engagement trends, and pricing changes
  • Academic research — Build datasets of newsletter content with comments for sentiment analysis or discourse studies

Why use Substack Scraper?

  • Unlimited archive depth — Scrape the complete archive of any newsletter. No 12-post cap like the market leader
  • 100% success rate — Uses Substack's public JSON API. No anti-bot, no proxy needed, no failures
  • Full comment threads — Extract comments with nested replies, reaction counts, and author metadata
  • Publication metadata — Subscriber counts, pricing plans, author info, and 100+ publication fields
  • No proxy cost — Direct API access means zero proxy fees. Runs on minimal 256MB memory
  • Clean pay-per-event pricing — No hidden start fees or completion charges. Pay only for results
  • 66+ fields per post — The richest output of any Substack scraper on Apify Store
  • Custom domain support — Works with both newsletter.substack.com and custom domains like www.lennysnewsletter.com

What data can you extract?

Per post (30+ fields):

| Field | Description |
| --- | --- |
| title, subtitle, slug | Post title, subtitle, and URL slug |
| url | Full canonical URL |
| publishedAt, updatedAt | Publication and update timestamps |
| postType | newsletter, podcast, or thread |
| audience, isPaid | Paywall status (everyone or only_paid) |
| bodyHtml | Full HTML content (free posts) |
| wordcount | Total word count (even for paid posts) |
| coverImage | Cover image URL |
| tags | Post tags/categories |
| reactionCount, commentCount, restacks | Engagement metrics |
| authorName, authorHandle, authorBio | Author information |
| publicationName, subscriberCount | Newsletter metadata |

Per comment (12 fields): body, date, name, handle, reactionCount, isAuthor, isPinned, nested replies

Per publication: name, subscriberCount, baseUrl, paymentsEnabled, logoUrl, heroText, language
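
Since replies nest recursively, consuming the comment output means walking a tree. A small sketch (hypothetical helper, matching the nested replies field described above):

```python
def count_comments(comments: list) -> int:
    """Count comments in a thread, including all nested replies."""
    return sum(1 + count_comments(c.get("replies", [])) for c in comments)

# Two top-level comments, one of which has a single reply
thread = [
    {"body": "Great post!", "replies": [
        {"body": "Thanks!", "replies": []},
    ]},
    {"body": "Agreed.", "replies": []},
]
count_comments(thread)  # 3
```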

How much does it cost to scrape Substack?

This Actor uses pay-per-event pricing — you pay only for what you scrape. No monthly subscription. All platform costs are included.

| Event | Free plan | Starter ($49/mo) | Scale ($499/mo) |
| --- | --- | --- | --- |
| Start | $0.005 | $0.004 | $0.003 |
| Per post (metadata) | $0.001 | $0.0008 | $0.0006 |
| Per post (with content) | $0.002 | $0.0017 | $0.0014 |
| Per comment | $0.0005 | $0.0004 | $0.0003 |

Real-world cost examples:

| Scenario | Results | Duration | Cost (Free tier) |
| --- | --- | --- | --- |
| 1 newsletter, 50 posts (metadata) | 50 posts | ~3s | ~$0.06 |
| 1 newsletter, 50 posts (with content) | 50 posts | ~5s | ~$0.11 |
| 1 newsletter, 50 posts + comments | 50 posts + ~200 comments | ~15s | ~$0.21 |
| 1 newsletter, full archive (500 posts) | 500 posts | ~30s | ~$1.01 |
| 5 newsletters, 100 posts each | 500 posts | ~60s | ~$1.03 |
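
The Free-tier figures above reduce to simple arithmetic. A sketch, using the event prices from the pricing table and the assumption that each newsletter run fires one start event:

```python
# Free-plan event prices (from the pricing table above)
START, POST_CONTENT, COMMENT = 0.005, 0.002, 0.0005

def estimate_cost(posts: int, comments: int = 0, runs: int = 1) -> float:
    """Estimate a run's cost: start events plus per-post and per-comment events."""
    return runs * START + posts * POST_CONTENT + comments * COMMENT

round(estimate_cost(50), 3)                # 0.105 -> ~$0.11
round(estimate_cost(50, comments=200), 3)  # 0.205 -> ~$0.21
round(estimate_cost(500), 3)               # 1.005 -> ~$1.01
```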

How to scrape Substack newsletters

  1. Go to the Substack Scraper page on Apify Store
  2. Enter one or more newsletter URLs (e.g., https://www.lennysnewsletter.com)
  3. Choose your output options (content, comments, publication info)
  4. Set filters if needed (date range, content type, free posts only)
  5. Click Start and wait for results
  6. Download your data in JSON, CSV, Excel, or connect via API

Input parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| urls | array | required | Substack newsletter URLs. Accepts homepage, custom domain, post URLs, or /archive URLs |
| maxPostsPerNewsletter | integer | 100 | Max posts per newsletter. 0 = unlimited (full archive) |
| includeContent | boolean | true | Include full HTML body. Disable for metadata-only (faster, cheaper) |
| includeComments | boolean | false | Fetch comments for each post. Adds one API call per post |
| includePublicationInfo | boolean | true | Include newsletter metadata (subscriber count, pricing, author) |
| contentType | string | all | Filter: all, newsletter, podcast, or thread |
| startDate | string | — | Only posts after this date (YYYY-MM-DD) |
| endDate | string | — | Only posts before this date (YYYY-MM-DD) |
| onlyFree | boolean | false | Only include free posts. Skip paywalled content |
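
Putting the parameters together, a complete input might look like this (illustrative values; every key comes from the table above):

```json
{
  "urls": ["https://www.lennysnewsletter.com"],
  "maxPostsPerNewsletter": 0,
  "includeContent": true,
  "includeComments": true,
  "includePublicationInfo": true,
  "contentType": "newsletter",
  "startDate": "2026-01-01",
  "onlyFree": false
}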

Output example

{
  "postId": 186226252,
  "title": "How to build AI product sense",
  "subtitle": "The secret is using Cursor for non-technical work",
  "slug": "how-to-build-ai-product-sense",
  "url": "https://www.lennysnewsletter.com/p/how-to-build-ai-product-sense",
  "publishedAt": "2026-02-03T13:45:58.303Z",
  "updatedAt": "2026-02-04T17:29:56.949Z",
  "postType": "newsletter",
  "audience": "everyone",
  "isPaid": false,
  "wordcount": 5867,
  "coverImage": "https://substackcdn.com/image/fetch/...",
  "tags": ["AI"],
  "reactionCount": 298,
  "commentCount": 31,
  "childCommentCount": 15,
  "restacks": 20,
  "hasVoiceover": false,
  "bodyHtml": "<div class=\"body markup\">...</div>",
  "authorName": "Tal Raviv",
  "authorHandle": "talsraviv",
  "publicationName": "Lenny's Newsletter",
  "subscriberCount": "1,100,000",
  "comments": [
    {
      "id": 209331673,
      "body": "This article creates a whole new paradigm for learning...",
      "date": "2026-02-03T15:34:25.318Z",
      "name": "Jack Cohen",
      "handle": "jackcohen10",
      "reactionCount": 9,
      "isAuthor": false,
      "replies": [
        {
          "id": 209340123,
          "body": "Thanks Jack!",
          "name": "Tal Raviv",
          "isAuthor": true,
          "replies": []
        }
      ]
    }
  ],
  "scrapedAt": "2026-02-06T02:07:09.750Z"
}
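
Once you have the JSON items, flattening them for spreadsheet use is straightforward. A small sketch (hypothetical helper; field names taken from the output example above):

```python
import csv
import io

def posts_to_csv(posts, fields=("title", "url", "publishedAt", "reactionCount")):
    """Flatten post records into CSV text, keeping only the listed fields."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(fields), extrasaction="ignore")
    writer.writeheader()
    writer.writerows(posts)
    return buf.getvalue()

csv_text = posts_to_csv([{
    "title": "How to build AI product sense",
    "url": "https://www.lennysnewsletter.com/p/how-to-build-ai-product-sense",
    "publishedAt": "2026-02-03T13:45:58.303Z",
    "reactionCount": 298,
    "bodyHtml": "<div>...</div>",  # dropped by extrasaction="ignore"
}])
```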

Tips for best results

  • Start with metadata-only (includeContent: false) to quickly survey a newsletter's archive before doing a full content scrape
  • Use date filters to scrape only recent posts instead of full archives — saves time and money
  • Comments are optional — each post with comments requires an extra API call, so only enable when needed
  • Paid posts return all metadata (title, wordcount, reactions) but bodyHtml will be empty
  • Custom domains work the same as *.substack.com URLs — just paste the full URL
  • Use maxPostsPerNewsletter: 0 for unlimited archive depth — scrapes every post ever published

Integrations

Connect Substack Scraper with your existing tools:

  • Make — Automate workflows triggered by new newsletter data
  • Zapier — Connect to 5,000+ apps
  • Google Sheets — Export directly to spreadsheets
  • Slack — Get notifications for new posts
  • GitHub — Trigger workflows on new data
  • Webhooks — Send data to any endpoint

Using the Apify API

Node.js:

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('automation-lab/substack-scraper').call({
    urls: ['https://www.lennysnewsletter.com'],
    maxPostsPerNewsletter: 50,
    includeContent: true,
    includeComments: false,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Python:

from apify_client import ApifyClient

client = ApifyClient('YOUR_API_TOKEN')

run = client.actor('automation-lab/substack-scraper').call(run_input={
    'urls': ['https://www.lennysnewsletter.com'],
    'maxPostsPerNewsletter': 50,
    'includeContent': True,
    'includeComments': False,
})

items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)

FAQ

How fast is the scraper? Very fast. 50 posts (metadata only) complete in ~3 seconds. 50 posts with full content in ~5 seconds. Full archives of 500+ posts finish in under 30 seconds. No browser or proxy overhead.

Can I scrape paid/paywalled posts? You get all metadata for paid posts (title, subtitle, wordcount, reactions, comments count) but bodyHtml will be empty since content access requires an active subscription.

Does it work with custom domains? Yes. Enter the full URL (e.g., https://www.lennysnewsletter.com) and the scraper auto-detects it as a Substack newsletter.

How many posts can I scrape? There is no limit. Set maxPostsPerNewsletter: 0 to scrape the complete archive. This is the only Substack scraper on Apify with unlimited archive depth.

Does it extract comments? Yes. Set includeComments: true to get full comment threads with nested replies, author info, and reaction counts. Each post with comments requires one extra API call.

What about rate limits? Substack's public API has no detected rate limits. The scraper adds a polite delay between requests to be respectful.
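
The "polite delay" mentioned above is a common client-side pattern; a minimal sketch (the actual interval the actor uses is not documented, so the value here is an assumption):

```python
import time

def fetch_all(urls, fetch, delay: float = 0.5):
    """Call fetch(url) for each URL, pausing between consecutive requests."""
    results = []
    for i, url in enumerate(urls):
        if i:
            time.sleep(delay)  # polite pause between requests
        results.append(fetch(url))
    return results

fetch_all(["a", "b", "c"], str.upper, delay=0)  # ['A', 'B', 'C']
```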