Substack Scraper — Posts, Authors & Newsletters

Extract Substack newsletter content. Get post titles, authors, publish dates, paywall status, subscriber counts, and full article text. Ideal for newsletter research and content monitoring. PPE pricing — pay only for results.

Pricing: $4.99/month + usage
Rating: 0.0 (0 reviews)
Developer: Web Data Labs (Maintained by Community)
Actor stats: 0 bookmarks · 17 total users · 1 monthly active user · last modified a day ago

Substack Scraper — Posts, Comments & Publication Data

Extract structured data from any Substack newsletter at scale. Scrape posts with full article text, reader comments, and publication metadata — no login required. Export to JSON, CSV, or Excel with a single click.

Why Use This Scraper?

Substack has grown into one of the most important platforms for independent journalism, thought leadership, and niche expertise. With over 35 million active subscriptions and 17,000+ paid writers, it's a goldmine for researchers, marketers, and analysts — but Substack offers no bulk export or public API.

This actor solves that. It programmatically extracts posts, comments, and publication info from any Substack newsletter, giving you clean, structured data ready for analysis.

Key Features

  • Three scrape modes: Posts, comments, and publication info
  • Search across Substack: Find posts by keyword across the entire platform
  • Publication-specific scraping: Target one or more newsletters by subdomain
  • Full article text: Optionally include the complete body text of each post
  • Flexible sorting: Sort by newest or top-performing posts
  • Scale control: Scrape from 1 to 500 items per run
  • No authentication needed: Works without any Substack account
  • Multiple export formats: JSON, CSV, Excel, XML, HTML

Use Cases

1. Content Research & Competitive Analysis

Track what topics are trending across newsletters in your industry. Monitor competitors' publishing frequency, engagement, and content strategy.

2. Media Monitoring & PR Intelligence

Set up regular scrapes to track mentions of your brand, product, or industry across Substack newsletters. Stay ahead of narratives before they hit mainstream media.

3. Academic & Market Research

Collect large datasets of expert opinion pieces, industry analysis, and commentary for qualitative research. Study how narratives form and spread through independent media.

4. Newsletter Discovery & Curation

Search for newsletters covering specific topics, then scrape their publication info to evaluate subscriber counts, posting cadence, and content quality.

5. Sentiment & Trend Analysis

Extract posts about specific topics or companies, then run NLP or sentiment analysis on the text. Detect shifts in expert opinion over time.
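To illustrate the shape of this workflow, here is a deliberately naive keyword-based sentiment scorer over scraped post text. The keyword sets are arbitrary placeholders; a real pipeline would use a proper NLP library (VADER, a transformer model, etc.) on the scraped titles, previews, or full body text.

```python
# Naive keyword-based sentiment scoring over scraped Substack post text.
# The keyword lists below are illustrative placeholders, not a real lexicon.

POSITIVE = {"growth", "optimism", "breakthrough", "trust", "win"}
NEGATIVE = {"crisis", "losing", "reckoning", "decline", "risk"}

def sentiment_score(text: str) -> int:
    """Positive minus negative keyword hits; > 0 leans positive."""
    words = {w.strip(".,!?\"'").lower() for w in text.split()}
    return len(words & POSITIVE) - len(words & NEGATIVE)

# Items shaped like the actor's Posts output (title + previewText).
posts = [
    {"title": "The AI Trust Crisis",
     "previewText": "The past month has brought a reckoning for AI companies..."},
    {"title": "Signs of Optimism",
     "previewText": "A breakthrough quarter points to real growth..."},
]

for post in posts:
    score = sentiment_score(post["title"] + " " + post["previewText"])
    print(post["title"], score)
```

Running scores like this over posts collected on a schedule is one way to chart how coverage of a topic shifts over time.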

6. Lead Generation for B2B

Find Substack authors writing about your domain and extract their publication details. These are high-value contacts who are actively engaged in your space.

7. Content Repurposing & Summarization

Pull posts from newsletters you subscribe to and feed them into LLMs for summarization, translation, or content repurposing workflows.
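Long posts usually need to be split into chunks that fit an LLM's context window before summarization. A minimal character-based chunker (the chunk size and overlap are arbitrary, and the LLM call itself is omitted):

```python
def chunk_text(text: str, max_chars: int = 4000, overlap: int = 200) -> list[str]:
    """Split text into overlapping character chunks for LLM processing."""
    if len(text) <= max_chars:
        return [text]
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break
        # Step forward, keeping `overlap` characters of shared context.
        start += max_chars - overlap
    return chunks

# With includeBodyText enabled, each item carries the full article text.
body = "word " * 2000  # stand-in for a scraped post body (~10,000 chars)
chunks = chunk_text(body)
print(len(chunks), "chunks")
```

Token-aware splitting (e.g., with a tokenizer matched to your model) is more precise, but character chunking is often good enough for summarization workflows.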

Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| publications | Array of strings | No | | Substack subdomains to scrape (e.g., platformer for platformer.substack.com) |
| searchQuery | String | No | | Search keyword to find posts across all of Substack |
| scrapeType | String | No | posts | What to scrape: posts, comments, or info |
| maxItems | Integer | No | 50 | Maximum items to return (1–500) |
| sortBy | String | No | new | Sort order: new (newest first) or top (most popular) |
| includeBodyText | Boolean | No | false | Include the full body text of each post |

Tip: Use publications to target specific newsletters, or searchQuery to search across the entire platform. You can combine both.
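For instance, a run input that combines both targeting options might look like this (all values are illustrative):

```python
# Illustrative run input combining publication targeting with a keyword search.
run_input = {
    "publications": ["platformer"],   # restrict the search to these newsletters
    "searchQuery": "AI regulation",   # keyword to match within those publications
    "scrapeType": "posts",            # one of: "posts", "comments", "info"
    "maxItems": 25,                   # 1-500; defaults to 50 when omitted
    "sortBy": "top",                  # "new" (default) or "top"
    "includeBodyText": False,         # defaults to False
}
print(run_input)
```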

Sample Output

Posts Output

{
  "title": "The AI Trust Crisis",
  "subtitle": "Why users are losing faith in AI-generated content",
  "slug": "the-ai-trust-crisis",
  "publishedAt": "2026-03-01T10:30:00.000Z",
  "canonicalUrl": "https://platformer.substack.com/p/the-ai-trust-crisis",
  "author": "Casey Newton",
  "publicationName": "Platformer",
  "publicationSubdomain": "platformer",
  "likes": 847,
  "comments": 132,
  "wordCount": 2450,
  "isPaywalled": false,
  "previewText": "The past month has brought a reckoning for AI companies...",
  "coverImage": "https://substackcdn.com/image/fetch/...",
  "tags": ["AI", "trust", "technology"]
}
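Items in this shape are easy to post-process once the dataset is downloaded. For instance, ranking the free (non-paywalled) posts by a simple engagement score (the sample items below are illustrative):

```python
# Rank non-paywalled posts by a simple engagement score (likes + comments).
posts = [
    {"title": "The AI Trust Crisis", "likes": 847, "comments": 132, "isPaywalled": False},
    {"title": "Subscriber Q&A", "likes": 300, "comments": 45, "isPaywalled": True},
    {"title": "Weekly Roundup", "likes": 120, "comments": 10, "isPaywalled": False},
]

free_posts = [p for p in posts if not p["isPaywalled"]]
free_posts.sort(key=lambda p: p["likes"] + p["comments"], reverse=True)

for p in free_posts:
    print(p["title"], p["likes"] + p["comments"])
```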

Comments Output

{
  "body": "This is exactly what I've been seeing in my industry...",
  "author": "John Reader",
  "date": "2026-03-01T14:22:00.000Z",
  "likes": 23,
  "postTitle": "The AI Trust Crisis",
  "publicationSubdomain": "platformer"
}

Publication Info Output

{
  "name": "Platformer",
  "subdomain": "platformer",
  "description": "Tech and democracy coverage",
  "authorName": "Casey Newton",
  "heroImage": "https://substackcdn.com/image/fetch/...",
  "logoUrl": "https://substackcdn.com/image/fetch/...",
  "themeColor": "#FF6719",
  "subscriberCount": 250000,
  "postCount": 1200
}

Integration Examples

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run_input = {
    "publications": ["platformer", "thebrowser"],
    "scrapeType": "posts",
    "maxItems": 50,
    "sortBy": "new",
    "includeBodyText": True,
}

run = client.actor("cryptosignals/substack-scraper").call(run_input=run_input)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['title']}: {item.get('likes', 0)} likes")

Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const input = {
    publications: ["platformer", "thebrowser"],
    scrapeType: "posts",
    maxItems: 50,
    sortBy: "new",
    includeBodyText: true,
};

const run = await client.actor("cryptosignals/substack-scraper").call(input);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.log(`${item.title}: ${item.likes || 0} likes`);
});

Using the Apify API Directly

curl -X POST "https://api.apify.com/v2/acts/cryptosignals~substack-scraper/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "publications": ["platformer"],
    "scrapeType": "posts",
    "maxItems": 20
  }'

Pricing & Costs

This actor runs on the Apify platform using your account's compute units (CUs).

| Scenario | Estimated Cost |
|---|---|
| 50 posts from one publication | ~$0.01–$0.02 |
| 200 posts from multiple publications | ~$0.05–$0.10 |
| 500 posts with full body text | ~$0.10–$0.25 |

Costs depend on the number of items, whether body text is included (larger payloads), and the Apify plan you're on. Free plan users get $5/month in platform credits — enough for hundreds of scrapes.

Tips for Best Results

  1. Start small: Set maxItems to 5–10 for your first run to verify the output format meets your needs.
  2. Use publication subdomains: For platformer.substack.com, enter just platformer in the publications list.
  3. Enable body text selectively: Full article text significantly increases output size. Only enable it when you need the content for analysis.
  4. Combine with Apify integrations: Send results directly to Google Sheets, Slack, Zapier, Make, or webhooks for automated workflows.
  5. Schedule regular runs: Set up recurring scrapes to build longitudinal datasets or monitor newsletters over time.
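When building longitudinal datasets from recurring runs (tip 5), the same post will appear in the output of multiple runs. Deduplicating on the slug field keeps one record per post; a minimal sketch over already-downloaded run results:

```python
# Merge items from several runs, keeping the first occurrence of each slug.
def dedupe_by_slug(runs: list[list[dict]]) -> list[dict]:
    seen, merged = set(), []
    for run_items in runs:
        for item in run_items:
            if item["slug"] not in seen:
                seen.add(item["slug"])
                merged.append(item)
    return merged

# Illustrative results from two scheduled runs on consecutive days.
monday = [{"slug": "the-ai-trust-crisis"}, {"slug": "weekly-roundup"}]
tuesday = [{"slug": "weekly-roundup"}, {"slug": "a-new-post"}]
print(len(dedupe_by_slug([monday, tuesday])), "unique posts")
```

Keeping the first occurrence preserves the earliest-seen metrics; keep the last instead if you want the most recent like and comment counts.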

Frequently Asked Questions

Can I scrape paywalled/subscriber-only posts?

The scraper extracts publicly available data. For paywalled posts, you'll get the title, preview text, metadata, and publication info, but not the full subscriber-only content.

How do I find a publication's subdomain?

Look at the newsletter URL. For https://platformer.substack.com, the subdomain is platformer. For custom domains, check the Substack about page.

Can I scrape custom domain Substack newsletters?

Yes. Use the publication's original Substack subdomain (before they switched to a custom domain). You can usually find it referenced on their about page or through a web search.

How often is the data updated?

Every run fetches live data directly from Substack. You always get the latest posts, comments, and metrics.

Is there a rate limit?

The scraper handles rate limiting automatically with built-in delays and retries. You don't need to configure anything.

Can I search for posts about a specific topic?

Yes! Use the searchQuery parameter to search across all of Substack, or combine it with publications to search within specific newsletters.

What export formats are available?

Apify supports JSON, CSV, Excel (XLSX), XML, HTML, and RSS. You can download in any format from the dataset tab after a run completes.

How do I integrate this with my existing workflow?

Use Apify's built-in integrations (Zapier, Make, Google Sheets, webhooks) or call the API directly from any programming language. See the code examples above.

Can I run this on a schedule?

Yes. Apify supports cron-like scheduling. Set up daily, weekly, or custom schedules from the actor's Schedules tab. Each run stores results in a new dataset.

What happens if a publication doesn't exist?

The scraper will log a warning for invalid subdomains and continue processing the remaining publications. Your run won't fail because of one bad input.