Pricing

from $0.35 / 1,000 posts

Substack Scraper

Extract complete data from Substack newsletters including posts, authors, engagement metrics, and article text. 13 fields per post. Fast and reliable.

Rating: 2.6 (2)

Developer: LIAICHI MUSTAPHA

Issues response: 4.9 hours

Last modified: a day ago
Features

13 data fields per post — headline, subheading, author, date, likes, comments, restacks, article text, and more
Full article text extraction (preview text for paywalled posts)
Engagement metrics — likes, comments, and restacks per post
Two scraping methods — sitemap (fast, recommended) and archive page (fallback)
Batch processing — scrape dozens of newsletters in a single run
Dynamic memory — scales automatically, no manual configuration needed

Use Cases

AI training data — Build large text datasets from thousands of Substack articles
Competitive analysis — Track what newsletters in your niche publish and what resonates
Content research — Identify trending topics and high-engagement post formats
Newsletter audits — Analyze posting frequency, author mix, and free/paid ratio
Market research — Monitor thought leaders and industry publications at scale

Input

Field	Type	Required	Default	Description
`substackUrls`	Array	Yes	—	Substack newsletter URLs (e.g. `https://example.substack.com`)
`scrapingMethod`	String	No	`sitemap`	`"sitemap"` (faster) or `"archive"` (fallback)
`maxPostsPerSubstack`	Integer	No	`0`	Posts per newsletter — `0` means unlimited
`batchSize`	Integer	No	`20`	Newsletters processed per batch

Example input:

{
  "substackUrls": [
    "https://tedhope.substack.com",
    "https://example.substack.com"
  ],
  "scrapingMethod": "sitemap",
  "maxPostsPerSubstack": 100,
  "batchSize": 20
}
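When the input is assembled in code, it can help to check it before starting a run. The following is a minimal sketch, assuming the field names and defaults from the table above; the helper name `build_run_input` and the validation rules are illustrative assumptions, not the actor's own checks.

```python
from urllib.parse import urlparse

ALLOWED_METHODS = {"sitemap", "archive"}


def build_run_input(urls, method="sitemap", max_posts=0, batch_size=20):
    """Return a run-input dict using the documented field names.

    Hypothetical helper: the checks below are assumptions about sensible
    input, not rules confirmed by the actor.
    """
    if not urls:
        raise ValueError("substackUrls is required and must not be empty")
    for url in urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not an absolute http(s) URL: {url!r}")
    if method not in ALLOWED_METHODS:
        raise ValueError(f"scrapingMethod must be one of {sorted(ALLOWED_METHODS)}")
    if max_posts < 0 or batch_size < 1:
        raise ValueError("maxPostsPerSubstack must be >= 0 and batchSize >= 1")
    return {
        "substackUrls": list(urls),
        "scrapingMethod": method,
        "maxPostsPerSubstack": max_posts,  # 0 means unlimited
        "batchSize": batch_size,
    }


run_input = build_run_input(["https://tedhope.substack.com"], max_posts=100)
print(run_input)
```

The resulting dict can be passed as `run_input` to the Python client call shown under "Via Apify API (Python)".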

Output

Each item in the dataset represents one Substack post:

{
  "substack_url": "https://tedhope.substack.com",
  "post_url": "https://tedhope.substack.com/p/the-regeneration-will-be-live",
  "headline": "The Regeneration Will Be Live",
  "subheading": "Why in-person experiences are making a comeback",
  "author_name": "Ted Hope",
  "author_url": "https://substack.com/@tedhope",
  "date": "December 10, 2024",
  "free_or_paid": "Free",
  "likes": 156,
  "comments": 23,
  "restacks": 12,
  "article_text": "Full article content here...",
  "content_type": "full"
}

content_type values: "full" (complete text), "preview_only" (paywalled), "failed" (extraction error).
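A quick way to judge a run is to count how many items fall into each `content_type`. The sketch below assumes items are plain dicts shaped like the example above; the sample data and the helper name `summarize_content_types` are made up for illustration.

```python
from collections import Counter


def summarize_content_types(items):
    """Count dataset items per content_type ("unknown" if the field is missing)."""
    return Counter(item.get("content_type", "unknown") for item in items)


# Illustrative items only; real items carry all 13 fields.
sample_items = [
    {"headline": "Post A", "content_type": "full"},
    {"headline": "Post B", "content_type": "full"},
    {"headline": "Post C", "content_type": "preview_only"},
    {"headline": "Post D", "content_type": "failed"},
]

counts = summarize_content_types(sample_items)
failed_share = counts["failed"] / len(sample_items)
print(dict(counts), f"failed: {failed_share:.0%}")
```

A high share of `"failed"` items may be a reason to retry with the other scraping method.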

How to Use

Via Apify Console

1. Open the actor on Apify Store
2. Click Try for free
3. Paste your Substack URLs into the Substack URLs field
4. Optionally set maxPostsPerSubstack to limit results per newsletter
5. Click Start and wait for the run to complete
6. Download your results as JSON, CSV, or Excel

Via Apify API (Python)

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("USERNAME/substack-scraper").call(run_input={
    "substackUrls": ["https://tedhope.substack.com"],
    "scrapingMethod": "sitemap",
    "maxPostsPerSubstack": 100
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["headline"], item["likes"])
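Items fetched through the API can also be written to CSV locally instead of downloaded from the Console. This is a rough sketch using only the standard library; the column selection and the helper name `items_to_csv` are assumptions, and the sample item reuses values from the output example above.

```python
import csv
import io

# Columns to export; any of the documented output fields could be listed here.
FIELDS = ["headline", "author_name", "date", "likes", "comments", "restacks"]


def items_to_csv(items, fields=FIELDS):
    """Serialize items to CSV text, ignoring fields that are not listed."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fields, extrasaction="ignore", restval=""
    )
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


sample_items = [
    {
        "headline": "The Regeneration Will Be Live",
        "author_name": "Ted Hope",
        "date": "December 10, 2024",
        "likes": 156,
        "comments": 23,
        "restacks": 12,
        "content_type": "full",  # not in FIELDS, so it is skipped
    }
]

csv_text = items_to_csv(sample_items)
print(csv_text)
```

In a real script, `sample_items` would be replaced by the items returned from `iterate_items()`.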

Via Apify API (JavaScript)

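import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('USERNAME/substack-scraper').call({
  substackUrls: ['https://tedhope.substack.com'],
  scrapingMethod: 'sitemap',
  maxPostsPerSubstack: 100,
});

// Read the scraped posts from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
  console.log(item.headline, item.likes);
}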

Pricing

This actor is billed by compute units — you pay only for what you use.

Scale	Approximate Cost	Estimated Time
1 newsletter, 100 posts	< $0.01	~30 seconds
10 newsletters, 100 posts each	$0.01–$0.05	2–5 minutes
100 newsletters	$0.50–$1.00	30–60 minutes
1,000 newsletters	$5–$10	5–10 hours

New to Apify? Every account includes $5 in free monthly credits — enough to scrape thousands of posts at no cost.
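As a back-of-the-envelope check on the free-credit claim, the sketch below divides a budget by the headline "from $0.35 / 1,000 posts" rate. It assumes that rate applies linearly, which is a simplification: actual cost depends on compute usage and may differ.

```python
from decimal import Decimal

RATE_PER_1000_POSTS = Decimal("0.35")  # headline "from" rate on this page
FREE_MONTHLY_CREDIT = Decimal("5")     # free monthly credit mentioned above


def posts_within_budget(budget_usd, rate_per_1000=RATE_PER_1000_POSTS):
    """Estimate how many posts a budget covers, rounded down.

    Assumes a flat per-post rate; this is an illustration, not a quote.
    """
    return int(budget_usd / rate_per_1000 * 1000)


print(posts_within_budget(FREE_MONTHLY_CREDIT))
```

Under this assumption the free credit covers roughly 14,000 posts, in line with "thousands of posts".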

FAQ

Can I scrape paywalled Substack posts? You'll receive preview text for paid posts, not the full article. Full content is extracted for free posts only. Login-based access is not currently supported.

How many newsletters can I scrape in one run? There is no hard limit. For large-scale runs (100+ newsletters), lower batchSize to 5–10 for more stable results.
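To illustrate what a batch size usually means, the sketch below splits a URL list into fixed-size groups. How the actor batches internally is not documented here, so this is an assumption about the general technique; the URLs are hypothetical placeholders.

```python
def chunked(urls, batch_size):
    """Yield consecutive slices of urls with at most batch_size entries each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(urls), batch_size):
        yield urls[start:start + batch_size]


# Hypothetical newsletter URLs, for illustration only.
urls = [f"https://example-{i}.substack.com" for i in range(1, 24)]

batches = list(chunked(urls, 10))
print([len(batch) for batch in batches])
```

With 23 URLs and a batch size of 10, the list is processed as groups of 10, 10, and 3, so a smaller batch size means fewer newsletters in flight at once.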

What is the difference between sitemap and archive scraping methods? Sitemap is faster and more reliable — it discovers all posts directly from the newsletter's XML sitemap. Archive page is a fallback for newsletters that don't publish a sitemap.
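Sitemap-based discovery can be sketched as follows. This is not the actor's implementation: it assumes standard sitemaps.org XML, treats any URL containing `/p/` as a post (based on the `post_url` example above), and takes a caller-supplied `fetch` function so the example runs without network access. The sample XML is invented.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_locs(xml_text):
    """Return (kind, urls), where kind is "index" or "urlset"."""
    root = ET.fromstring(xml_text)
    kind = "index" if root.tag.endswith("sitemapindex") else "urlset"
    path = "sm:sitemap/sm:loc" if kind == "index" else "sm:url/sm:loc"
    return kind, [loc.text.strip() for loc in root.findall(path, NS)]


def collect_post_urls(xml_text, fetch):
    """Follow nested sitemap indexes and return URLs that look like posts."""
    kind, locs = extract_locs(xml_text)
    if kind == "index":
        urls = []
        for child_url in locs:
            urls.extend(collect_post_urls(fetch(child_url), fetch))
        return urls
    return [url for url in locs if "/p/" in url]


XMLNS = 'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"'
INDEX_XML = (
    f"<sitemapindex {XMLNS}><sitemap>"
    "<loc>https://example.substack.com/sitemap/2024</loc>"
    "</sitemap></sitemapindex>"
)
CHILD_XML = (
    f"<urlset {XMLNS}>"
    "<url><loc>https://example.substack.com/p/first-post</loc></url>"
    "<url><loc>https://example.substack.com/about</loc></url>"
    "</urlset>"
)

# In-memory stand-in for an HTTP fetch, keyed by sitemap URL.
documents = {"https://example.substack.com/sitemap/2024": CHILD_XML}
post_urls = collect_post_urls(INDEX_XML, documents.__getitem__)
print(post_urls)
```

The recursion handles a sitemap index that points at further sitemaps, the nested case mentioned in the changelog.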

How do I get only the most recent posts? Set maxPostsPerSubstack to a small number (e.g. 10). Posts are returned newest-first, so you'll always get the latest content.
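After merging results from several newsletters, the combined list is no longer in date order. The sketch below re-sorts items locally; it assumes the `date` field always follows the "December 10, 2024" format from the output example and that month names parse under the default English locale. The sample items are invented.

```python
from datetime import datetime

DATE_FORMAT = "%B %d, %Y"  # e.g. "December 10, 2024"


def latest_posts(items, n):
    """Return the n most recent items, newest first."""
    def published(item):
        return datetime.strptime(item["date"], DATE_FORMAT)

    return sorted(items, key=published, reverse=True)[:n]


# Illustrative items from two hypothetical newsletters.
merged = [
    {"headline": "Older post", "date": "November 2, 2024"},
    {"headline": "Newest post", "date": "December 10, 2024"},
    {"headline": "Middle post", "date": "December 1, 2024"},
]

recent = latest_posts(merged, 2)
print([item["headline"] for item in recent])
```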

How do I use the scraped data for AI training? The article_text field contains the full article body. Export the dataset as JSON or CSV, then load it into your training pipeline. Filter by content_type: "full" to exclude previews.
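One common way to prepare such data is JSON Lines, one record per article. The sketch below keeps only items with `content_type` equal to `"full"`; the record keys (`source`, `title`, `text`) and the helper name `to_jsonl` are arbitrary choices for illustration, and the sample items are invented.

```python
import json


def to_jsonl(items):
    """Return JSON Lines text containing only fully extracted articles."""
    lines = []
    for item in items:
        if item.get("content_type") != "full":
            continue  # skip paywall previews and failed extractions
        record = {
            "source": item["post_url"],
            "title": item["headline"],
            "text": item["article_text"],
        }
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)


sample_items = [
    {
        "post_url": "https://example.substack.com/p/first-post",
        "headline": "First post",
        "article_text": "Full article content here...",
        "content_type": "full",
    },
    {
        "post_url": "https://example.substack.com/p/paid-post",
        "headline": "Paid post",
        "article_text": "Preview only...",
        "content_type": "preview_only",
    },
]

jsonl_text = to_jsonl(sample_items)
print(jsonl_text)
```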

Is scraping Substack legal? This actor accesses only publicly available content. Always respect Substack's robots.txt, use reasonable rate limits, and comply with applicable terms of service and data regulations.
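For your own follow-up requests to scraped URLs, Python's standard `urllib.robotparser` can check a path against robots.txt rules. The sketch below parses rules from a string so it runs offline; the sample rules are hypothetical and are not Substack's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules for illustration only.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /api/
"""

post_ok = is_allowed(SAMPLE_ROBOTS, "https://example.substack.com/p/first-post")
api_ok = is_allowed(SAMPLE_ROBOTS, "https://example.substack.com/api/v1/posts")
print(post_ok, api_ok)
```

In practice the rules would be loaded from the site's live robots.txt, for example with the parser's `set_url()` and `read()` methods.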

Changelog

v1.0.4 — April 2, 2026

Fixed: Incomplete scraping results caused by mishandled nested sitemap indexes
Fixed: Memory misconfiguration (default was incorrectly set to 16 GB)
Improved: Cost efficiency — single-newsletter runs now ~94% cheaper

v1.0.0 — Initial Release

Full post scraping via sitemap and archive methods
13 data fields per post including engagement metrics
Batch processing support
Paid/free post detection

Built by Mustapha Liaichi — Automation & Web Scraping Specialist
