Substack Newsletter Scraper
Pricing: from $3.00 / 1,000 posts
Developer: Boundary
Scrape posts from any Substack newsletter — titles, full content, reactions, comments, tags, author data, and more. Works with custom domains. No login or API key needed.
Features
- Extracts full post data: title, subtitle, full HTML content, description, word count, cover image, tags
- Scrapes engagement metrics: reactions, comments (with text and reactions), restacks
- Includes author info: name, handle, bio, profile image, Twitter handle
- Includes newsletter info: name, subdomain, custom domain, logo, description
- Captures podcast and audio: podcast URLs, TTS audio URLs when available
- Accepts flexible input: Substack URLs or custom domains — just paste from your browser
- Handles pagination automatically — scrape hundreds or thousands of posts
- No authentication required — uses Substack's public API
- Supports multiple newsletters in a single run
- Lightweight — uses HTTP requests only, no browser needed
How much does it cost?
With Substack Newsletter Scraper, you can scrape 1,000 posts for $2.00 in Apify platform credits ($0.002 per post).
Running on the free plan? You can scrape several hundred posts per month at no cost.
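The per-post arithmetic can be sketched as a small helper. This is only an illustration that assumes the $0.002 per-post rate quoted above applies to every post; the function name `estimate_cost` is made up for this example, and actual billing depends on your Apify plan.

```python
from decimal import Decimal

# Per-post rate quoted in this section (assumption: flat rate, no minimum fee).
PRICE_PER_POST = Decimal("0.002")


def estimate_cost(post_count: int) -> Decimal:
    """Return the estimated cost in USD for scraping `post_count` posts."""
    if post_count < 0:
        raise ValueError("post_count must not be negative")
    return PRICE_PER_POST * post_count


print(estimate_cost(1000))  # 2.000
print(estimate_cost(250))   # 0.500
```

`Decimal` is used instead of `float` so that small per-post amounts add up without rounding noise.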
Input
| Field | Description | Default |
|---|---|---|
| Newsletter URLs | Paste newsletter URLs from your browser (e.g. https://www.lennysnewsletter.com). Works with both Substack subdomains (name.substack.com) and custom domains. | Required |
| Post Limit | Maximum posts per newsletter (0 = unlimited) | 100 |
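| Not Older Than | Stop scraping when posts are older than this date (YYYY-MM-DD) | No limit |
| Include Paid Posts | Also attempt to fetch paywalled posts; only teaser content is available for them | Off |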
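If you start runs programmatically, the input fields above can be assembled and validated before submission. The sketch below is a hedged illustration: the JSON keys `newsletterUrls`, `postLimit`, and `notOlderThan` are hypothetical stand-ins for the form labels, so check the actor's input schema for the real names.

```python
from datetime import datetime
from typing import Optional, Sequence


def build_run_input(
    urls: Sequence[str],
    post_limit: int = 100,
    not_older_than: Optional[str] = None,
) -> dict:
    """Build an input object mirroring the table above (key names are assumed)."""
    if not urls:
        raise ValueError("at least one newsletter URL is required")
    if post_limit < 0:
        raise ValueError("post_limit must be 0 (unlimited) or a positive number")

    run_input = {
        "newsletterUrls": list(urls),  # hypothetical key for "Newsletter URLs"
        "postLimit": post_limit,       # hypothetical key for "Post Limit"
    }
    if not_older_than is not None:
        # Raises ValueError for anything that is not a valid YYYY-MM-DD date,
        # then stores the zero-padded form.
        parsed = datetime.strptime(not_older_than, "%Y-%m-%d").date()
        run_input["notOlderThan"] = parsed.isoformat()  # hypothetical key
    return run_input


print(build_run_input(["https://www.lennysnewsletter.com"], not_older_than="2026-01-01"))
```

Validating the date locally catches malformed cutoffs before a run is started and billed.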
Output example
Each post is saved as a JSON object:
```json
{
  "postId": 190767098,
  "title": "BOOM: Senate Votes to Block Private Equity from Buying Homes",
  "subtitle": "For 15 years, elites have fought against homeownership...",
  "slug": "boom-senate-votes-to-block-private",
  "url": "https://www.thebignewsletter.com/p/boom-senate-votes-to-block-private",
  "date": "2026-03-13T22:39:57.833Z",
  "updatedAt": "2026-03-13T23:39:58.809Z",
  "type": "newsletter",
  "audience": "everyone",
  "description": "For 15 years, elites have fought against homeownership...",
  "contentHtml": "<p>Until the Iran war, the main political pressure on...</p>",
  "contentText": "Until the Iran war, the main political pressure on...",
  "wordCount": 2040,
  "coverImageUrl": "https://substackcdn.com/image/fetch/...",
  "sectionName": null,
  "reactionCount": 320,
  "reactions": { "❤": 320 },
  "commentCount": 8,
  "restackCount": 42,
  "tags": ["Private equity", "housing", "Wall Street"],
  "authorName": "Matt Stoller",
  "authorHandle": "mattstoller",
  "authorBio": "Matt Stoller is Research Director for the American Economic...",
  "authorImageUrl": "https://bucketeer-e05bbc84-baa3-437e.../915fa1b4-....png",
  "authorTwitter": null,
  "newsletterName": "BIG by Matt Stoller",
  "newsletterSubdomain": "mattstoller",
  "newsletterDomain": "www.thebignewsletter.com",
  "newsletterLogoUrl": "https://bucketeer-e05bbc84-baa3-437e.../c12cbcf7-....png",
  "newsletterDescription": "The history and politics of monopoly power.",
  "podcastUrl": null,
  "podcastDuration": null,
  "ttsAudioUrl": "https://substack-video.s3.amazonaws.com/video_upload/post/...",
  "comments": [
    {
      "name": "marku52",
      "body": "I don't understand how they intend to keep Wall Street from...",
      "date": "2026-03-14T00:36:45.063Z",
      "reactions": { "❤": 20 }
    }
  ],
  "scrapedAt": "2026-03-14T11:40:28.777Z"
}
```
All output fields
| Field | Type | Description |
|---|---|---|
| postId | number | Substack post ID |
| title | string | Post title |
| subtitle | string | Post subtitle |
| slug | string | URL slug |
| url | string | Full post URL |
| date | string | Publish date (ISO 8601) |
| updatedAt | string | Last update date (ISO 8601) |
| type | string | Post type (newsletter or podcast) |
| audience | string | everyone = free, only_paid = paywalled, only_free = free-only |
| description | string | Short description / preview text |
| contentHtml | string | Full HTML content for free posts. Teaser (first few paragraphs) for paid posts. |
| contentText | string | Truncated plain text (~200 chars) |
| wordCount | number | Word count |
| coverImageUrl | string | Cover image URL |
| sectionName | string | Newsletter section name |
| reactionCount | number | Total reaction count |
| reactions | object | Reactions by emoji (e.g. {"❤": 320}) |
| commentCount | number | Number of comments |
| restackCount | number | Number of restacks |
| tags | array | Post tags |
| authorName | string | Author display name |
| authorHandle | string | Author handle |
| authorBio | string | Author bio |
| authorImageUrl | string | Author profile image URL |
| authorTwitter | string | Author Twitter/X handle |
| newsletterName | string | Newsletter name |
| newsletterSubdomain | string | Substack subdomain |
| newsletterDomain | string | Custom domain (or Substack subdomain) |
| newsletterLogoUrl | string | Newsletter logo URL |
| newsletterDescription | string | Newsletter tagline |
| podcastUrl | string | Podcast audio URL |
| podcastDuration | number | Podcast duration in seconds |
| ttsAudioUrl | string | AI-generated text-to-speech audio URL |
| comments | array | Comments with name, body, date, and reactions |
| scrapedAt | string | When the data was scraped (ISO 8601) |
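As an illustration of working with these fields, the sketch below ranks posts by a simple engagement score. It assumes the dataset has already been loaded as a list of dictionaries shaped like the output example; the helper names and the scoring formula (reactions + comments + restacks) are choices made for this example, not something the scraper provides.

```python
from datetime import datetime
from typing import Iterable, List


def parse_timestamp(value: str) -> datetime:
    """Parse the ISO 8601 timestamps used in `date`, `updatedAt`, and `scrapedAt`."""
    # A trailing "Z" is only accepted by fromisoformat() on Python 3.11+,
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def engagement(post: dict) -> int:
    """Sum the three engagement counters, treating missing or null values as 0."""
    keys = ("reactionCount", "commentCount", "restackCount")
    return sum(post.get(key) or 0 for key in keys)


def top_posts(posts: Iterable[dict], n: int = 3) -> List[dict]:
    """Return the `n` posts with the highest engagement score."""
    return sorted(posts, key=engagement, reverse=True)[:n]


sample = {
    "title": "BOOM: Senate Votes to Block Private Equity from Buying Homes",
    "date": "2026-03-13T22:39:57.833Z",
    "reactionCount": 320,
    "commentCount": 8,
    "restackCount": 42,
}
print(engagement(sample))                    # 370
print(parse_timestamp(sample["date"]).year)  # 2026
```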
About paid content
By default, this scraper only collects free posts (audience: "everyone"). Paid/paywalled posts are skipped.
If you turn on "Include Paid Posts", the scraper will attempt to fetch paywalled posts too. However, Substack's public API has limitations for paid content:
- Content is partial — you'll get a teaser (first few paragraphs), not the full article. This is a Substack limitation, not ours.
- Some newsletters block paid posts entirely — the API returns HTTP 403 for paid posts on certain newsletters. These posts are skipped automatically.
- Comments may be empty — if comment permissions are set to paid subscribers only, the API returns no comments.
- No TTS audio — text-to-speech audio is not available for paid posts.
You can always check the audience field in the output to see whether a post is free (everyone) or paywalled (only_paid).
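A minimal sketch of that check, assuming posts are loaded as dictionaries with the `audience` field described above. The helper names are hypothetical; `only_free` posts are grouped with free posts here, which is a judgment call for this example.

```python
from typing import Iterable, List, Tuple


def is_paywalled(post: dict) -> bool:
    """True when the post is marked as paid-only in the `audience` field."""
    return post.get("audience") == "only_paid"


def split_by_access(posts: Iterable[dict]) -> Tuple[List[dict], List[dict]]:
    """Split posts into (free, paywalled) lists, preserving input order."""
    free: List[dict] = []
    paywalled: List[dict] = []
    for post in posts:
        (paywalled if is_paywalled(post) else free).append(post)
    return free, paywalled


posts = [
    {"postId": 1, "audience": "everyone"},
    {"postId": 2, "audience": "only_paid"},
    {"postId": 3, "audience": "only_free"},
]
free, paywalled = split_by_access(posts)
print([p["postId"] for p in free])       # [1, 3]
print([p["postId"] for p in paywalled])  # [2]
```

Keeping paywalled posts in a separate list makes it harder to mistake a teaser in `contentHtml` for a full article in later analysis.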
Tips
- Custom domains work the same: `www.lennysnewsletter.com` and `lenny.substack.com` both work. Just paste the URL from your browser.
- Use "Not Older Than" to save costs: for recurring scrapes, set a date cutoff so you only fetch new posts.
- Post limit is per newsletter: if you scrape 3 newsletters with a limit of 100, you'll get up to 300 posts total.
- TTS audio: many free Substack posts have AI-generated audio versions. The `ttsAudioUrl` field captures these when available.
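For recurring scrapes, one way to pick the next "Not Older Than" value is to reuse the publish date of the newest post you already have. The sketch below is an assumption-laden illustration: whether the cutoff date itself is included in a run is not documented here, so the example also de-duplicates by `postId` when merging. The function names are hypothetical.

```python
from datetime import datetime
from typing import Iterable, List, Optional


def next_cutoff(posts: Iterable[dict], fallback: Optional[str] = None) -> Optional[str]:
    """Return the newest publish date as YYYY-MM-DD, or `fallback` if there is none."""
    dates = [
        datetime.fromisoformat(post["date"].replace("Z", "+00:00"))
        for post in posts
        if post.get("date")
    ]
    if not dates:
        return fallback
    return max(dates).date().isoformat()


def merge_new(existing: List[dict], fresh: Iterable[dict]) -> List[dict]:
    """Append posts from `fresh` whose `postId` is not already in `existing`."""
    seen = {post["postId"] for post in existing}
    merged = list(existing)
    for post in fresh:
        if post["postId"] not in seen:
            seen.add(post["postId"])
            merged.append(post)
    return merged


stored = [{"postId": 190767098, "date": "2026-03-13T22:39:57.833Z"}]
print(next_cutoff(stored))  # 2026-03-13
```

Merging by `postId` means a post published on the cutoff day is stored once even if a later run returns it again.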