Substack Newsletter Scraper - Articles, Metadata & Full Content
Pricing
from $1.00 / 1,000 results
Extract articles, metadata, and content from any Substack newsletter via public API. No proxy needed. Supports multiple newsletters, full article body extraction, audience filtering (free/paid), date range, keyword search, and pagination. Works with both substack.com subdomains and custom domains.
Developer

Moris Chao
Substack Newsletter Scraper
Scrape articles from any Substack newsletter using the public Substack API. Supports both subdomain.substack.com URLs and custom domains (e.g., www.lennysnewsletter.com).
Features
- Scrape article metadata (title, author, date, reactions, comments, etc.)
- Optionally fetch full article HTML body
- Filter by audience (free/paid), content type, date range, or keyword
- Scrape multiple newsletters in a single run (comma-separated URLs)
- Concurrent batch fetching for full article bodies
- Polite rate limiting (500ms delay between requests)
Input
| Field | Type | Default | Description |
|---|---|---|---|
| newsletterUrl | string | required | Newsletter URL(s). Comma-separated for multiple. |
| maxItems | integer | 50 | Max articles per newsletter. 0 = unlimited. |
| sortBy | string | "new" | "new" (newest first) or "top" (most popular). |
| includeBody | boolean | false | Fetch full HTML body for each article. |
| audienceFilter | string | "all" | "all", "free", or "paid". |
| typeFilter | string | "all" | "all", "newsletter", "podcast", or "thread". |
| dateFrom | string | (none) | Only articles on/after this date (YYYY-MM-DD). |
| dateTo | string | (none) | Only articles on/before this date (YYYY-MM-DD). |
| searchKeyword | string | (none) | Filter by keyword in title or description. |
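To make the filter semantics concrete, here is a minimal Python sketch of how `audienceFilter`, `dateFrom`, `dateTo`, and `searchKeyword` could be applied to one article record. It is an illustration, not the Actor's actual code: the function name `matches_filters`, the mapping of "free"/"paid" onto the `audience` output values, the case-insensitive keyword match, and the use of the UTC calendar date from `postDate` are all assumptions.

```python
from datetime import date

# Assumed mapping from the audienceFilter input to the "audience" output value.
AUDIENCE_VALUES = {"free": "everyone", "paid": "only_paid"}


def matches_filters(article, audience_filter="all", date_from=None,
                    date_to=None, search_keyword=None):
    """Return True if an article dict passes every active filter."""
    if audience_filter != "all":
        if article.get("audience") != AUDIENCE_VALUES[audience_filter]:
            return False

    # Compare on the calendar date taken from the ISO 8601 timestamp.
    published = date.fromisoformat(article["postDate"][:10])
    if date_from and published < date.fromisoformat(date_from):
        return False
    if date_to and published > date.fromisoformat(date_to):
        return False

    if search_keyword:
        # Assumption: the keyword match is case-insensitive.
        haystack = " ".join(
            [article.get("title") or "", article.get("description") or ""]
        ).lower()
        if search_keyword.lower() not in haystack:
            return False
    return True


article = {
    "title": "How to build a great product",
    "description": "A deep dive into product excellence...",
    "audience": "everyone",
    "postDate": "2026-03-03T13:45:17.054Z",
}
print(matches_filters(article, audience_filter="free", date_from="2026-03-01"))
print(matches_filters(article, search_keyword="pricing"))
```

Both date bounds are inclusive here, matching the "on/after" and "on/before" wording in the table.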
Example Input
{
  "newsletterUrl": "https://www.lennysnewsletter.com",
  "maxItems": 10,
  "sortBy": "new",
  "includeBody": false
}
Multiple Newsletters
{
  "newsletterUrl": "https://www.lennysnewsletter.com, https://stratechery.com",
  "maxItems": 20
}
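A comma-separated `newsletterUrl` value has to be split into individual newsletters before anything is fetched. The Python sketch below shows one way that could be done. The helper name `split_newsletter_urls` is hypothetical, and trimming whitespace, defaulting to HTTPS, lowercasing the host, and dropping duplicates are assumptions rather than documented Actor behavior.

```python
from urllib.parse import urlsplit


def split_newsletter_urls(raw):
    """Split a comma-separated newsletterUrl value into clean base URLs."""
    urls = []
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate trailing or doubled commas
        if "://" not in part:
            part = "https://" + part  # assume HTTPS when no scheme is given
        host = urlsplit(part).netloc.lower()
        base = "https://" + host  # keep only the host; paths are discarded
        if base not in urls:  # drop duplicates, keep first-seen order
            urls.append(base)
    return urls


print(split_newsletter_urls(
    "https://www.lennysnewsletter.com, https://stratechery.com"
))
```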
Output
Each article is saved to the default dataset with the following fields:
| Field | Type | Description |
|---|---|---|
| id | number | Substack post ID |
| title | string | Article title |
| subtitle | string | Article subtitle |
| slug | string | URL slug |
| url | string | Full canonical URL |
| postDate | string | ISO 8601 publish date |
| audience | string | "everyone" or "only_paid" |
| type | string | "newsletter", "podcast", or "thread" |
| wordcount | number | Word count |
| reactions | object | Reaction counts (e.g., {"❤": 532}) |
| commentCount | number | Number of comments |
| coverImage | string | Cover image URL |
| author | string | Author name(s) |
| description | string | Article description/excerpt |
| body | string | Full HTML body (only when includeBody: true) |
Example Output
{
  "id": 123456,
  "title": "How to build a great product",
  "subtitle": "Lessons from top PMs",
  "slug": "how-to-build-a-great-product",
  "url": "https://www.lennysnewsletter.com/p/how-to-build-a-great-product",
  "postDate": "2026-03-03T13:45:17.054Z",
  "audience": "everyone",
  "type": "newsletter",
  "wordcount": 2642,
  "reactions": { "❤": 532 },
  "commentCount": 9,
  "coverImage": "https://substackcdn.com/image/...",
  "author": "Lenny Rachitsky",
  "description": "A deep dive into product excellence..."
}
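Because `reactions` is an object keyed by reaction type rather than a single number, ranking articles by engagement needs a small post-processing step. The Python sketch below assumes the dataset items have already been loaded as a list of dicts shaped like the example output. The helper names and the second sample record are hypothetical.

```python
def total_reactions(article):
    """Sum all reaction counts on one article record."""
    return sum((article.get("reactions") or {}).values())


def top_by_reactions(articles, n=5):
    """Return the n articles with the most reactions, highest first."""
    return sorted(articles, key=total_reactions, reverse=True)[:n]


articles = [
    {"title": "How to build a great product", "reactions": {"❤": 532}, "commentCount": 9},
    {"title": "Hypothetical second post", "reactions": {}, "commentCount": 0},
]
for a in top_by_reactions(articles):
    print(total_reactions(a), a["title"])
```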
Notes
- Paid articles: Articles marked as `"only_paid"` may only return a preview of the body content.
- Rate limiting: The Actor adds a 500ms delay between API requests to avoid overloading Substack servers.
- Custom domains: Both `newsletter.substack.com` and custom domains like `platformer.news` are supported.
- No authentication required: Uses Substack's public API endpoints.
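The rate-limiting note can be pictured as a fixed pause between consecutive requests. The Python sketch below shows that pattern in isolation. It is not the Actor's implementation: `fetch_all` is a hypothetical helper, and the `fetch` and `sleep` parameters are injected so the loop can be exercised without any network access.

```python
import time


def fetch_all(urls, fetch, delay_s=0.5, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing delay_s seconds between calls."""
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay_s)  # pause before every request except the first
        results.append(fetch(url))
    return results


# Usage with a stand-in fetcher; a real one would issue the HTTP request.
pages = fetch_all(["page-1", "page-2", "page-3"], fetch=str.upper, delay_s=0.01)
print(pages)
```

With the default `delay_s=0.5`, N sequential requests spend at least (N - 1) × 0.5 seconds waiting, in addition to the time the requests themselves take.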
Cost
This Actor uses only HTTP API calls (no browser), so it's very lightweight:
- ~256MB memory is sufficient
- No proxy required
- A typical run of 50 articles completes in under a minute