Substack Newsletter Scraper - Articles, Metadata & Full Content
Pricing
from $1.00 / 1,000 results
Extract articles, metadata, and content from any Substack newsletter via public API. No proxy needed. Supports multiple newsletters, full article body extraction, audience filtering (free/paid), date range, keyword search, and pagination. Works with both substack.com subdomains and custom domains.
Developer: Moris Chao
Actor stats: 0 bookmarks, 7 total users, 3 monthly active users. Last modified 21 days ago.
Substack Newsletter Scraper
Scrape articles from any Substack newsletter using the public Substack API. Supports both subdomain.substack.com URLs and custom domains (e.g., www.lennysnewsletter.com).
Features
- Scrape article metadata (title, author, date, reactions, comments, etc.)
- Optionally fetch full article HTML body
- Filter by audience (free/paid), content type, date range, or keyword
- Scrape multiple newsletters in a single run (comma-separated URLs)
- Concurrent batch fetching for full article bodies
- Polite rate limiting (500ms delay between requests)
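The concurrent batch fetching mentioned above can be pictured roughly as follows. This is a minimal sketch, not the Actor's actual code: the batch size of 5 and the `fetch_body` callable are hypothetical stand-ins for whatever the Actor uses internally.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_in_batches(slugs, fetch_body, batch_size=5):
    """Fetch article bodies a batch at a time, running each batch concurrently.

    `fetch_body` is a hypothetical callable (slug -> HTML string); the batch
    size is an assumption, not a documented setting of the Actor.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        for start in range(0, len(slugs), batch_size):
            batch = slugs[start:start + batch_size]
            # pool.map preserves input order, so zip pairs each slug with its body.
            for slug, body in zip(batch, pool.map(fetch_body, batch)):
                results[slug] = body
    return results
```

Processing one batch at a time caps the number of in-flight requests, which fits with the polite rate limiting listed above.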
Input
| Field | Type | Default | Description |
|---|---|---|---|
| newsletterUrl | string | required | Newsletter URL(s). Comma-separated for multiple. |
| maxItems | integer | 50 | Max articles per newsletter. 0 = unlimited. |
| sortBy | string | "new" | "new" (newest first) or "top" (most popular). |
| includeBody | boolean | false | Fetch full HTML body for each article. |
| audienceFilter | string | "all" | "all", "free", or "paid". |
| typeFilter | string | "all" | "all", "newsletter", "podcast", or "thread". |
| dateFrom | string | none | Only articles on/after this date (YYYY-MM-DD). |
| dateTo | string | none | Only articles on/before this date (YYYY-MM-DD). |
| searchKeyword | string | none | Filter by keyword in title or description. |
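To make the filter semantics concrete, here is a rough sketch of how the filter fields could be applied to a record in the output format. It is an illustration only. The mapping of "free" and "paid" to the `audience` values "everyone" and "only_paid", the case-insensitive keyword match, and the `matches` helper itself are assumptions, not confirmed behavior of the Actor.

```python
from datetime import date

# Assumed mapping from the audienceFilter input to the `audience` output value.
AUDIENCE_MAP = {"free": "everyone", "paid": "only_paid"}


def matches(article, audience_filter="all", type_filter="all",
            date_from=None, date_to=None, search_keyword=None):
    """Return True if `article` passes every filter that is set."""
    if audience_filter != "all" and article.get("audience") != AUDIENCE_MAP[audience_filter]:
        return False
    if type_filter != "all" and article.get("type") != type_filter:
        return False
    # postDate is ISO 8601; its first 10 characters are the calendar date.
    published = date.fromisoformat(article["postDate"][:10])
    if date_from and published < date.fromisoformat(date_from):
        return False
    if date_to and published > date.fromisoformat(date_to):
        return False
    if search_keyword:
        haystack = f"{article.get('title', '')} {article.get('description', '')}".lower()
        if search_keyword.lower() not in haystack:
            return False
    return True
```

Both date bounds are inclusive here, matching the "on/after" and "on/before" wording in the table.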
Example Input
{
  "newsletterUrl": "https://www.lennysnewsletter.com",
  "maxItems": 10,
  "sortBy": "new",
  "includeBody": false
}
Multiple Newsletters
{
  "newsletterUrl": "https://www.lennysnewsletter.com, https://stratechery.com",
  "maxItems": 20
}
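A comma-separated `newsletterUrl` value like the one above has to be split into individual newsletters before scraping. The sketch below shows one plausible way to do that. The `split_newsletter_urls` helper is hypothetical, and defaulting to `https://` when no scheme is given is an assumption rather than documented behavior.

```python
from urllib.parse import urlsplit


def split_newsletter_urls(value):
    """Split a comma-separated newsletterUrl string into unique base URLs."""
    urls = []
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate trailing commas and empty entries
        if "://" not in part:
            part = "https://" + part  # assumed default scheme
        pieces = urlsplit(part)
        base = f"{pieces.scheme}://{pieces.netloc}"
        if base not in urls:
            urls.append(base)
    return urls
```

Reducing each entry to scheme plus host treats `subdomain.substack.com` URLs and custom domains the same way, since both identify a newsletter by its host.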
Output
Each article is saved to the default dataset with the following fields:
| Field | Type | Description |
|---|---|---|
| id | number | Substack post ID |
| title | string | Article title |
| subtitle | string | Article subtitle |
| slug | string | URL slug |
| url | string | Full canonical URL |
| postDate | string | ISO 8601 publish date |
| audience | string | "everyone" or "only_paid" |
| type | string | "newsletter", "podcast", or "thread" |
| wordcount | number | Word count |
| reactions | object | Reaction counts (e.g., {"❤": 532}) |
| commentCount | number | Number of comments |
| coverImage | string | Cover image URL |
| author | string | Author name(s) |
| description | string | Article description/excerpt |
| body | string | Full HTML body (only when includeBody: true) |
Example Output
{
  "id": 123456,
  "title": "How to build a great product",
  "subtitle": "Lessons from top PMs",
  "slug": "how-to-build-a-great-product",
  "url": "https://www.lennysnewsletter.com/p/how-to-build-a-great-product",
  "postDate": "2026-03-03T13:45:17.054Z",
  "audience": "everyone",
  "type": "newsletter",
  "wordcount": 2642,
  "reactions": { "❤": 532 },
  "commentCount": 9,
  "coverImage": "https://substackcdn.com/image/...",
  "author": "Lenny Rachitsky",
  "description": "A deep dive into product excellence..."
}
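Once records in this shape have been exported from the dataset, they can be post-processed with ordinary Python. The following sketch ranks articles by a simple engagement score. The score (total reactions plus comment count) and both helper names are illustrative choices, not something the Actor computes.

```python
def total_reactions(article):
    """Sum all reaction counts; a missing or null `reactions` counts as zero."""
    return sum((article.get("reactions") or {}).values())


def top_by_engagement(articles, n=5):
    """Return the n articles with the highest reactions + comments score."""
    def score(article):
        return total_reactions(article) + (article.get("commentCount") or 0)
    return sorted(articles, key=score, reverse=True)[:n]
```

For the example record above, `total_reactions` gives 532 and the engagement score is 541.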
Notes
- Paid articles: Articles marked as "only_paid" may only return a preview of the body content.
- Rate limiting: The Actor adds a 500ms delay between API requests to avoid overloading Substack servers.
- Custom domains: Both newsletter.substack.com and custom domains like platformer.news are supported.
- No authentication required: Uses Substack's public API endpoints.
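For readers building their own client against the same endpoints, the 500ms delay described in the rate limiting note could be reproduced with a small throttle like the one below. This is a sketch under assumptions: the `Throttle` class is hypothetical, and the Actor's real implementation is not documented here.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive calls to wait()."""

    def __init__(self, min_interval=0.5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._last = None      # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` before each request sleeps only for whatever part of the 500ms has not already elapsed, so slow responses do not add extra delay.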
Cost
This Actor uses only HTTP API calls (no browser), so it's very lightweight:
- ~256MB memory is sufficient
- No proxy required
- A typical run of 50 articles completes in under a minute
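As a back-of-the-envelope aid, the listed price of "from $1.00 / 1,000 results" can be turned into an upper bound on the per-run result charge. This sketch assumes the starting rate applies and that every newsletter returns the full `maxItems`. The actual rate may differ by plan, and platform compute costs are not included. The `estimate_result_cost` helper is hypothetical.

```python
def estimate_result_cost(num_newsletters, max_items, price_per_1000=1.00):
    """Upper-bound result charge in USD, or None when maxItems is 0 (unlimited)."""
    if max_items == 0:
        return None  # no upper bound can be computed for unlimited runs
    max_results = num_newsletters * max_items
    return max_results * price_per_1000 / 1000
```

For example, two newsletters with `maxItems` set to 20 yield at most 40 results, which is about $0.04 at the assumed rate.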