Substack Newsletter Scraper
Pricing
from $0.01 / 1,000 results
Rating
0.0 (0)
Developer
Aryan
Actor stats
- 0 bookmarked
- 2 total users
- 1 monthly active user
- Last modified 4 days ago
Substack-Newsletter-Scraper
Extract complete newsletter archives from any Substack publication with advanced filtering, multiple export formats, and engagement analytics.
Features
- Scrape entire newsletter archives from Substack
- Extract full metadata: titles, content, author details, engagement metrics, cover images, podcasts
- Advanced filtering by keywords, date range, author names, and paywall status
- Multiple output formats: HTML, Markdown, or plain text
- Optional comment extraction with configurable limits
- Batch processing for multiple newsletters
- Fast and efficient HTTP-based scraping
Use Cases
- Archive newsletters for offline reading
- Build datasets for research and analysis
- Monitor newsletter trends and performance
- Export content to note-taking apps (Notion, Obsidian)
- Analyze author engagement and content metrics
Configuration
| Parameter | Type | Description |
|---|---|---|
| startUrls | array | List of Substack newsletter URLs |
| maxItems | number | Maximum number of posts to scrape (default: 100) |
| filterKeywords | array | Only include posts with these keywords in the title |
| filterAfterDate | string | Only include posts after this date (YYYY-MM-DD) |
| filterBeforeDate | string | Only include posts before this date (YYYY-MM-DD) |
| filterAuthors | array | Only include posts by these authors |
| filterPaidOnly | boolean | Only scrape paid content |
| filterFreeOnly | boolean | Only scrape free content |
| includeComments | boolean | Extract top comments for each post |
| maxCommentsPerPost | number | Maximum number of comments per post (default: 50) |
| outputFormat | string | Content format: html, markdown, or text |
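As a rough illustration of how these parameters fit together, the sketch below assembles an input object in Python and checks it before a run. The parameter names come from the table above; the helper `build_run_input`, the example URL, the choice of "markdown" as the helper's default format, and the use of plain URL strings in `startUrls` are assumptions made for this example, not documented behavior of the actor.

```python
import json
from datetime import datetime

ALLOWED_FORMATS = {"html", "markdown", "text"}


def _parse_day(value):
    """Parse a YYYY-MM-DD string; raises ValueError if it does not parse."""
    return datetime.strptime(value, "%Y-%m-%d").date()


def build_run_input(start_urls, max_items=100, keywords=None, authors=None,
                    after=None, before=None, output_format="markdown",
                    include_comments=False, max_comments=50):
    """Hypothetical helper: assemble and sanity-check an input dict."""
    if not start_urls:
        raise ValueError("startUrls needs at least one newsletter URL")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"outputFormat must be one of {sorted(ALLOWED_FORMATS)}")
    after_day = _parse_day(after) if after else None
    before_day = _parse_day(before) if before else None
    if after_day and before_day and after_day > before_day:
        raise ValueError("filterAfterDate is later than filterBeforeDate")

    run_input = {
        # Assumption: plain URL strings; the actor may expect another shape.
        "startUrls": list(start_urls),
        "maxItems": max_items,
        "outputFormat": output_format,
        "includeComments": include_comments,
    }
    # Optional fields are only added when they were actually provided.
    if include_comments:
        run_input["maxCommentsPerPost"] = max_comments
    if keywords:
        run_input["filterKeywords"] = list(keywords)
    if authors:
        run_input["filterAuthors"] = list(authors)
    if after_day:
        run_input["filterAfterDate"] = after_day.isoformat()
    if before_day:
        run_input["filterBeforeDate"] = before_day.isoformat()
    return run_input


# Placeholder URL and filter values, for illustration only.
example = build_run_input(
    ["https://example.substack.com"],
    keywords=["python"],
    after="2024-01-01",
    before="2024-06-30",
)
print(json.dumps(example, indent=2))
```

The printed JSON could then be pasted into the actor's input editor or supplied as the run input through an Apify API client.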
Output
The actor exports posts to Apify's default dataset with the following fields:
- Post URL, title, subtitle, and description
- Author name and detailed profile information
- Published and updated timestamps
- Full content in selected format
- Engagement metrics (likes, comments, word count)
- Cover images and podcast information
- Paywall status and content section
- Scraping timestamp
Data can be downloaded as JSON, CSV, or Excel.
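Once the dataset is downloaded as JSON, a few lines of Python are enough for a first look at it. The sketch below is only an illustration: the field names `title`, `likes`, and `isPaid` are hypothetical stand-ins for the fields listed above (the real key names should be checked against an actual export), and the inline sample records are made up.

```python
import json
from collections import Counter


def summarize_posts(items):
    """Return simple totals over a list of post records."""
    # Hypothetical field names: "isPaid", "likes", "title".
    access = Counter("paid" if item.get("isPaid") else "free" for item in items)
    total_likes = sum(item.get("likes") or 0 for item in items)
    most_liked = max(items, key=lambda item: item.get("likes") or 0, default=None)
    return {
        "posts": len(items),
        "paid": access["paid"],
        "free": access["free"],
        "total_likes": total_likes,
        "most_liked_title": most_liked.get("title") if most_liked else None,
    }


# Made-up records standing in for a downloaded JSON export.
sample_export = json.loads("""
[
  {"title": "Free post", "likes": 12, "isPaid": false},
  {"title": "Paid post", "likes": 30, "isPaid": true},
  {"title": "No metrics", "isPaid": false}
]
""")

print(summarize_posts(sample_export))
```

For a real export, `sample_export` would be replaced with the result of `json.load` on the downloaded file.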
FAQ
Q: Can I scrape paid content?
A: Public previews are extracted. Full paid content requires appropriate access.
Q: How do I use Markdown format?
A: Set outputFormat to "markdown" for Notion/Obsidian compatibility.
Q: Can I schedule automatic runs?
A: Yes, use Apify's built-in scheduler for recurring scrapes.
Q: Does it work with custom domains?
A: Yes, it supports both substack.com and custom Substack domains.
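For the Markdown workflow mentioned in the FAQ, one possible way to turn exported records into note files for a tool such as Obsidian is sketched below. It assumes the run used the "markdown" output format and that each record exposes `title`, `author`, `url`, `publishedAt`, and `content` keys; those key names and the helpers `slugify`, `render_note`, and `write_notes` are hypothetical and should be adapted to the actual export.

```python
import json
import re
from pathlib import Path


def slugify(title):
    """Turn a post title into a filesystem-friendly file stem."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


def render_note(post):
    """Build a Markdown note with a YAML front matter header."""
    lines = ["---"]
    # Hypothetical field names; json.dumps yields a safely quoted scalar.
    for key in ("title", "author", "url", "publishedAt"):
        lines.append(f"{key}: {json.dumps(post.get(key, ''))}")
    lines += ["---", "", post.get("content", "")]
    return "\n".join(lines) + "\n"


def write_notes(posts, out_dir):
    """Write one .md file per post and return the written paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for post in posts:
        target = out_path / f"{slugify(post.get('title', ''))}.md"
        target.write_text(render_note(post), encoding="utf-8")
        written.append(target)
    return written
```

Posts whose titles produce the same slug would overwrite each other in this sketch; appending the publication date or a counter to the file name is one way around that.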