Substack Scraper - Newsletter Posts and Authors
Pricing
Pay per usage
Free Substack scraper. Extract newsletter posts, authors, and publication data at scale. No API key needed. Export to JSON, CSV, or Excel. Useful for newsletter intelligence and lead generation.
Developer

CryptoSignals Agent
Substack Scraper
Scrape Substack newsletter posts, authors, and publication data at scale. No API key or login needed.
What It Does
- Scrape newsletter posts from any public Substack publication
- Extract post metadata: title, author, date, engagement metrics, paywall status, word count
- Get publication info: name, description, author, logo, custom domain
- Search publications by keyword to discover newsletters
- Pagination built-in: scrape hundreds or thousands of posts automatically
- Export to JSON, CSV, or Excel
Use Cases
- Newsletter Intelligence: Track what topics competitors cover, how often they publish, and which posts get the most engagement
- B2B Lead Generation: Find newsletter authors and publications in your industry for outreach and partnerships
- Content Analysis: Analyze publishing frequency, word counts, and engagement patterns across newsletters
- Competitive Research: Monitor competitor newsletters for content strategy insights
- Journalist Research: Find expert voices and trending topics across Substack's ecosystem
- Market Research: Discover which newsletter niches have the most engagement and subscriber interest
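
As a rough illustration of the Content Analysis use case, the sketch below summarizes publishing frequency and average word count from scraped post records. It relies on the `type`, `publishDate`, and `wordCount` fields described in the Output section; the helper name `summarize_posts` and the sample records are hypothetical and only serve the example.

```python
from collections import Counter
from datetime import datetime
from statistics import mean


def summarize_posts(items):
    """Hypothetical helper: summarize cadence and length of post records."""
    posts = [item for item in items if item.get("type") == "post"]
    per_month = Counter()
    for post in posts:
        # publishDate is documented as ISO 8601. A trailing "Z" is normalized
        # because datetime.fromisoformat rejects it before Python 3.11.
        published = datetime.fromisoformat(post["publishDate"].replace("Z", "+00:00"))
        per_month[published.strftime("%Y-%m")] += 1
    word_counts = [p["wordCount"] for p in posts if p.get("wordCount")]
    return {
        "postCount": len(posts),
        "postsPerMonth": dict(sorted(per_month.items())),
        "averageWordCount": round(mean(word_counts)) if word_counts else None,
    }


# Made-up sample records, shaped like the documented post fields.
sample = [
    {"type": "post", "publishDate": "2024-01-05T12:00:00.000Z", "wordCount": 1200},
    {"type": "post", "publishDate": "2024-01-19T12:00:00.000Z", "wordCount": 800},
    {"type": "post", "publishDate": "2024-02-02T12:00:00.000Z", "wordCount": 1000},
]
summary = summarize_posts(sample)
print(summary)
```

Grouping by month is an arbitrary choice here; a weekly bucket would work the same way with a different `strftime` format.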
Input Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| publications | Array of strings | [] | Substack publication subdomains (e.g., ["platformer", "thebrowser"]) |
| searchQuery | String | "" | Search for publications by keyword |
| scrapeType | String | "posts" | What to scrape: "posts", "publications", or "both" |
| maxItems | Integer | 50 | Max items per publication (0 = unlimited) |
| sortBy | String | "new" | Sort posts: "new" or "top" |
| includeBodyText | Boolean | false | Include body text excerpt in output |
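
The parameters above can be assembled and sanity-checked before starting a run. The following is a minimal sketch, not part of the actor: `build_run_input` is a hypothetical helper, and the rule that a run needs at least one publication or a search query is an assumption rather than documented behavior. The parameter names, defaults, and allowed values come from the table.

```python
ALLOWED_SCRAPE_TYPES = {"posts", "publications", "both"}
ALLOWED_SORT_ORDERS = {"new", "top"}


def build_run_input(publications=None, search_query="", scrape_type="posts",
                    max_items=50, sort_by="new", include_body_text=False):
    """Hypothetical helper: build an input dict with the documented defaults."""
    publications = list(publications or [])
    # Assumption: a run needs at least one publication or a search query.
    if not publications and not search_query:
        raise ValueError("provide at least one publication or a searchQuery")
    if scrape_type not in ALLOWED_SCRAPE_TYPES:
        raise ValueError(f"scrapeType must be one of {sorted(ALLOWED_SCRAPE_TYPES)}")
    if sort_by not in ALLOWED_SORT_ORDERS:
        raise ValueError(f"sortBy must be one of {sorted(ALLOWED_SORT_ORDERS)}")
    if max_items < 0:
        raise ValueError("maxItems must be 0 (unlimited) or a positive integer")
    return {
        "publications": publications,
        "searchQuery": search_query,
        "scrapeType": scrape_type,
        "maxItems": max_items,
        "sortBy": sort_by,
        "includeBodyText": include_body_text,
    }


run_input = build_run_input(["platformer", "thebrowser"], max_items=25)
print(run_input)
```

The resulting dictionary can be passed as the run input in the API Integration examples.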
Example Inputs
Scrape Posts from Multiple Publications

    {
        "publications": ["platformer", "thebrowser", "slow-boring"],
        "scrapeType": "posts",
        "maxItems": 25,
        "sortBy": "new"
    }

Get Publication Info

    {
        "publications": ["noahpinion", "astralcodexten"],
        "scrapeType": "publications"
    }

Scrape Posts with Body Text

    {
        "publications": ["stratechery"],
        "scrapeType": "both",
        "maxItems": 100,
        "includeBodyText": true
    }

Search for AI Newsletters

    {
        "searchQuery": "AI",
        "maxItems": 10
    }

Output
Post Fields
| Field | Type | Description |
|---|---|---|
| type | String | Always "post" |
| title | String | Post title |
| subtitle | String | Post subtitle |
| slug | String | URL slug |
| postUrl | String | Full URL to the post |
| authorName | String | Author's display name |
| publicationName | String | Publication name |
| publicationUrl | String | Publication URL |
| publishDate | String | ISO 8601 publish date |
| description | String | Post description/excerpt |
| likeCount | Integer | Total reactions/likes |
| restackCount | Integer | Number of restacks |
| commentCount | Integer | Total comments (including replies) |
| wordCount | Integer | Word count |
| isPaid | Boolean | Whether the post is behind a paywall |
| audience | String | "everyone", "only_paid", etc. |
| coverImage | String | Cover image URL |
| canonicalUrl | String | Canonical URL |
| postType | String | "newsletter", "podcast", etc. |
| tags | Array | Post tags |
| bodyTextExcerpt | String | Truncated body text (if includeBodyText is true) |
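
To show how these fields might be combined, here is a small sketch that ranks non-paywalled posts by engagement. The equal weighting of `likeCount`, `restackCount`, and `commentCount` is an assumption made for illustration, and the function names and sample records are hypothetical.

```python
ENGAGEMENT_FIELDS = ("likeCount", "restackCount", "commentCount")


def engagement_score(post):
    # Assumption: likes, restacks, and comments count equally.
    # Missing or null counts are treated as zero.
    return sum(post.get(field) or 0 for field in ENGAGEMENT_FIELDS)


def top_free_posts(items, limit=5):
    """Return up to `limit` non-paywalled posts, most engaged first."""
    free_posts = [
        item for item in items
        if item.get("type") == "post" and not item.get("isPaid")
    ]
    return sorted(free_posts, key=engagement_score, reverse=True)[:limit]


# Made-up sample records, shaped like the documented post fields.
sample = [
    {"type": "post", "title": "A", "likeCount": 10, "restackCount": 2,
     "commentCount": 3, "isPaid": False},
    {"type": "post", "title": "B", "likeCount": 50, "restackCount": 5,
     "commentCount": 20, "isPaid": True},
    {"type": "post", "title": "C", "likeCount": 30, "restackCount": None,
     "commentCount": 1, "isPaid": False},
]
top = top_free_posts(sample)
for post in top:
    print(post["title"], engagement_score(post))
```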
Publication Fields
| Field | Type | Description |
|---|---|---|
| type | String | Always "publication" |
| name | String | Publication name |
| subdomain | String | Substack subdomain |
| description | String | Publication description |
| authorName | String | Primary author name |
| authorBio | String | Author biography |
| logoUrl | String | Publication logo URL |
| heroImageUrl | String | Hero/banner image URL |
| publicationUrl | String | Substack URL |
| customDomainUrl | String | Custom domain URL (if set) |
| communityEnabled | Boolean | Whether community features are on |
| paymentsState | String | Payment status |
| twitterHandle | String | Author's Twitter/X handle |
| createdAt | String | Publication creation date |
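
When `scrapeType` is "both", a dataset presumably contains post and publication records side by side, distinguishable by the `type` field. The sketch below separates them under that assumption; `split_by_type` and the sample records are hypothetical.

```python
from collections import Counter, defaultdict


def split_by_type(items):
    """Group records by their `type` field ("post" or "publication")."""
    groups = defaultdict(list)
    for item in items:
        # Records without a type are kept under "unknown" instead of dropped.
        groups[item.get("type", "unknown")].append(item)
    return dict(groups)


# Made-up sample records, shaped like the documented output fields.
sample = [
    {"type": "publication", "name": "Example Letter", "subdomain": "example"},
    {"type": "post", "title": "First", "publicationName": "Example Letter"},
    {"type": "post", "title": "Second", "publicationName": "Example Letter"},
]
groups = split_by_type(sample)
posts_per_publication = Counter(p["publicationName"] for p in groups.get("post", []))
print(len(groups.get("publication", [])), dict(posts_per_publication))
```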
API Integration
JavaScript

    import { ApifyClient } from 'apify-client';

    const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

    // Start the actor and wait for the run to finish.
    const run = await client.actor('cryptosignals/substack-scraper').call({
        publications: ['platformer', 'thebrowser'],
        scrapeType: 'posts',
        maxItems: 50,
    });

    // Read the scraped items from the run's default dataset.
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items);

Python

    from apify_client import ApifyClient

    client = ApifyClient("YOUR_API_TOKEN")

    # Start the actor and wait for the run to finish.
    run = client.actor("cryptosignals/substack-scraper").call(run_input={
        "publications": ["platformer", "thebrowser"],
        "scrapeType": "posts",
        "maxItems": 50,
    })

    # Iterate over the items in the run's default dataset.
    items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
    for item in items:
        print(f"{item['title']} - {item['likeCount']} likes")

cURL

    curl "https://api.apify.com/v2/acts/cryptosignals~substack-scraper/runs?token=YOUR_API_TOKEN" \
      -X POST \
      -H "Content-Type: application/json" \
      -d '{"publications": ["platformer"], "scrapeType": "posts", "maxItems": 10}'

FAQ
Does this need a Substack account or API key?
No. This scraper uses Substack's public API endpoints. No authentication required.
Can it scrape paywalled content?
No. Paywalled posts are flagged with isPaid: true, but the full content is not accessible. You'll get the title, metadata, and public excerpt.
How do I find a publication's subdomain?
Look at the URL: https://platformer.substack.com -> subdomain is platformer. For custom domains, check the publication's Substack URL.
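
The same rule can be applied programmatically. This is a minimal sketch using only the standard library; `substack_subdomain` is a hypothetical helper, and it only recognizes `*.substack.com` hosts, so custom-domain URLs yield `None`.

```python
from urllib.parse import urlparse

SUBSTACK_SUFFIX = ".substack.com"


def substack_subdomain(url):
    """Return the subdomain of a *.substack.com URL, or None otherwise."""
    # Accept bare hosts such as "platformer.substack.com" as well.
    if "://" not in url:
        url = "https://" + url
    host = urlparse(url).hostname or ""
    if not host.endswith(SUBSTACK_SUFFIX):
        return None
    subdomain = host[: -len(SUBSTACK_SUFFIX)]
    # Reject empty, nested, or generic "www" prefixes.
    if not subdomain or "." in subdomain or subdomain == "www":
        return None
    return subdomain


print(substack_subdomain("https://platformer.substack.com"))
print(substack_subdomain("https://www.example.com/p/some-post"))
```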
What about publications with custom domains?
If a publication has moved to a custom domain, you still use the original Substack subdomain. The scraper will return both the Substack URL and custom domain URL.
Is there a rate limit?
The scraper includes built-in rate limiting and retries. It's designed to be respectful of Substack's servers.
How many posts can I scrape?
There's no hard limit. Use maxItems to control how many posts are scraped per publication, or set it to 0 for unlimited.
Cost
This actor runs on the Apify platform. Typical runs:
- 50 posts from 1 publication: ~5 seconds, minimal compute
- 500 posts from 5 publications: ~30 seconds
- Pricing details coming soon
Legal Notice
This scraper accesses publicly available data through Substack's public API. Users are responsible for complying with Substack's terms of service and applicable laws. Do not use scraped data for spam, harassment, or any illegal purpose.