Substack Scraper — Posts, Authors & Newsletters

Scrape Substack newsletters, posts, author profiles, and recommendation networks. Collect subscriber counts, publication metadata, and post content. Export to CSV, Excel, JSON. Works with Zapier, Make.com, Python, and JavaScript API. Free until April 3, then $4.99/mo.

Pricing: Pay per usage · Developer: CryptoSignals Agent · Last modified: 2 days ago
Substack Scraper
⚡ Heads Up: Free Trial Ending April 3
This actor has been free to use while we collected feedback. Starting April 3, 2026, it moves to a $4.99/month subscription.
If you've been using it, you'll need to add a payment method at apify.com/billing to keep access. Questions? Leave a comment on the actor page.
Extract newsletter posts, publication metadata, subscriber counts, and recommendation networks from any Substack newsletter. Scrape one publication or hundreds in bulk — with date filtering, keyword search, and topic discovery. No API key or login needed.
Why Substack Scraper?
- No authentication required — works with all public Substack newsletters
- Bulk scraping — scrape multiple publications in a single run
- Date range filtering — only fetch posts within a specific time window
- Keyword filtering — find posts matching specific topics or terms
- Publication metadata — subscriber counts, post counts, author info, social links
- Recommendation networks — discover which newsletters recommend one another
- Topic search — find newsletters by topic across all of Substack
- Respectful scraping — built-in 500ms delays between requests
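The built-in delay works like a simple per-request throttle. A minimal sketch of the idea (standalone illustration, not the actor's actual code — `fetch` here is a stand-in for any request function):

```python
import time

def polite_fetch_all(urls, fetch, delay_s=0.5):
    """Call fetch(url) for each URL, sleeping between consecutive requests."""
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            time.sleep(delay_s)  # 500 ms pause before every request after the first
        results.append(fetch(url))
    return results

# Example with a stand-in fetch function:
fetched = polite_fetch_all(["a", "b", "c"], fetch=lambda u: u.upper())
print(fetched)  # ['A', 'B', 'C']
```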
Use Cases
| # | Use Case | How |
|---|---|---|
| 1 | Newsletter intelligence | Scrape competitor newsletters to track content strategy and posting frequency |
| 2 | Content research | Search posts by keyword to find expert takes on any topic |
| 3 | Lead generation | Extract author and publication data for outreach campaigns |
| 4 | Trend monitoring | Track publication growth via subscriber counts over time |
| 5 | Audience research | Analyze recommendation networks to map newsletter ecosystems |
| 6 | Media monitoring | Monitor specific Substack publications for mentions or topics |
| 7 | Academic research | Collect newsletter data for media studies and content analysis |
| 8 | Investment research | Track finance and crypto newsletters for market insights |
| 9 | Competitive analysis | Compare engagement metrics (reactions, comments, restacks) across publications |
| 10 | Content curation | Aggregate top posts from multiple newsletters into one dataset |
Quick Start
1. Scrape Recent Posts from a Newsletter
```json
{
  "publications": ["platformer"],
  "scrapeType": "posts",
  "maxItems": 100,
  "sortBy": "new"
}
```
2. Scrape Posts from a Date Range
```json
{
  "publications": ["stratechery", "platformer"],
  "scrapeType": "posts",
  "maxItems": 200,
  "dateFrom": "2026-02-01",
  "dateTo": "2026-03-01"
}
```
3. Find AI-Related Posts
```json
{
  "publications": ["importai", "thealgorithmicbridge"],
  "scrapeType": "posts",
  "filterKeyword": "GPT",
  "maxItems": 50
}
```
4. Get Publication Metadata + Subscriber Counts
```json
{
  "publications": ["platformer", "stratechery", "seths"],
  "scrapeType": "publications"
}
```
5. Discover Newsletter Recommendations
```json
{
  "publications": ["platformer"],
  "scrapeType": "recommendations"
}
```
6. Search for Newsletters by Topic
```json
{
  "searchQuery": "machine learning",
  "scrapeType": "publications",
  "maxItems": 20
}
```
Input Parameters
| Field | Type | Default | Description |
|---|---|---|---|
| publications | string[] | ["platformer"] | Substack subdomains or URLs to scrape |
| searchQuery | string | — | Search for publications by keyword/topic |
| scrapeType | string | "posts" | posts, publications, both, recommendations |
| maxItems | integer | 50 | Max items per publication |
| sortBy | string | "new" | Sort posts: new (latest) or top (most engagement) |
| includeBodyText | boolean | false | Include truncated post body text in output |
| dateFrom | string | — | Posts on/after this date (YYYY-MM-DD) |
| dateTo | string | — | Posts on/before this date (YYYY-MM-DD) |
| filterKeyword | string | — | Only return posts matching this keyword in title/description |
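A small helper that assembles and sanity-checks an input object before a run can catch typos early. A sketch using the field names from the table above (the validation rules are our own, not enforced by the actor):

```python
from datetime import datetime

VALID_SCRAPE_TYPES = {"posts", "publications", "both", "recommendations"}

def build_input(publications, scrape_type="posts", max_items=50,
                date_from=None, date_to=None, filter_keyword=None):
    """Build a run input dict, validating the enum and date fields."""
    if scrape_type not in VALID_SCRAPE_TYPES:
        raise ValueError(f"scrapeType must be one of {sorted(VALID_SCRAPE_TYPES)}")
    run_input = {
        "publications": publications,
        "scrapeType": scrape_type,
        "maxItems": max_items,
    }
    for key, value in (("dateFrom", date_from), ("dateTo", date_to)):
        if value is not None:
            datetime.strptime(value, "%Y-%m-%d")  # raises ValueError if not YYYY-MM-DD
            run_input[key] = value
    if filter_keyword:
        run_input["filterKeyword"] = filter_keyword
    return run_input

inp = build_input(["platformer"], date_from="2026-03-01")
print(inp)
```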
Output Format
Post Record
```json
{
  "id": 123456,
  "title": "The Future of AI Newsletters",
  "subtitle": "Why every creator needs a Substack",
  "description": "A deep dive into the newsletter economy",
  "url": "https://platformer.news/p/the-future-of-ai-newsletters",
  "slug": "the-future-of-ai-newsletters",
  "publishedAt": "2026-01-15T12:00:00.000Z",
  "type": "newsletter",
  "audienceType": "everyone",
  "isFree": true,
  "reactionCount": 142,
  "commentCount": 37,
  "restacks": 22,
  "coverImage": "https://...",
  "authorName": "Casey Newton",
  "subdomain": "platformer"
}
```
Publication Record
```json
{
  "subdomain": "platformer",
  "name": "Platformer",
  "description": "The intersection of Silicon Valley and democracy",
  "url": "https://platformer.news",
  "logoUrl": "https://...",
  "authorName": "Casey Newton",
  "subscriberCount": 50000,
  "postCount": 800,
  "language": "en",
  "twitterScreenName": "platformer",
  "type": "publication"
}
```
Recommendation Record
```json
{
  "subdomain": "stratechery",
  "name": "Stratechery",
  "description": "Analysis of strategy and business side of technology",
  "url": "https://stratechery.com",
  "recommenderSubdomain": "platformer",
  "type": "recommendation"
}
```
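Once posts are in a dataset, per-publication engagement can be summarized locally without any further API calls. A sketch over records shaped like the post example above (the sample data here is made up):

```python
from collections import defaultdict

def engagement_by_publication(posts):
    """Sum reactions, comments, and restacks per subdomain."""
    totals = defaultdict(lambda: {"reactionCount": 0, "commentCount": 0, "restacks": 0})
    for post in posts:
        for key in ("reactionCount", "commentCount", "restacks"):
            totals[post["subdomain"]][key] += post.get(key, 0)
    return dict(totals)

sample = [
    {"subdomain": "platformer", "reactionCount": 142, "commentCount": 37, "restacks": 22},
    {"subdomain": "platformer", "reactionCount": 58, "commentCount": 9, "restacks": 4},
    {"subdomain": "stratechery", "reactionCount": 200, "commentCount": 12, "restacks": 30},
]
print(engagement_by_publication(sample))
```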
Code Examples
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("cryptosignals/substack-scraper").call(run_input={
    "publications": ["platformer", "stratechery"],
    "scrapeType": "posts",
    "maxItems": 100,
    "sortBy": "new",
    "dateFrom": "2026-03-01",
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"[{item['subdomain']}] {item['title']} — {item['reactionCount']} reactions")
```
JavaScript
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('cryptosignals/substack-scraper').call({
  publications: ['platformer', 'stratechery'],
  scrapeType: 'posts',
  maxItems: 100,
  sortBy: 'new',
  dateFrom: '2026-03-01',
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
  console.log(`[${item.subdomain}] ${item.title} — ${item.reactionCount} reactions`);
});
```
cURL
```bash
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/cryptosignals~substack-scraper/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"publications": ["platformer"], "scrapeType": "posts", "maxItems": 50}'

# Get results (replace DATASET_ID with the ID from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
```
Integrations
Google Sheets
Export Substack data directly to Google Sheets using Apify's Google Sheets integration. Track newsletter metrics over time with scheduled daily runs.
Zapier
Connect to 5,000+ apps via Zapier. Example workflow: scrape newsletters daily, filter new posts, send digest to Slack or email.
Make (Integromat)
Use the Apify module for Make to automate newsletter monitoring — trigger scrapes, process data, route to CRMs or databases.
Webhooks
Configure a webhook URL to receive results automatically when the scrape completes. Build real-time newsletter monitoring pipelines.
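Apify webhooks pair an event type with a condition tying them to an actor and a URL to call. A sketch that builds such a payload locally — the actor ID, target URL, and exact field names here are illustrative assumptions; check Apify's webhook documentation and your own account for the real values:

```python
import json

def build_webhook_payload(actor_id, request_url):
    """Assemble a webhook payload that fires when a run succeeds."""
    return {
        "eventTypes": ["ACTOR.RUN.SUCCEEDED"],
        "condition": {"actorId": actor_id},
        "requestUrl": request_url,
    }

payload = build_webhook_payload("YOUR_ACTOR_ID", "https://example.com/substack-hook")
print(json.dumps(payload, indent=2))
# POST this payload to the Apify webhooks API endpoint with your token.
```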
API
Full REST API access. Export results in JSON, CSV, XML, Excel, or HTML format. Perfect for building custom newsletter analytics dashboards.
Frequently Asked Questions
Q: Do I need a Substack account? A: No. This scraper extracts publicly available data only. No login or API key required.
Q: Can I scrape paid/premium post content? A: No. Only free post metadata (title, description, engagement metrics) is available. Premium post content behind the paywall is not scraped.
Q: How do I scrape a newsletter with a custom domain?
A: Use the Substack subdomain, not the custom domain. For example, use "platformer" (not "platformer.news").
Q: Can I scrape multiple newsletters at once?
A: Yes. Pass multiple subdomains in the publications array. The scraper processes them sequentially with rate limiting.
Q: How far back can I get posts?
A: All publicly listed posts are available. Use dateFrom and dateTo to narrow results to your target date range.
Q: What are "restacks"? A: Restacks are Substack's equivalent of retweets/reposts — when a writer shares another writer's post with their audience.
Q: How do recommendation networks work?
A: Set scrapeType: "recommendations" to see what other newsletters a publication recommends. This maps the Substack ecosystem and shows who promotes whom.
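Recommendation records from several runs can be folded into a simple adjacency map to explore the network, for example to spot mutual recommendations. A sketch using the record shape shown earlier (the sample data is made up):

```python
from collections import defaultdict

def build_recommendation_graph(records):
    """Map each recommender subdomain to the set of subdomains it recommends."""
    graph = defaultdict(set)
    for rec in records:
        graph[rec["recommenderSubdomain"]].add(rec["subdomain"])
    return dict(graph)

records = [
    {"recommenderSubdomain": "platformer", "subdomain": "stratechery"},
    {"recommenderSubdomain": "platformer", "subdomain": "importai"},
    {"recommenderSubdomain": "stratechery", "subdomain": "platformer"},
]
graph = build_recommendation_graph(records)
print(graph)

# Mutual pairs: A recommends B and B recommends A.
mutual = {tuple(sorted((a, b))) for a, targets in graph.items()
          for b in targets if a in graph.get(b, set())}
print(mutual)  # {('platformer', 'stratechery')}
```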
Q: Is the subscriber count accurate? A: Subscriber counts come from Substack's public API. Some publications may hide this data, in which case the field will be null.
Q: Can I search across all of Substack?
A: Yes. Use searchQuery with scrapeType: "publications" to find newsletters on any topic across the entire Substack platform.
Q: How often should I schedule runs?
A: Most newsletters publish daily or weekly. A daily scheduled run with dateFrom set to yesterday captures new posts without duplicates.
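The daily-schedule pattern from the answer above can be sketched by computing yesterday's date at run time and feeding it into the input (the surrounding run call would use the client shown in the Python example):

```python
from datetime import date, timedelta

def daily_run_input(publications):
    """Build an input that only fetches posts published since yesterday."""
    yesterday = (date.today() - timedelta(days=1)).isoformat()
    return {
        "publications": publications,
        "scrapeType": "posts",
        "maxItems": 100,
        "dateFrom": yesterday,
    }

run_input = daily_run_input(["platformer", "stratechery"])
print(run_input["dateFrom"])  # e.g. "2026-04-02"
```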
Legal & Compliance
This scraper accesses only publicly available Substack data through public web endpoints. It does not log in to any account, access paywalled content, or bypass any access controls. It respects Substack's servers with built-in rate limiting (500ms delays between requests). Users are responsible for complying with Substack's Terms of Service and all applicable laws when using scraped data.
Related Scrapers
- Bluesky Scraper — Extract posts, profiles, and threads from Bluesky
- Hacker News Scraper — Stories, comments, and user profiles from HN
Notes
- Respects Substack's servers with 500ms delays between requests
- Gracefully handles missing endpoints (404s do not fail the run)
- Works with custom domain newsletters (input the subdomain, not the custom domain)
- Free posts and metadata always available; premium content is not scraped
Support
Questions or issues? Open an issue on the actor page or check the Input tab for parameter details.