Substack Scraper — Posts, Authors & Newsletters

Scrape Substack newsletters, posts, author profiles, and recommendation networks. Collect subscriber counts, publication data, and full article content. Export to CSV, Excel, JSON. Works with Zapier, Make.com, Python, and JavaScript API. Free until April 3, then $4.99/mo.

Pricing: Pay per usage

Rating: 0.0 (0 reviews)

Developer: CryptoSignals Agent (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 13
  • Monthly active users: 7
  • Last modified: 2 days ago


Substack Scraper


⚡ Heads Up: Free Trial Ending April 3

This actor has been free to use while we collected feedback. Starting April 3, 2026, it moves to a $4.99/month subscription.

If you've been using it, you'll need to add a payment method at apify.com/billing to keep access. Questions? Leave a comment on the actor page.


Extract newsletter posts, publication metadata, subscriber counts, and recommendation networks from any Substack newsletter. Scrape one publication or hundreds in bulk — with date filtering, keyword search, and topic discovery. No API key or login needed.

Why Substack Scraper?

  • No authentication required — works with all public Substack newsletters
  • Bulk scraping — scrape multiple publications in a single run
  • Date range filtering — only fetch posts within a specific time window
  • Keyword filtering — find posts matching specific topics or terms
  • Publication metadata — subscriber counts, post counts, author info, social links
  • Recommendation networks — discover which newsletters recommend one another
  • Topic search — find newsletters by topic across all of Substack
  • Respectful scraping — built-in 500ms delays between requests

Use Cases

  1. Newsletter intelligence: Scrape competitor newsletters to track content strategy and posting frequency
  2. Content research: Search posts by keyword to find expert takes on any topic
  3. Lead generation: Extract author and publication data for outreach campaigns
  4. Trend monitoring: Track publication growth via subscriber counts over time
  5. Audience research: Analyze recommendation networks to map newsletter ecosystems
  6. Media monitoring: Monitor specific Substack publications for mentions or topics
  7. Academic research: Collect newsletter data for media studies and content analysis
  8. Investment research: Track finance and crypto newsletters for market insights
  9. Competitive analysis: Compare engagement metrics (reactions, comments, restacks) across publications
  10. Content curation: Aggregate top posts from multiple newsletters into one dataset
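For the competitive-analysis use case, the engagement metrics in each post record can be aggregated per publication after a run. A minimal Python sketch: the field names (`subdomain`, `reactionCount`, `commentCount`, `restacks`) follow the Post Record schema documented below, while the aggregation itself is illustrative and not part of the actor.

```python
from collections import defaultdict

def engagement_by_publication(posts):
    """Aggregate reactions, comments, and restacks per subdomain.

    `posts` is a list of post records as exported in the actor's dataset.
    Missing metrics are treated as zero.
    """
    totals = defaultdict(lambda: {"reactions": 0, "comments": 0, "restacks": 0, "posts": 0})
    for p in posts:
        t = totals[p["subdomain"]]
        t["reactions"] += p.get("reactionCount") or 0
        t["comments"] += p.get("commentCount") or 0
        t["restacks"] += p.get("restacks") or 0
        t["posts"] += 1
    return dict(totals)

# Sample records shaped like the actor's output
posts = [
    {"subdomain": "platformer", "reactionCount": 142, "commentCount": 37, "restacks": 22},
    {"subdomain": "platformer", "reactionCount": 90, "commentCount": 10, "restacks": 5},
    {"subdomain": "stratechery", "reactionCount": 200, "commentCount": 50, "restacks": 30},
]
print(engagement_by_publication(posts))
```

Feed the function the items from a finished run's dataset to get a per-publication comparison table.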

Quick Start

1. Scrape Recent Posts from a Newsletter

{
  "publications": ["platformer"],
  "scrapeType": "posts",
  "maxItems": 100,
  "sortBy": "new"
}

2. Scrape Posts from a Date Range

{
  "publications": ["stratechery", "platformer"],
  "scrapeType": "posts",
  "maxItems": 200,
  "dateFrom": "2026-02-01",
  "dateTo": "2026-03-01"
}

3. Filter Posts by Keyword

{
  "publications": ["importai", "thealgorithmicbridge"],
  "scrapeType": "posts",
  "filterKeyword": "GPT",
  "maxItems": 50
}

4. Get Publication Metadata + Subscriber Counts

{
  "publications": ["platformer", "stratechery", "seths"],
  "scrapeType": "publications"
}

5. Discover Newsletter Recommendations

{
  "publications": ["platformer"],
  "scrapeType": "recommendations"
}

6. Search for Newsletters by Topic

{
  "searchQuery": "machine learning",
  "scrapeType": "publications",
  "maxItems": 20
}

Input Parameters

  • publications (string[], default ["platformer"]): Substack subdomains or URLs to scrape
  • searchQuery (string): Search for publications by keyword/topic
  • scrapeType (string, default "posts"): One of posts, publications, both, recommendations
  • maxItems (integer, default 50): Max items per publication
  • sortBy (string, default "new"): Sort posts by new (latest) or top (most engagement)
  • includeBodyText (boolean, default false): Include truncated post body text in output
  • dateFrom (string): Posts on/after this date (YYYY-MM-DD)
  • dateTo (string): Posts on/before this date (YYYY-MM-DD)
  • filterKeyword (string): Only return posts matching this keyword in title/description
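The date and keyword filters can also be re-applied client-side when post-processing a dataset. A sketch assuming the semantics documented above — keyword matched case-insensitively against title and description, dates compared against the YYYY-MM-DD portion of publishedAt; the actor's internal matching rules may differ in detail.

```python
def matches_filters(post, date_from=None, date_to=None, keyword=None):
    """Re-apply the documented dateFrom/dateTo/filterKeyword semantics to one post record."""
    day = (post.get("publishedAt") or "")[:10]  # "2026-01-15T12:00:00.000Z" -> "2026-01-15"
    if date_from and day < date_from:
        return False
    if date_to and day > date_to:
        return False
    if keyword:
        haystack = f"{post.get('title', '')} {post.get('description', '')}".lower()
        if keyword.lower() not in haystack:
            return False
    return True

post = {
    "publishedAt": "2026-01-15T12:00:00.000Z",
    "title": "The Future of AI Newsletters",
    "description": "A deep dive into the newsletter economy",
}
print(matches_filters(post, date_from="2026-01-01", keyword="newsletter"))  # True
```

ISO dates compare correctly as strings, so no datetime parsing is needed for the range check.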

Output Format

Post Record

{
  "id": 123456,
  "title": "The Future of AI Newsletters",
  "subtitle": "Why every creator needs a Substack",
  "description": "A deep dive into the newsletter economy",
  "url": "https://platformer.news/p/the-future-of-ai-newsletters",
  "slug": "the-future-of-ai-newsletters",
  "publishedAt": "2026-01-15T12:00:00.000Z",
  "type": "newsletter",
  "audienceType": "everyone",
  "isFree": true,
  "reactionCount": 142,
  "commentCount": 37,
  "restacks": 22,
  "coverImage": "https://...",
  "authorName": "Casey Newton",
  "subdomain": "platformer"
}

Publication Record

{
  "subdomain": "platformer",
  "name": "Platformer",
  "description": "The intersection of Silicon Valley and democracy",
  "url": "https://platformer.news",
  "logoUrl": "https://...",
  "authorName": "Casey Newton",
  "subscriberCount": 50000,
  "postCount": 800,
  "language": "en",
  "twitterScreenName": "platformer",
  "type": "publication"
}

Recommendation Record

{
  "subdomain": "stratechery",
  "name": "Stratechery",
  "description": "Analysis of strategy and business side of technology",
  "url": "https://stratechery.com",
  "recommenderSubdomain": "platformer",
  "type": "recommendation"
}
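Recommendation records fold naturally into an adjacency map for charting who promotes whom. A small sketch using only the fields shown in the record above; the graph-building step is illustrative, not actor output.

```python
from collections import defaultdict

def recommendation_graph(records):
    """Map each recommender subdomain to the set of subdomains it recommends."""
    graph = defaultdict(set)
    for r in records:
        if r.get("type") == "recommendation":
            graph[r["recommenderSubdomain"]].add(r["subdomain"])
    return dict(graph)

# Sample records shaped like the actor's recommendation output
records = [
    {"type": "recommendation", "recommenderSubdomain": "platformer", "subdomain": "stratechery"},
    {"type": "recommendation", "recommenderSubdomain": "platformer", "subdomain": "importai"},
    {"type": "recommendation", "recommenderSubdomain": "stratechery", "subdomain": "platformer"},
]
print(recommendation_graph(records))
```

Running the actor over each discovered subdomain and merging the maps grows the graph one hop at a time.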

Code Examples

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("cryptosignals/substack-scraper").call(run_input={
    "publications": ["platformer", "stratechery"],
    "scrapeType": "posts",
    "maxItems": 100,
    "sortBy": "new",
    "dateFrom": "2026-03-01",
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"[{item['subdomain']}] {item['title']} - {item['reactionCount']} reactions")

JavaScript

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('cryptosignals/substack-scraper').call({
    publications: ['platformer', 'stratechery'],
    scrapeType: 'posts',
    maxItems: 100,
    sortBy: 'new',
    dateFrom: '2026-03-01',
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.log(`[${item.subdomain}] ${item.title} - ${item.reactionCount} reactions`);
});

cURL

# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/cryptosignals~substack-scraper/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"publications": ["platformer"], "scrapeType": "posts", "maxItems": 50}'

# Get results (replace DATASET_ID with the defaultDatasetId from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"

Integrations

Google Sheets

Export Substack data directly to Google Sheets using Apify's Google Sheets integration. Track newsletter metrics over time with scheduled daily runs.

Zapier

Connect to 5,000+ apps via Zapier. Example workflow: scrape newsletters daily, filter new posts, send digest to Slack or email.

Make (Integromat)

Use the Apify module for Make to automate newsletter monitoring — trigger scrapes, process data, route to CRMs or databases.

Webhooks

Configure a webhook URL to receive results automatically when the scrape completes. Build real-time newsletter monitoring pipelines.

API

Full REST API access. Export results in JSON, CSV, XML, Excel, or HTML format. Perfect for building custom newsletter analytics dashboards.
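The export formats above map onto the `format` query parameter of Apify's dataset items endpoint. A small helper that builds the download URL; the endpoint path and parameter names follow Apify's public API, but verify the exact format identifiers (e.g. Excel is requested as xlsx) against the current API reference.

```python
from urllib.parse import urlencode

def dataset_export_url(dataset_id, token, fmt="json"):
    """Build a download URL for a run's dataset.

    `fmt` is passed straight through as the API's `format` parameter
    (e.g. "json", "csv", "xlsx", "xml", "html").
    """
    query = urlencode({"token": token, "format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"

print(dataset_export_url("DATASET_ID", "YOUR_API_TOKEN", fmt="csv"))
```

Point a scheduled job or a spreadsheet importer at the generated URL to pull fresh exports.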


Frequently Asked Questions

Q: Do I need a Substack account? A: No. This scraper extracts publicly available data only. No login or API key required.

Q: Can I scrape paid/premium post content? A: No. Only free post metadata (title, description, engagement metrics) is available. Premium post content behind the paywall is not scraped.

Q: How do I scrape a newsletter with a custom domain? A: Use the Substack subdomain, not the custom domain. For example, use "platformer" (not "platformer.news").

Q: Can I scrape multiple newsletters at once? A: Yes. Pass multiple subdomains in the publications array. The scraper processes them sequentially with rate limiting.

Q: How far back can I get posts? A: All publicly listed posts are available. Use dateFrom and dateTo to narrow results to your target date range.

Q: What are "restacks"? A: Restacks are Substack's equivalent of retweets/reposts — when a writer shares another writer's post with their audience.

Q: How do recommendation networks work? A: Set scrapeType: "recommendations" to see what other newsletters a publication recommends. This maps the Substack ecosystem and shows who promotes whom.

Q: Is the subscriber count accurate? A: Subscriber counts come from Substack's public API. Some publications may hide this data, in which case the field will be null.

Q: Can I search across all of Substack? A: Yes. Use searchQuery with scrapeType: "publications" to find newsletters on any topic across the entire Substack platform.

Q: How often should I schedule runs? A: Most newsletters publish daily or weekly. A daily scheduled run with dateFrom set to yesterday captures new posts without duplicates.
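The "dateFrom set to yesterday" pattern can be computed when the scheduled run's input is built. A sketch in Python: the input keys match the actor's documented parameters, while the scheduling itself is left to Apify's scheduler or your own cron.

```python
from datetime import date, timedelta

def daily_run_input(publications, today=None):
    """Build actor input that captures posts published since yesterday."""
    today = today or date.today()
    return {
        "publications": publications,
        "scrapeType": "posts",
        "sortBy": "new",
        "dateFrom": (today - timedelta(days=1)).isoformat(),  # YYYY-MM-DD
    }

print(daily_run_input(["platformer"], today=date(2026, 3, 15)))
```

Passing `today` explicitly keeps the function deterministic and easy to test; omit it in production to use the current date.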


This scraper accesses only publicly available Substack data through public web endpoints. It does not log in to any account, access paywalled content, or bypass any access controls. It respects Substack's servers with built-in rate limiting (500ms delays between requests). Users are responsible for complying with Substack's Terms of Service and all applicable laws when using scraped data.



Notes

  • Respects Substack's servers with 500ms delays between requests
  • Gracefully handles missing endpoints (404s do not fail the run)
  • Works with custom domain newsletters (input the subdomain, not the custom domain)
  • Free posts and metadata always available; premium content is not scraped

Support

Questions or issues? Open an issue on the actor page or check the Input tab for parameter details.