Substack Scraper — Newsletters, Posts & Creator Leads

Pricing

from $4.00 / 1,000 publications scraped

Scrape Substack: search newsletters by keyword, browse category leaderboards, pull full publication profiles (subscribers, paid pricing, podcast), posts, authors and the recommendation network. Turn creators into leads with contact emails. Monitoring mode. No API key, no browser.

Developer

Scrape Sage

Maintained by Community

Substack Scraper — Newsletters, Posts & Creator Leads (Subscribers, Pricing, Emails)

Extract complete Substack data — search newsletters by keyword, browse category leaderboards, and pull the fields other scrapers miss: free-subscriber counts, paid-subscriber tiers, real paid pricing (monthly / yearly / founding), podcast details, the recommendation network, and full author profiles. Optionally turn every creator into a ready-to-contact lead by crawling their own website for contact emails, phone, and socials.

No login, no cookies, no browser — fast first-party JSON extraction with 99%+ reliability.

Why this Substack scraper?

Most Substack scrapers return a thin slice — a title, a date, maybe a subscriber number. This actor reads Substack's own public API and ships the richest dataset in the category, across newsletters, posts and authors in one run:

Data | Typical scrapers | This actor
Search by keyword + category leaderboards | partial | ✅ both
Free subscriber count | partial | ✅
Paid-subscriber tier (e.g. "Thousands of paid subscribers") | ❌ | ✅
Real paid pricing: monthly / yearly / founding + currency | ❌ | ✅
Accepts sponsorships (ad-sales signal) | ❌ | ✅
Podcast title / description / flags | ❌ | ✅
Recommendation network (who recommends whom) | ❌ | ✅ opt-in
Posts: reactions, restacks, comments, word count | partial | ✅ opt-in
Full post content (HTML + plain text) | ❌ | ✅ opt-in
Author profiles: followers, bio, external links, all publications | ❌ | ✅ opt-in
Creator contact emails (from their website) | ❌ | ✅ opt-in
Lead score (0–100) per newsletter | ❌ | ✅
No start fee | ❌ many charge per run | ✅ pay per result only

Use cases

  • Creator & newsletter lead generation — Substack creators are active buyers and sellers: they want tools, sponsors, cross-promotion, and ghostwriters. Score them by audience (freeSubscriberCount, paidSubscriberTier) and reach them directly (supportEmail, contactEmails).
  • Sponsorship & ad-sales prospecting — find paid newsletters with acceptsSponsorships set to true, ranked by subscriber tier and niche, with contact data attached.
  • Market & competitor research — track category leaderboards, paid pricing, posting cadence, and engagement (reactions, restacks, comments) across any topic.
  • Content & trend analysis — pull posts with full content for summarization, RAG, sentiment, and topic modeling.
  • Influencer / partnership discovery — map the recommendation network to find who the top newsletters endorse.

How to use

  1. Sign up for Apify — the free plan is enough to try this actor.
  2. Open the Substack Scraper, enter search queries and/or categories (or paste Substack URLs), and click Start.
  3. Watch results stream into the dataset table.
  4. Export as JSON, CSV, Excel, XML, or RSS — or pull results programmatically via the Apify API.
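For step 4, here is a minimal sketch of building a download link for a finished run's dataset. It assumes the Apify API v2 dataset items endpoint (/v2/datasets/{datasetId}/items) and its format and token query parameters. datasetItemsUrl is a hypothetical helper name, and the dataset ID is a placeholder.

```javascript
// Hypothetical helper: builds an export URL for a dataset via the Apify API v2.
// The formats mirror the export options listed in step 4 (xlsx = Excel).
const EXPORT_FORMATS = new Set(['json', 'csv', 'xlsx', 'xml', 'rss']);

function datasetItemsUrl(datasetId, format = 'json', token) {
  if (!EXPORT_FORMATS.has(format)) {
    throw new Error(`Unsupported export format: ${format}`);
  }
  const url = new URL(
    `https://api.apify.com/v2/datasets/${encodeURIComponent(datasetId)}/items`,
  );
  url.searchParams.set('format', format);
  // Assumption: whether a token is required depends on the dataset's access settings.
  if (token) url.searchParams.set('token', token);
  return url.toString();
}

// Placeholder ID; a real one comes from the run's defaultDatasetId.
console.log(datasetItemsUrl('DATASET_ID', 'csv'));
```

The resulting link can be opened in a browser or passed to any HTTP client.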

Input

{
  "searchQueries": ["artificial intelligence"],
  "categories": ["Technology", "Business"],
  "maxPublications": 200,
  "includePosts": true,
  "maxPostsPerPublication": 20,
  "includeRecommendations": true,
  "includeAuthorProfiles": true,
  "enrichContactEmails": true,
  "onlyPaidPublications": false,
  "minFreeSubscribers": 1000
}
  • searchQueries — keywords to search publications (each returns full newsletter profiles).
  • categories — category leaderboards by name (Technology, Business, Finance, Culture, U.S. Politics, Food & Drink, Sports, …) or numeric id.
  • startUrls — direct publication URLs (https://newsletter.substack.com or a custom domain), post URLs (.../p/the-slug), or author profiles (https://substack.com/@handle).
  • maxPublications (default 100) — cap on unique publications from search + categories.
  • includePosts / maxPostsPerPublication / includePostContent — add recent posts, and optionally their full HTML + plain text.
  • includeRecommendations (default false) — add each newsletter's recommendation network as a recommends array.
  • includeAuthorProfiles (default false) — emit one author record per unique creator (followers, bio, links, all publications).
  • enrichContactEmails (default false) — crawl the publication's own website (home + about/contact, max 3 pages) for emails, phone, and extra socials. Substack does not publish creators' personal emails (only a supportEmail for some newsletters), so this is the only way to get contact emails.
  • onlyPaidPublications / minFreeSubscribers — filters.
  • monitorMode (default false) — emit only publications/posts not seen in previous runs (see below).
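The parameters above can be assembled in code before starting a run. The sketch below is illustrative only: buildInput is a hypothetical helper, it applies just the defaults stated in the list, and the rule that at least one source (searchQueries, categories, or startUrls) must be present is an assumption rather than documented actor behavior.

```javascript
// Defaults copied from the parameter list above. Fields without a documented
// default are left out so the actor applies its own.
const DOCUMENTED_DEFAULTS = {
  maxPublications: 100,
  includeRecommendations: false,
  includeAuthorProfiles: false,
  enrichContactEmails: false,
  monitorMode: false,
};

// Hypothetical helper: merges overrides onto the documented defaults and
// rejects inputs that name no source at all (an assumption, see above).
function buildInput(overrides = {}) {
  const input = { ...DOCUMENTED_DEFAULTS, ...overrides };
  const sources = ['searchQueries', 'categories', 'startUrls'];
  const hasSource = sources.some(
    (key) => Array.isArray(input[key]) && input[key].length > 0,
  );
  if (!hasSource) {
    throw new Error('Provide at least one of searchQueries, categories, or startUrls');
  }
  if (!Number.isInteger(input.maxPublications) || input.maxPublications < 1) {
    throw new Error('maxPublications must be a positive integer');
  }
  return input;
}

const input = buildInput({ categories: ['Technology'], enrichContactEmails: true });
console.log(JSON.stringify(input, null, 2));
```

Validating locally like this catches an empty or malformed input before a run is started.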

Output

One record per newsletter (type: "publication"), plus optional post records (type: "post") and author records (type: "author"):

{
  "type": "publication",
  "id": 89120,
  "name": "Astral Codex Ten",
  "subdomain": "astralcodexten",
  "url": "https://www.astralcodexten.com",
  "customDomain": "www.astralcodexten.com",
  "publicationType": "newsletter",
  "tagline": "P(A|B) = [P(A)*P(B|A)]/P(B), all the rest is commentary",
  "authorName": "Scott Alexander",
  "authorHandle": "astralcodexten",
  "authorBio": "Psychiatrist, blogger…",
  "freeSubscriberCount": 91000,
  "paidSubscriberTier": "Thousands of paid subscribers",
  "bestsellerTier": 1000,
  "isPaid": true,
  "currency": "USD",
  "monthlyPrice": 10,
  "yearlyPrice": 100,
  "foundingPrice": 300,
  "acceptsSponsorships": false,
  "hasPodcast": true,
  "supportEmail": "astralcodexten@substack.com",
  "website": "https://www.astralcodexten.com",
  "contactEmails": ["scott@slatestarcodex.com"],
  "contactSocials": { "twitter": "https://twitter.com/slatestarcodex" },
  "recommends": [
    { "name": "Slow Boring", "subdomain": "slowboring", "url": "https://www.slowboring.com" }
  ],
  "leadScore": 86,
  "category": "Technology",
  "searchQuery": "artificial intelligence",
  "scrapedAt": "2026-06-14T12:00:00.000Z"
}
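Because one dataset can mix publication, post, and author records, a first processing step is usually to split items on the type field. A minimal sketch, assuming only that every record carries type as described above; groupByType is a hypothetical helper and the sample items are trimmed placeholders.

```javascript
// Splits a mixed dataset into publications, posts, and authors using the
// documented "type" field. Anything else lands in "other".
function groupByType(items) {
  const groups = { publication: [], post: [], author: [], other: [] };
  for (const item of items) {
    const bucket = Object.hasOwn(groups, item.type) ? item.type : 'other';
    groups[bucket].push(item);
  }
  return groups;
}

// Trimmed placeholder records; real items carry the full field set.
const sampleItems = [
  { type: 'publication', name: 'Astral Codex Ten' },
  { type: 'post' },
  { type: 'post' },
  { type: 'author' },
];

const groups = groupByType(sampleItems);
console.log(groups.publication.length, groups.post.length, groups.author.length);
// prints "1 2 1"
```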

Monitoring mode

Turn on monitorMode to make the actor remember every publication and post it has already returned (in a named key-value store) and emit only new ones on the next run. Combine it with Apify Schedules to:

  • watch a category or keyword for newly launched newsletters,
  • alert on new posts from a set of newsletters you track,
  • keep a CRM topped up with fresh creator leads.

Monitoring mode is independent of the scheduler: Schedules decide when a run starts; monitoring decides what counts as new. Use a distinct monitorStoreName per tracked target to keep histories separate.
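The "what counts as new" half can be pictured as a filter against a persisted set of already-seen keys. The sketch below illustrates that idea only and is not the actor's code: the type:id key format is an assumption, emitOnlyNew is a hypothetical name, and an in-memory Set stands in for the named key-value store.

```javascript
// Returns only items whose key has not been seen before, and records the
// new keys so a later call skips them.
function emitOnlyNew(items, seenKeys) {
  const fresh = [];
  for (const item of items) {
    const key = `${item.type}:${item.id}`; // assumed key format
    if (seenKeys.has(key)) continue;
    seenKeys.add(key);
    fresh.push(item);
  }
  return fresh;
}

// Stand-in for the persisted history that survives between runs.
const seen = new Set();

const firstRun = emitOnlyNew([{ type: 'publication', id: 89120 }], seen);
const secondRun = emitOnlyNew(
  [
    { type: 'publication', id: 89120 }, // already returned by the first run
    { type: 'publication', id: 1 }, // new
  ],
  seen,
);

console.log(firstRun.length, secondRun.length); // prints "1 1"
```

Keeping a separate history per tracked target, as suggested above, corresponds to using a separate Set here.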

Automate & schedule

Run this actor on autopilot and pull results into your own stack:

import { ApifyClient } from 'apify-client';

// Replace the placeholder with your own Apify API token.
const client = new ApifyClient({ token: 'MY_APIFY_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('scrapesage/substack-scraper').call({
  searchQueries: ['fintech'],
  categories: ['Finance'],
  maxPublications: 200,
  enrichContactEmails: true,
  onlyPaidPublications: true,
});

// Read the results from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Got ${items.length} newsletters & creator leads`);

Integrate with any app

Connect the dataset to 5,000+ apps — no code required:

  • Make — multi-step automation scenarios.
  • Zapier — push new creator leads straight into your CRM.
  • Slack — get notified when a monitored search finds new newsletters.
  • Google Drive / Sheets — auto-export every run to a spreadsheet.
  • Airbyte — pipe results into your data warehouse.
  • GitHub — trigger runs from commits or releases.

Use with AI assistants (MCP)

The output is clean, LLM-ready JSON. You can call this actor from Claude, ChatGPT, or any agent framework through the Apify MCP server — ask your assistant to "find the top AI newsletters on Substack and list their contact emails" and let it run this scraper for you.

Tips

  • Exhaust a niche: combine searchQueries (keywords) with categories (leaderboards) to cover both long-tail and top newsletters; raise maxPublications.
  • Best leads: set onlyPaidPublications: true + minFreeSubscribers + enrichContactEmails: true to get monetizing creators with real contact data and a high leadScore.
  • Cost control: posts, recommendations, author profiles and email enrichment are all opt-in, so you only pay for what you turn on; email enrichment only runs for publications that actually have a website.
  • Monitoring: combine monitorMode with Schedules to track only new newsletters/posts.
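The "Best leads" tip can also be applied after a run, on the exported items. A minimal sketch using only fields shown in the Output section; shortlistLeads and both thresholds are hypothetical choices, not actor settings.

```javascript
// Keeps paid publications above the given thresholds that have at least one
// usable email, sorted by leadScore from highest to lowest.
function shortlistLeads(items, { minFreeSubscribers = 1000, minLeadScore = 70 } = {}) {
  return items
    .filter((item) => item.type === 'publication' && item.isPaid === true)
    .filter((item) => (item.freeSubscriberCount ?? 0) >= minFreeSubscribers)
    .filter((item) => (item.leadScore ?? 0) >= minLeadScore)
    .map((item) => ({
      name: item.name,
      url: item.url,
      leadScore: item.leadScore,
      // Prefer emails found on the creator's site, fall back to supportEmail.
      email: item.contactEmails?.[0] ?? item.supportEmail ?? null,
    }))
    .filter((lead) => lead.email !== null)
    .sort((a, b) => b.leadScore - a.leadScore);
}

// The first record is trimmed from the Output example; the second is a made-up
// free-only newsletter that the filter should drop.
const leads = shortlistLeads([
  {
    type: 'publication',
    name: 'Astral Codex Ten',
    url: 'https://www.astralcodexten.com',
    isPaid: true,
    freeSubscriberCount: 91000,
    leadScore: 86,
    supportEmail: 'astralcodexten@substack.com',
    contactEmails: ['scott@slatestarcodex.com'],
  },
  { type: 'publication', name: 'Example Free Letter', isPaid: false, leadScore: 40 },
]);
console.log(leads);
```

Null fields are treated as zero or as missing, which matches the note in the FAQ that null means the data does not exist.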

FAQ

How do I scrape the top newsletters in a topic? Put the category name in categories (e.g. Technology, Finance) to pull its leaderboard, and/or add keywords to searchQueries.

Where do the emails come from? Never from Substack (they don't publish creator emails). With enrichContactEmails on, the actor visits the newsletter's own public website and extracts publicly listed contact emails — the same thing a human visitor would see. Many newsletters also expose a supportEmail directly.

Does it expose exact paid-subscriber counts? Substack hides exact paid counts, but publishes a tier band (e.g. "Hundreds/Thousands of paid subscribers") which this actor returns as paidSubscriberTier, plus the exact freeSubscriberCount for most newsletters.
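Since paidSubscriberTier is a text band rather than a number, sorting by it needs a small mapping. The sketch below is a hedged example: only the "Hundreds" and "Thousands" bands appear in the text above, the two larger bands are assumptions, and tierRank is a hypothetical helper.

```javascript
// Ordered smallest to largest. Only "hundreds" and "thousands" are taken from
// the text above; the other two bands are assumptions.
const TIER_ORDER = ['hundreds', 'thousands', 'tens of thousands', 'hundreds of thousands'];

// Maps a tier string to 1..4, or 0 when the tier is missing or unrecognized.
function tierRank(paidSubscriberTier) {
  if (typeof paidSubscriberTier !== 'string') return 0;
  const band = paidSubscriberTier
    .toLowerCase()
    .replace(/ of paid subscribers$/, '')
    .trim();
  return TIER_ORDER.indexOf(band) + 1;
}

const byTier = [
  { name: 'A', paidSubscriberTier: 'Hundreds of paid subscribers' },
  { name: 'B', paidSubscriberTier: null },
  { name: 'C', paidSubscriberTier: 'Thousands of paid subscribers' },
].sort((a, b) => tierRank(b.paidSubscriberTier) - tierRank(a.paidSubscriberTier));

console.log(byTier.map((p) => p.name).join(', ')); // prints "C, A, B"
```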

Can I export to Google Sheets, CSV, or Excel? Yes — one click in the dataset view, or automatically on every run via the Google Drive integration.

Is scraping Substack legal? This actor collects publicly available data only. You are responsible for using the data in compliance with applicable laws (GDPR/CCPA for personal data) and Substack's terms.

A field is null — why? Some newsletters genuinely don't publish a price (free-only), a website, or a podcast. Fields are null only when the data doesn't exist, not because the scraper skipped them.

Need help?

Open an issue on the actor's Issues tab, or visit the Apify help center. Feature requests are welcome — this actor is actively maintained.