Pricing

from $2.50 / 1,000 saved Substack posts

Substack Posts Scraper for Newsletter Research

Scrape Substack posts, authors, publication names, dates, excerpts, URLs, and metadata for newsletter research, creator tracking, content monitoring, and AI agents.

Developer

Skootle

Fast answer: what this Actor is for

Track Substack newsletters and posts as structured research data: titles, authors, dates, excerpts, URLs, and publication metadata.

Run it from the Apify UI for one-off exports.
Schedule it or call it by API for recurring monitoring.
Use the dataset output directly in spreadsheets, automations, and AI agents.

TL;DR

Newsletter trend analysts, competitive-intelligence teams, and AI training-data curators bookmark 30 Substacks and forget to check them. This actor pulls every recent post from any list of Substack publications (subdomain or custom domain) into clean structured JSON: title, author, a typed isPaywalled boolean, an ISO 8601 publish timestamp, wordcount, reaction and comment counts, and an optional full body for LLM summarization. Watchlist mode emits only posts new since the last run, so a daily schedule feeds your dashboard, Notion sync, or agent pipeline with no duplicates and no HTML parsing on your side. Export the results, run the actor via API, schedule it, or integrate it with other tools.

Try it on a small dataset, then let us know what you think in a review.

What does Substack Posts Monitor do?

Give it a list of Substack publications and it returns every recent post as a structured JSON record. Each record includes the post title, subtitle, slug, canonical URL, publication name, author name and handle, publish timestamp in ISO 8601, a typed isPaywalled boolean, raw audience value (everyone, only_paid, or founding), wordcount, estimated reading time in minutes, cover image URL, reaction count, comment count, post tags, and an agentMarkdown field that drops straight into Claude, GPT, Slack, or a Notion card.

Set includeFullBody: true and the actor adds the full HTML body and a stripped plain-text version to each record for downstream summarization or full-text search. Leave it off and the actor returns the publicly available short description plus the truncated free preview that Substack shows non-subscribers.

The actor talks to Substack's public archive API directly, with no headless browser and no HTML scraping, so runs are fast (seconds per publication) and your downstream pipeline never has to deal with HTML drift.

Pass a bare subdomain (thedailyloop) for publications that live on substack.com, or a full URL (https://www.lennysnewsletter.com) for publications on a custom domain. Mix freely in one run.
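The two identifier styles can be pictured with a small sketch. This is not the actor's actual code: the helper name `publication_base_url` is hypothetical, and the mapping of a bare subdomain to `<subdomain>.substack.com` is an assumption based on the description above.

```python
from urllib.parse import urlparse


def publication_base_url(identifier: str) -> str:
    """Map a publication identifier to a base URL (illustrative only)."""
    identifier = identifier.strip()
    if identifier.startswith(("http://", "https://")):
        parsed = urlparse(identifier)
        # Custom domain: keep scheme and host, drop any path or query.
        return f"{parsed.scheme}://{parsed.netloc}"
    # Assumption: a bare subdomain lives at <subdomain>.substack.com.
    return f"https://{identifier}.substack.com"


for item in ["thedailyloop", "https://www.lennysnewsletter.com/"]:
    print(item, "->", publication_base_url(item))
```

Both styles can sit in the same `publications` array, which is what "mix freely" refers to.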

Why scrape Substack?

Substack now hosts the working notebooks of operators and writers across every niche that matters: product (Lenny's Newsletter), tech strategy (Stratechery), AI (Latent Space), VC (The Generalist), finance (Doomberg), and thousands of vertical newsletters. Newsletter trend analysts, competitive-intel teams tracking competitor cadence and paywall mix, and AI training-data curators all need a way to watch 20+ publications without opening 20 tabs.

Substack offers no feed product for non-subscribers and no one-call API to monitor many publications at once. Bookmark 30 Substacks, forget to check half of them, and miss the post that mattered. One scheduled run replaces the manual ritual and feeds a digest, Notion sync, or agent pipeline.

Who needs this?

Newsletter aggregators building daily or weekly digests from 20+ Substack publications
AI agent builders wiring LLM newsletter summarizers, podcast notes, or topic monitors
Content marketers tracking competitor newsletter cadence, paywall mix, and engagement signals
VC analysts scouting operator-writers in their thesis areas (founder content as a deal-flow signal)
Journalists covering the creator economy or any vertical Substack dominates
Market researchers in tech, finance, AI, or biotech building literature reviews from Substack-hosted commentary
PR teams monitoring brand mentions and influencer commentary across niche newsletters
Subscriber-acquisition tools building lookalike-publication recommendations

How to use Substack Posts Monitor

Open the actor on Apify Console.
Paste publication identifiers into the Publications field. Use bare subdomains (thedailyloop) for substack.com-hosted publications, full URLs (https://www.lennysnewsletter.com) for custom-domain publications.
Set Posts per publication (default 20).
Optional: set a Published after ISO date, change the Audience filter, toggle Include full body, or enable Watchlist mode.
Click Start and check the dataset when the run completes.
Schedule the actor or call it via the Apify API to run on a cadence.

How much will scraping Substack cost?

Pricing is pay-per-event, no monthly minimum, no platform-cost surprises.

Plan	Price per saved post	Posts on $5 free credit
FREE	$0.004	~1,250
BRONZE	$0.0035	n/a (paid plans have separate credit)
SILVER	$0.003	n/a
GOLD	$0.0025	n/a
PLATINUM	$0.002	n/a
DIAMOND	$0.002	n/a

Each run also incurs a single $0.005 actor-start charge. A typical daily-monitor run that saves 20 new posts across 5 publications costs roughly $0.005 + 20 × $0.004 = $0.085 on the FREE tier. Turning on includeFullBody adds one extra request per post but does not change the per-record price.
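To budget a schedule, the same arithmetic can be wrapped in a small helper. This is a sketch only: the function name `estimate_run_cost` is hypothetical, the prices are copied from the plan table above, and actual billing is determined by Apify, not by this code.

```python
from decimal import Decimal

# Per-saved-post prices in USD, copied from the plan table above.
PRICE_PER_POST = {
    "FREE": Decimal("0.004"),
    "BRONZE": Decimal("0.0035"),
    "SILVER": Decimal("0.003"),
    "GOLD": Decimal("0.0025"),
    "PLATINUM": Decimal("0.002"),
    "DIAMOND": Decimal("0.002"),
}
START_CHARGE = Decimal("0.005")  # actor-start charge, applied once per run


def estimate_run_cost(plan: str, saved_posts: int) -> Decimal:
    """Estimate the cost of one run: start charge plus per-post charges."""
    return START_CHARGE + PRICE_PER_POST[plan] * saved_posts


print(estimate_run_cost("FREE", 20))  # 0.085
```

At the FREE-tier price, 30 daily runs of that size come to 30 × $0.085 = $2.55.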

Is it legal to scrape Substack?

Substack's robots.txt does not block the archive endpoints this actor uses. The actor only fetches publicly viewable post metadata that any logged-out browser visitor can see. It does not bypass the paywall: for paywalled posts, only the public preview text is available. It honors a polite request rate. As with any data you collect from third-party sources, consult your legal counsel before commercial redistribution. Always respect publication terms of use and applicable copyright law.

Examples

Daily monitor of one publication:

{
  "publications": ["thedailyloop"],
  "postsPerPublication": 5,
  "watchlistMode": true
}

Backfill the last 6 months across multiple publications, free posts only:

{
  "publications": [
    "https://www.lennysnewsletter.com",
    "https://stratechery.com",
    "thedailyloop"
  ],
  "postsPerPublication": 50,
  "maxItems": 150,
  "publishedAfter": "2025-11-10",
  "audience": "free"
}

Pull full bodies for a small batch (for LLM summarization):

{
  "publications": ["thedailyloop"],
  "postsPerPublication": 10,
  "includeFullBody": true
}

Input parameters

Field	Type	Description
`publications`	array of string	Required. Substack subdomains or full publication URLs.
`postsPerPublication`	integer	Default 20. Max 1000 per publication.
`publishedAfter`	string	Optional. ISO date (YYYY-MM-DD). Stops walking the archive past this date.
`audience`	enum	`all` (default), `free`, or `paid_only`.
`includeFullBody`	boolean	Default false. Adds full HTML + stripped text body per post.
`watchlistMode`	boolean	Default false. Emits only posts new since the previous run.
`maxItems`	integer	Default 10. Hard cap across all publications. Raise for production runs.
`proxyConfiguration`	object	Optional. Apify proxy config for high-volume runs.
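Before scheduling a run, it can help to check an input object against the table above. The sketch below is a client-side convenience, not the actor's own validation: the function names are hypothetical, and the lower bound of 1 for `postsPerPublication` is an assumption (only the maximum is documented).

```python
from datetime import date

ALLOWED_AUDIENCES = {"all", "free", "paid_only"}
MAX_POSTS_PER_PUBLICATION = 1000


def validate_input(run_input: dict) -> list:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    pubs = run_input.get("publications")
    if not (isinstance(pubs, list) and pubs
            and all(isinstance(p, str) and p.strip() for p in pubs)):
        problems.append("publications: required, non-empty array of strings")
    per_pub = run_input.get("postsPerPublication", 20)
    # Assumption: the lower bound of 1 is a guess; only the max is documented.
    if (isinstance(per_pub, bool) or not isinstance(per_pub, int)
            or not 1 <= per_pub <= MAX_POSTS_PER_PUBLICATION):
        problems.append("postsPerPublication: integer between 1 and 1000")
    published_after = run_input.get("publishedAfter")
    if published_after is not None:
        try:
            date.fromisoformat(published_after)
        except (TypeError, ValueError):
            problems.append("publishedAfter: ISO date in YYYY-MM-DD form")
    if run_input.get("audience", "all") not in ALLOWED_AUDIENCES:
        problems.append("audience: one of all, free, paid_only")
    return problems


def max_records(run_input: dict) -> int:
    """Upper bound on saved records given the per-publication and global caps."""
    requested = len(run_input["publications"]) * run_input.get("postsPerPublication", 20)
    return min(requested, run_input.get("maxItems", 10))
```

Note how `max_records` makes the default `maxItems` of 10 visible: three publications at 50 posts each still yield at most 10 records unless `maxItems` is raised.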

Substack output format

`substack_post`

Field	Type	Description
`recordType`	string	Always `substack_post`.
`outputSchemaVersion`	string	`2026-05-10`. Bumped on breaking schema changes.
`postId`	string	Substack's stable numeric post ID.
`recordId`	string	`substack:post:<postId>`, idempotent across runs.
`publication`	string	Publication subdomain.
`publicationName`	string	Human-readable publication name.
`url`	string	Canonical post URL.
`title`	string
`subtitle`	string|null
`slug`	string
`type`	string	`newsletter`, `podcast`, `thread`, etc.
`author`	object	`{ name, handle, photoUrl }`.
`publishedAt`	string	ISO 8601.
`isPaywalled`	boolean	True if `audience` is `only_paid` or `founding`.
`audience`	string	Raw Substack audience value.
`description`	string|null	Short blurb.
`excerpt`	string|null	First ~500 chars of body.
`fullBodyHtml`	string|null	Only when `includeFullBody=true`.
`fullBodyText`	string|null	HTML-stripped plain text.
`wordcount`	integer|null
`estimatedReadMinutes`	integer|null	wordcount / 200.
`coverImageUrl`	string|null
`reactionsCount`	integer
`commentsCount`	integer
`tags`	array of string
`agentMarkdown`	string	Pre-formatted markdown card for LLM context.
`fieldCompletenessScore`	number	0.0 to 1.0. Filter on this for high-quality records.
`scrapedAt`	string	ISO 8601 of this run.
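Three of the fields in the table are derived from others, and the rules can be sketched in a few lines. The function names are hypothetical, and the rounding used for `estimatedReadMinutes` (round up, minimum one minute) is an assumption; the table only states wordcount / 200.

```python
import math

# Audience values that count as paywalled, per the isPaywalled row above.
PAYWALLED_AUDIENCES = {"only_paid", "founding"}


def is_paywalled(audience: str) -> bool:
    """True when the raw audience value marks the post as subscriber-only."""
    return audience in PAYWALLED_AUDIENCES


def estimated_read_minutes(wordcount):
    """wordcount / 200, or None when the word count is missing."""
    if wordcount is None:
        return None
    # Assumption: round up and never report less than one minute.
    return max(1, math.ceil(wordcount / 200))


def record_id(post_id) -> str:
    """Stable key of the form substack:post:<postId>."""
    return f"substack:post:{post_id}"


print(is_paywalled("founding"), estimated_read_minutes(3000), record_id(12345))
```

Filtering on the typed boolean avoids comparing against the raw audience strings in every downstream query.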

Sample agentMarkdown:

📰 How to build an AI monetization strategy that actually works
✍️ Vikas Kansal · lennysnewsletter
📅 2026-05-05 · 15 min read · 💰 PAID
👍 283 reactions · 💬 6 comments
🔗 https://www.lennysnewsletter.com/p/why-saas-freemium-playbooks-dont
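If you need a card in the same shape for records from another source, the sample above can be approximated from the structured fields. This is a guess at the layout based only on that one sample; the function name `render_card` is hypothetical, and the label used for free posts is an assumption because the sample shows only a paid post.

```python
def render_card(record: dict) -> str:
    """Build a five-line markdown card resembling the sample agentMarkdown."""
    # Assumption: free posts get a FREE label; the sample only shows PAID.
    paywall = "💰 PAID" if record["isPaywalled"] else "🆓 FREE"
    lines = [
        f"📰 {record['title']}",
        f"✍️ {record['author']['name']} · {record['publication']}",
        f"📅 {record['publishedAt'][:10]} · "
        f"{record['estimatedReadMinutes']} min read · {paywall}",
        f"👍 {record['reactionsCount']} reactions · "
        f"💬 {record['commentsCount']} comments",
        f"🔗 {record['url']}",
    ]
    return "\n".join(lines)
```

In most pipelines the `agentMarkdown` field can be used as delivered; a helper like this is only useful for restyling the card.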

During the actor run

No authentication needed. A 5-post run on one publication completes in under 10 seconds; a 100-post run across 5 publications usually finishes in under 60 seconds. The actor honors a polite request cadence so publications stay reachable.

A run summary lands at the OUTPUT key, and a top-5 most-engaged digest at AGENT_BRIEFING.md, ready to drop into a Slack channel or daily LLM context window.
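A similar top-5 ranking can be rebuilt from the dataset if you want a different cutoff. The sketch below assumes "most engaged" means reactions plus comments; the actor's actual ranking formula is not documented here, and the function name `top_engaged` is hypothetical.

```python
def top_engaged(records: list, limit: int = 5) -> list:
    """Return the records with the highest combined engagement."""
    # Assumption: engagement = reactionsCount + commentsCount.
    return sorted(
        records,
        key=lambda r: r["reactionsCount"] + r["commentsCount"],
        reverse=True,
    )[:limit]


posts = [
    {"title": "A", "reactionsCount": 10, "commentsCount": 2},
    {"title": "B", "reactionsCount": 283, "commentsCount": 6},
    {"title": "C", "reactionsCount": 40, "commentsCount": 0},
]
print([p["title"] for p in top_engaged(posts, limit=2)])  # ['B', 'C']
```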

FAQ

How is this different from the existing Substack scrapers on the Store?

The leading Substack actor today is easyapi/substack-posts-scraper. It is rated 1.86 stars at the time of writing and ships generic keyword-search positioning. We built this actor specifically to fix the things that drove its rating down: a typed isPaywalled boolean (so paywalled and free posts are easy to filter without parsing the audience string), ISO 8601 timestamps everywhere (no mixing date and datetime formats), watchlist diff mode (so daily schedules emit only new posts), agent-grade fields (agentMarkdown, fieldCompletenessScore, recordId), pay-per-event pricing without a flat monthly fee, and an explicit per-publication base-URL strategy that handles both substack.com subdomains and custom domains in the same run.

Can I bulk-scrape paywalled posts?

No. The actor only returns what Substack returns publicly. For paywalled posts that means metadata, the short description, and the truncated free preview body, never the full subscriber-only content. There is no bypassPaywall option and one will not be added.

What about authors who block scraping?

Substack's robots.txt is permissive on the archive endpoints this actor uses. We honor a polite request cadence and do not retry aggressively. If a specific publication has explicit terms of use that prohibit automated retrieval of their content, you should not scrape that publication.

Can I monitor only new posts?

Yes. Set watchlistMode: true. The actor stores the IDs of posts it has already emitted in the key-value store and skips them on subsequent runs. That state persists between runs of the actor, so duplicate prevention works automatically when you schedule it.
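The diff logic amounts to a set-membership check. The sketch below illustrates the idea on plain Python objects; it is not the actor's implementation, the function name `filter_new_posts` is hypothetical, and keying on `recordId` is an assumption (the actor is described as storing post IDs).

```python
def filter_new_posts(records: list, seen_ids: set):
    """Return (unseen records, updated seen-ID set) without mutating the input set."""
    updated = set(seen_ids)
    fresh = []
    for record in records:
        rid = record["recordId"]
        if rid not in updated:
            updated.add(rid)
            fresh.append(record)
    return fresh, updated


# Two simulated runs: the second only emits the post that was not seen before.
run_1 = [{"recordId": "substack:post:1"}, {"recordId": "substack:post:2"}]
run_2 = [{"recordId": "substack:post:2"}, {"recordId": "substack:post:3"}]

emitted_1, seen = filter_new_posts(run_1, set())
emitted_2, seen = filter_new_posts(run_2, seen)
print(len(emitted_1), len(emitted_2))  # 2 1
```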

Can I use this with Python, JavaScript, n8n, Make, or Zapier?

Yes. Apify has integrations for all of those plus a REST API. The dataset returned by this actor is plain JSON; pull it into any tool that consumes JSON.
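As one example, here is a minimal sketch of calling the actor from Python using only the standard library and Apify's synchronous run endpoint. The actor ID and token are placeholders you must supply (this page does not state the actor's ID), and the helper names are hypothetical. Apify also publishes an official Python client, which is usually more convenient.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare a POST that runs an actor and returns its dataset items."""
    # In API paths, "username/actor-name" is written as "username~actor-name".
    url = f"{API_BASE}/acts/{actor_id.replace('/', '~')}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id: str, token: str, run_input: dict, timeout: int = 300) -> list:
    """Send the request and parse the JSON array of records (needs network access)."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Placeholder values: replace with the real actor ID and your API token.
# posts = run_actor("<username>/<actor-name>", "<APIFY_TOKEN>",
#                   {"publications": ["thedailyloop"], "postsPerPublication": 5})
```

The synchronous endpoint suits short runs; for longer backfills, starting the run asynchronously and reading the dataset afterwards is the safer pattern.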

Why does this cost more than free Substack scrapers?

Free actors break when source HTML or APIs change, and there is no notification when they go silent. This actor uses Substack's public archive API, ships a versioned schema with recordId for idempotent upserts, and has watchlist diff mode built in so your scheduled runs do not re-emit posts you already have. If you are feeding this into a customer-facing product or a daily AI agent, the pennies per record buy you reliability the free actors cannot deliver.
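An idempotent upsert keyed on recordId might look like the following sketch, which uses SQLite from the Python standard library. The table layout and the function name `upsert_posts` are illustrative assumptions, and the `ON CONFLICT` clause requires SQLite 3.24 or newer.

```python
import json
import sqlite3


def upsert_posts(conn: sqlite3.Connection, records: list) -> None:
    """Insert records, overwriting any row that has the same recordId."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS posts (
               record_id TEXT PRIMARY KEY,
               title TEXT,
               published_at TEXT,
               payload TEXT
           )"""
    )
    conn.executemany(
        """INSERT INTO posts (record_id, title, published_at, payload)
           VALUES (?, ?, ?, ?)
           ON CONFLICT(record_id) DO UPDATE SET
               title = excluded.title,
               published_at = excluded.published_at,
               payload = excluded.payload""",
        [
            (r["recordId"], r["title"], r["publishedAt"], json.dumps(r))
            for r in records
        ],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
post = {"recordId": "substack:post:1", "title": "Draft", "publishedAt": "2026-05-05T12:00:00Z"}
upsert_posts(conn, [post])
upsert_posts(conn, [{**post, "title": "Final"}])  # same key, so the row is updated
print(conn.execute("SELECT COUNT(*), MAX(title) FROM posts").fetchone())  # (1, 'Final')
```

Because the key is stable across runs, re-ingesting an overlapping dataset updates existing rows instead of duplicating them.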

How fast is it?

A 5-post run on one publication typically completes in under 10 seconds. A 100-post run across 5 publications with includeFullBody=false typically completes in under 60 seconds.

What happens with custom-domain publications?

Pass the full URL (e.g., https://www.lennysnewsletter.com) instead of a bare subdomain. The actor calls the API on whatever host you provide. If you pass a bare subdomain that redirects to a custom domain, the error message tells you to switch to the full URL.

Why choose Substack Posts Monitor

Monitor mode emits only what's new since the last run: it tracks seen post IDs across runs, so your competitor-cadence dashboard ingests each post exactly once
Reliability free Substack scrapers can't deliver: the leading free actor sits at 1.86 stars because HTML scrapers break monthly with no notification. This actor uses Substack's archive API and has a 24-48 hour fix turnaround
Watchlist 30+ publications in one run: subdomains and custom domains mixed freely (thedailyloop or https://www.lennysnewsletter.com)
Filter paywall vs free without string-parsing: typed isPaywalled boolean and ISO 8601 timestamps everywhere
Sub-minute runtime: HTTP-only against Substack's archive API, no Playwright, no HTML parsing
Drop-in LLM context: agentMarkdown per record plus a per-run AGENT_BRIEFING.md digest of the top 5 most-engaged posts
Re-runs are safe to dedupe by ID: stable substack:post:<postId> keys
Schema doesn't break your pipeline: versioned and bumped on breaking change
AI agents can self-filter sparse rows via fieldCompletenessScore

Your feedback

Hit a bug or want a feature? Open an issue on the Issues tab rather than the reviews page, and we'll fix it fast (typically within 48 hours).

Other Skootle actors you might want to check

📰 Hacker News Watchlist: track new HN stories matching your keywords
🟠 Reddit Subreddit Monitor: daily diff of new subreddit posts
🍎 App Store Reviews: competitor app review monitoring

Support and contact

Report issues on the Issues tab. For other inquiries, contact us via the Apify Store author profile.
