Pricing

from $3.00 / 1,000 results

Substack Scraper

Scrape Substack publications via the public RSS feed of any newsletter. Extract post title, URL, author, publication date, body HTML, categories, and enclosures. HTTP-only with TLS impersonation (no auth, no proxy).

Pricing

from $3.00 / 1,000 results

Rating

0.0

(0)

Developer

Crawler Bros

Actor stats

Bookmarked

Total users

Monthly active users

2 months ago

Last modified

What this actor does

Accepts publication URLs in any form: full URL, custom domain, *.substack.com, or bare slug
Auto-rewrites to <publication>/feed
Parses RSS feed → extracts title / link / pubDate / dc:creator / content:encoded / categories / enclosure
Filters: category, published-after, keyword in title/summary
Optional body HTML inclusion (default on)
Approximate wordCount and readingTimeMinutes
Empty fields are omitted

Output per post

title, url, guid
author — from <dc:creator>
publishedAt — ISO 8601 UTC (parsed from RFC 822 pubDate)
publishedAtRaw — original RFC 822 string
summary — plain-text version of <description> (capped at 500 chars)
bodyHtml — full HTML body from <content:encoded> (when includeBody=true)
wordCount, readingTimeMinutes
categories[]
coverImage — from <enclosure> URL
publication, publicationUrl
recordType: "post", scrapedAt

Input

Field	Type	Default	Description
`publications`	array	`["platformer.news"]`	List of publication URLs / domains / slugs (required)
`categoryAnyOf`	array	`[]`	Match at least one RSS `<category>` tag
`publishedAfter`	string	–	YYYY-MM-DD
`containsKeyword`	string	–	Title/summary contains substring
`includeBody`	bool	`true`	Include full body HTML
`maxItems`	int	`50`	Hard cap (1–1000)

Example: scrape Platformer + Noahpinion

{
  "publications": ["platformer.news", "noahpinion.substack.com"],
  "publishedAfter": "2024-01-01",
  "maxItems": 100
}

Example: filter by keyword

{
  "publications": ["platformer.news"],
  "containsKeyword": "antitrust",
  "includeBody": true
}

Example: bare slugs (auto-resolved to .substack.com)

{
  "publications": ["noahpinion", "thedailyupside"]
}

Use cases

Newsletter intel — track competitor publications, harvest content
Market research — newsletters in your domain (analyst notes, sector reports)
RSS aggregation — consolidate multiple Substacks into a single feed
Content analysis — bulk-export newsletter posts for NLP / topic modeling
Backup — archive your own / a friend's Substack posts

FAQ

Do I need a Substack account? No. The actor only reads public RSS feeds.

Why does it use TLS impersonation? Substack's edge sometimes 403s requests with default Python TLS fingerprint. curl_cffi with chrome131 profile sends a real Chrome handshake, which Substack accepts.

What's the post URL format? https://<publication>/p/<slug>. The actor preserves whatever the RSS feed returns.

Are paid-only posts included? Substack's public RSS includes free posts and the public previews of paid posts. Full paid post content is not accessible without a subscription.

How fresh is the data? Real-time. RSS feeds update within minutes of post publish.

Can I scrape multiple publications in one run? Yes — pass multiple entries in publications. The actor walks each feed sequentially and dedupes by URL.

What if a publication's RSS is blocked / rate-limited? The actor retries with exponential backoff on 403/429/5xx. After 3 retries it skips to the next publication and logs a warning.

Custom-domain Substacks? Yes — pass the custom domain (e.g. platformer.news, stratechery.com). The actor appends /feed regardless of subdomain shape.

Substack Posts — Public Feed by Newsletter Slug

v0iddo/substack-newsletter-posts

Pull Substack newsletter posts via the public {slug}.substack.com/feed RSS endpoint. One row per post with title, link, author, pubDate, summary, category. No auth required.

vøiddo

Substack Newsletter Scraper

cloud9_ai/substack-scraper

Scrape posts from any Substack newsletter publication. Returns post titles, URLs, publish dates, authors, and content previews via RSS feed.

cloud9

Substack Newsletter Scraper

red.cars/substack-newsletter-scraper

Extract newsletter content, subscriber data, and author insights from any Substack publication - no API key required!

AutomateLab

1.0

Substack Scraper: Newsletter Posts, Archives & Subscribers

perconey/substack-scraper

Scrape any Substack publication: full post archive, single post detail with body, comment counts, reactions, paid/free audience, podcast metadata. No auth, no proxies, no cookies. Uses Substack official JSON API. Pay only per result.

Perconey

Substack Scraper

noximilian/substack-scraper

Scrape Substack newsletters — fetch post archives, individual posts, comments, recommendations, and publication metadata. Search Substack for publications and content. No auth required for public content.

Noximilian

Substack Posts Scraper - Newsletter Data

benthepythondev/substack-posts-scraper

Scrape public Substack newsletter posts from one or many publications. Extract titles, authors, dates, full content, images, categories and post URLs.

Ben

Substack Newsletter Scraper

prince.sh/substack-scraper

Scrape Substack newsletter archives. Get post titles, body text, authors, and publish dates for any Substack publication. Perfect for content aggregation, news monitoring, writer research, and AI training datasets.

Prince Jain

Substack Publication Scraper

parseforge/substack-publication-scraper

Pull every public post from any Substack publication with title, subtitle, body preview, author, publish date, podcast URL, audience type, comment count, and reactions. Filter by post type and date range. Export to JSON, CSV, or Excel for newsletter research and competitive intelligence.

ParseForge

Substack Newsletter Scraper

opalescent_quintet/substack-newsletter-scraper

Substack-Newsletter-Scraper Extract complete newsletter archives from any Substack publication with advanced filtering, multiple export formats, and engagement analytics. ## Features - Scrape entire newsletter archives from Substack - Extract full metadata: titles, content, author details etc.