Substack Post Content Fetcher

Pricing

from $3.50 / 1,000 posts fetched

Fetch the full HTML content of any public Substack post by URL. Each result includes the body text, title, subtitle, tags, engagement stats, and author details.

Rating: 0.0 (0)

Developer: Andrew (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 4 days ago

Fetch the full HTML body of any public Substack post by URL — complete article content, title, subtitle, tags, engagement stats, and author details.

What you get

  • Post identifiers: postId, postUrl
  • Headline fields: title, subtitle, description
  • Full body content: bodyHtml — complete HTML of the post body, ready for parsing, indexing, or rendering
  • Publishing data: publishedAt, updatedAt, type, language, isPaywalled (paywall status), wordCount
  • Engagement metrics: likesCount, commentsCount, restacksCount
  • Tags: tags, an array of tag names assigned to the post
  • Visuals: coverImageUrl
  • Author details: authorName and authorHandle
  • Run metadata: scrapedAt timestamp on every record

For free and public posts, bodyHtml is the full article. For paywalled posts, bodyHtml returns the public preview only.
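
Because paywalled records carry only a preview, downstream code may want to keep them apart from full articles. The sketch below assumes the isPaywalled field shown in the sample record under Output format; the function name and the two sample records are made up for illustration.

```python
from typing import Iterable


def split_by_paywall(records: Iterable[dict]) -> tuple[list[dict], list[dict]]:
    """Separate records with a full body from those with only a public preview."""
    full: list[dict] = []
    preview_only: list[dict] = []
    for record in records:
        # Assumption of this sketch: a missing isPaywalled value is treated
        # as "not paywalled".
        if record.get("isPaywalled", False):
            preview_only.append(record)
        else:
            full.append(record)
    return full, preview_only


# Illustrative records, not real actor output.
sample = [
    {"postId": 1, "isPaywalled": False, "bodyHtml": "<p>Full text</p>"},
    {"postId": 2, "isPaywalled": True, "bodyHtml": "<p>Preview</p>"},
]
full, preview_only = split_by_paywall(sample)
```

Keeping the preview-only records in a separate list, rather than dropping them, leaves their titles, tags, and engagement counts available for analysis.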

Use cases

  • Newsletter content analysis — read and analyze full articles at scale
  • Text mining and NLP research — extract entities, topics, and sentiment from real-world long-form writing
  • Training dataset construction — build clean corpora of newsletter content for LLM fine-tuning or evaluation
  • Content archiving — preserve full post bodies in your own storage
  • SEO and competitive research — see exactly what your competitors are publishing, word for word
  • Programmatic reading and summarization — feed full posts into a summarizer, translator, or RAG pipeline

How to use

  1. Paste one or more Substack post URLs into Post URLs (each URL must include /p/{slug}; works with *.substack.com subdomains and custom domains)
  2. Run the actor — results appear in the Dataset tab
  3. Export to JSON, CSV, or Google Sheets, or feed bodyHtml into your downstream pipeline

Pair this actor with the Substack Post Scraper to first build a list of post URLs from a publication's archive, then fetch the full content for each one.
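
The same steps can also be scripted. The following is a minimal sketch, not the actor's documented interface: the input key "postUrls" is a guess at the field behind the Post URLs input, the actor ID must be copied from the actor's page, and the client calls follow the pattern of the third-party apify-client package (1.x). The helper names are hypothetical.

```python
from urllib.parse import urlparse


def is_post_url(url: str) -> bool:
    """Return True if the URL path contains a /p/{slug} segment pair."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    parts = [p for p in parsed.path.split("/") if p]
    # A "p" segment must be followed by at least one more segment (the slug).
    return any(part == "p" for part in parts[:-1])


def build_run_input(urls: list[str]) -> dict:
    """Keep only URLs that look like posts. "postUrls" is an assumed key."""
    return {"postUrls": [u for u in urls if is_post_url(u)]}


def fetch_posts(token: str, actor_id: str, urls: list[str]) -> list[dict]:
    """Run the actor and return its dataset records (requires apify-client)."""
    # Imported here so the helpers above work without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=build_run_input(urls))
    if run is None:
        return []
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Checking URLs locally before the run avoids submitting archive or homepage links that lack a /p/{slug} path.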

Output format

Each dataset record:

{
  "postId": 182510930,
  "postUrl": "https://www.astralcodexten.com/p/three-model-organisms-for-taste",
  "title": "Three Model Organisms For Taste",
  "subtitle": "...",
  "description": "...",
  "bodyHtml": "<p>Full post HTML content...</p>",
  "publishedAt": "2026-05-08T08:50:41.734Z",
  "updatedAt": null,
  "type": "newsletter",
  "isPaywalled": false,
  "likesCount": 175,
  "commentsCount": 324,
  "restacksCount": 5,
  "wordCount": 1786,
  "coverImageUrl": "https://substackcdn.com/...",
  "language": null,
  "tags": ["science", "biology"],
  "authorName": "Scott Alexander",
  "authorHandle": "astralcodexten",
  "scrapedAt": "2026-05-09T07:08:37.000Z"
}
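
For text mining or summarization, bodyHtml usually needs to be reduced to plain text first. The sketch below does this with Python's standard html.parser module; the class and function names are hypothetical, and the sample record is illustrative rather than real output.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML fragment, ignoring all tags."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)


def body_text(record: dict) -> str:
    """Return the record's bodyHtml as whitespace-normalized plain text."""
    parser = _TextExtractor()
    parser.feed(record.get("bodyHtml") or "")
    parser.close()  # flush any buffered trailing text
    # Chunks are joined with spaces so adjacent paragraphs do not run
    # together; a word split by inline tags would gain a stray space.
    return " ".join(" ".join(parser.chunks).split())


# Illustrative record, not real actor output.
record = {"bodyHtml": "<p>Hello <b>world</b></p><p>Second &amp; last</p>"}
text = body_text(record)
```

A dedicated HTML-to-text library may handle embeds, footnotes, and block spacing better; this version only relies on the standard library.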