# Substack Scraper — Newsletters, Posts, Authors & Subscribers (`logiover/substack-newsletter-scraper`) Actor

Discover Substack newsletters by category & leaderboard rank, then pull every post, author and publication. 30+ categories or direct subdomain. Per post: title, audience (free/paid), reactions, restacks. Per pub: subdomain, custom domain, author, subscription tiers. Public Substack API — no auth.

- **URL**: https://apify.com/logiover/substack-newsletter-scraper.md
- **Developed by:** [Logiover](https://apify.com/logiover) (community)
- **Categories:** Automation, Lead generation, Marketing
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $2.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Substack Scraper — Newsletter Discovery, Posts, Authors & Subscriber Data

Discover **Substack newsletters by category and leaderboard rank**, then pull **every post, every author and full publication metadata** in a single normalized dataset. Search across **30+ Substack categories** — Technology, Business, Finance, Crypto, News, Culture, Health, Politics, Science, Design and more — or jump straight to a known **subdomain or custom domain**.

Built on Substack's **public JSON API** — no authentication, no proxy, no scraping fight. Per post: title, subtitle, **audience (free / paid / founding)**, post date, reaction count, restacks, cover image, podcast info and canonical URL. Per publication: name, subdomain, **custom domain, author display name, hero description, language, subscription benefits and founded date**.

Perfect for **content intelligence**, **RAG / LLM training data**, **newsletter sponsorship outreach**, **competitive monitoring**, **author / creator lead generation**, and **podcast network discovery**.

---

### 🚀 What does this Substack scraper do?

Two complementary modes — combine them or use one in isolation:

| Mode | When to use | What it returns |
|------|-------------|-----------------|
| **Category Discovery** | Find every tech / finance / crypto newsletter on Substack ranked by leaderboard, top paid, or all-publications | Top-N publications per category + every post per publication |
| **Direct Newsletter URLs** | You already know the newsletter — pass its subdomain, custom domain, or just its name | All posts of that publication, with metadata |

Optional: also push one **publication-level record** per discovered newsletter (denormalized list of every newsletter alongside posts) for downstream relational joins.

---

### 💡 Use cases

- **Content intelligence platforms** — track every AI / startup / crypto newsletter on Substack with daily refresh
- **Newsletter sponsorship outreach** — pull author names, custom domains, free-vs-paid mix, subscription tier descriptions for cold outreach lists
- **RAG / LLM training data** — every public Substack post is high-quality long-form content with author + date + topic metadata, ready for vector indexing
- **Substack-to-Beehiiv migration tools** — bulk-export an author's archive
- **Competitive monitoring** — track when a competitor's newsletter publishes, what audience tier it targets, and how reader reactions trend
- **Investor / VC research** — every fintech / crypto / SaaS founder publishes here now; find them at scale
- **Podcast network discovery** — Substack hosts thousands of independent podcasts; the actor exposes `podcastFeedUrl` for each publication
- **Newsletter ranking dashboards** — leaderboard mode returns subscriber-weighted rankings per category

---

### ⚙️ Input configuration

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `categorySlugs` | `string[]` | `[]` | Substack category slugs (see list below). Each is paginated and enumerated to `maxPublicationsPerCategory`. |
| `categoryRanking` | `string` | `"leaderboard"` | `leaderboard` (top by subs + engagement), `paid` (top paid), `all` (default Substack sort). |
| `newsletterUrls` | `string[]` | `[]` | Specific newsletters to scrape. Accepts `subdomain`, `subdomain.substack.com`, or custom domain. |
| `maxPublicationsPerCategory` | `integer` | `25` | Hard cap per category. Substack returns 25 per page; the Actor auto-paginates up to this cap. |
| `maxPostsPerPublication` | `integer` | `50` | Hard cap per publication via the archive API. `0` = skip posts (publication-only mode). |
| `audienceFilter` | `string` | `"all"` | `all` / `free` (only posts open to everyone) / `paid` (only paid-subscriber posts). |
| `minPostDate` | `string` | `null` | Drop posts published before this `YYYY-MM-DD`. |
| `maxPostDate` | `string` | `null` | Drop posts published after this `YYYY-MM-DD`. |
| `keywordFilter` | `string[]` | `[]` | Client-side title/subtitle substring filter (case-insensitive). E.g. `["ai","gpt"]`. |
| `includePublicationMetadata` | `boolean` | `true` | Enrich each post with parent-publication fields. |
| `alsoPushPublicationRecord` | `boolean` | `false` | Push one extra record per publication (`recordType: "publication"`) alongside post records. |
| `language` | `string` | `""` | ISO 639-1 publication-language filter (e.g. `en`). |

#### Supported category slugs

`technology` · `business` · `finance` · `crypto` · `news` · `us-politics` · `world-politics` · `health-politics` · `culture` · `science` · `health` · `design` · `travel` · `parenting` · `literature` · `fiction` · `philosophy` · `history` · `climate` · `art` · `music` · `sports` · `food` · `film-and-tv` · `comics` · `humor` · `fashionandbeauty` · `education` · `faith` · `international` · `home-garden` · `podcast`

The full live category list with subcategories is fetched from Substack at runtime, so additions automatically flow through.
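
A minimal sketch of assembling and sanity-checking an input object before starting a run. The helper name `build_input` is hypothetical. The field names and defaults come from the table above, and the numeric bounds (1 to 1000 publications, 0 to 10000 posts) are taken from the input schema in the OpenAPI specification later in this document. The rule that at least one of `categorySlugs` or `newsletterUrls` must be set is an assumption, and the Actor itself may validate differently.

```python
from typing import Any, Dict, List, Optional

RANKINGS = {"leaderboard", "paid", "all"}
AUDIENCES = {"all", "free", "paid"}


def build_input(
    category_slugs: Optional[List[str]] = None,
    newsletter_urls: Optional[List[str]] = None,
    category_ranking: str = "leaderboard",
    max_publications_per_category: int = 25,
    max_posts_per_publication: int = 50,
    audience_filter: str = "all",
) -> Dict[str, Any]:
    """Hypothetical helper: build an Actor input dict with the documented field names."""
    category_slugs = category_slugs or []
    newsletter_urls = newsletter_urls or []
    # Assumption: a run with neither source has nothing to scrape.
    if not category_slugs and not newsletter_urls:
        raise ValueError("provide categorySlugs, newsletterUrls, or both")
    if category_ranking not in RANKINGS:
        raise ValueError(f"categoryRanking must be one of {sorted(RANKINGS)}")
    if audience_filter not in AUDIENCES:
        raise ValueError(f"audienceFilter must be one of {sorted(AUDIENCES)}")
    # Bounds as listed in the OpenAPI input schema.
    if not 1 <= max_publications_per_category <= 1000:
        raise ValueError("maxPublicationsPerCategory must be between 1 and 1000")
    if not 0 <= max_posts_per_publication <= 10000:
        raise ValueError("maxPostsPerPublication must be between 0 and 10000")
    return {
        "categorySlugs": category_slugs,
        "newsletterUrls": newsletter_urls,
        "categoryRanking": category_ranking,
        "maxPublicationsPerCategory": max_publications_per_category,
        "maxPostsPerPublication": max_posts_per_publication,
        "audienceFilter": audience_filter,
    }
```

The returned dict can be passed as the run input in the client examples in the API section.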

---

### 📦 Output fields

Records have `recordType: "post"` or `recordType: "publication"`.

#### Per-post fields

| Field | Description | Example |
|-------|-------------|---------|
| `recordType` | `"post"` | `"post"` |
| `postId` | Substack post ID | `195672025` |
| `postSlug` | URL slug | `"why-saas-freemium-playbooks-dont"` |
| `postTitle` | Post title | `"Why SaaS freemium playbooks don't work in AI..."` |
| `postSubtitle` | Subtitle | `"How to build an AI monetization strategy..."` |
| `postType` | `newsletter`, `podcast`, `thread`, etc. | `"newsletter"` |
| `audience` | `everyone` / `only_paid` / `founding` | `"only_paid"` |
| `postDate` | Publication timestamp (ISO) | `"2026-05-05T13:03:32.007Z"` |
| `canonicalUrl` | Full post URL | `"https://www.lennysnewsletter.com/p/..."` |
| `coverImage` | Hero image URL | `"https://substackcdn.com/.../cover.png"` |
| `reactions` | Total reactions | `313` |
| `restacks` | Number of restacks | `10` |
| `wordCount` | Word count (if returned) | `2500` |
| `podcastDuration` | Episode duration (seconds) | `1820` |
| `podcastUrl` | Podcast audio URL | `"https://.../episode.mp3"` |
| `videoUploadId` | Video upload ID (if any) | `"vid_abc"` |
| `sectionName` | Substack section | `"Premium"` |

#### Per-publication fields (added when `includePublicationMetadata: true`)

| Field | Description | Example |
|-------|-------------|---------|
| `publicationId` | Substack publication ID | `10845` |
| `publicationName` | Newsletter name | `"Lenny's Newsletter"` |
| `subdomain` | `*.substack.com` subdomain | `"lennysnewsletter"` |
| `customDomain` | Custom domain (if any) | `"www.lennysnewsletter.com"` |
| `publicationUrl` | Primary base URL | `"https://www.lennysnewsletter.com"` |
| `logoUrl` | Logo image | `"https://substackcdn.com/.../logo.png"` |
| `coverPhotoUrl` | Cover photo | `"https://..."` |
| `publicationDescription` | Hero/about text | `"The #1 product / growth newsletter..."` |
| `language` | ISO 639-1 | `"en"` |
| `authorId` | Author user ID | `22329494` |
| `authorName` | Display name / copyright owner | `"Lenny Rachitsky"` |
| `category` | Category slug used to discover | `"technology"` |
| `publicationCreatedAt` | Founded timestamp | `"2020-09-08T..."` |
| `freeSubscriptionBenefits` | Bullets shown to free subscribers | `["Weekly free post"]` |
| `paidSubscriptionBenefits` | Bullets for paid tier | `["Full archive", "AMAs", ...]` |
| `foundingSubscriptionBenefits` | Top-tier benefits | `["Direct access to Lenny"]` |
| `communityEnabled` | Comments / Notes enabled | `true` |
| `podcastFeedUrl` | Substack RSS feed (audio) | `"https://lennysnewsletter.substack.com/feed/podcast"` |
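
Because one dataset can mix both record types, downstream code usually starts by separating them. The sketch below is one possible way to do that and to sum the public engagement fields per publication. The function names are hypothetical, and it assumes `includePublicationMetadata` was enabled so that each post carries `publicationId`.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def split_records(items: Iterable[Record]) -> Tuple[List[Record], List[Record]]:
    """Separate post records from publication records using `recordType`."""
    posts: List[Record] = []
    publications: List[Record] = []
    for item in items:
        if item.get("recordType") == "publication":
            publications.append(item)
        elif item.get("recordType") == "post":
            posts.append(item)
    return posts, publications


def engagement_by_publication(posts: Iterable[Record]) -> Dict[Any, Dict[str, int]]:
    """Sum post count, reactions and restacks per `publicationId`."""
    totals: Dict[Any, Dict[str, int]] = defaultdict(
        lambda: {"posts": 0, "reactions": 0, "restacks": 0}
    )
    for post in posts:
        # The key is None when metadata enrichment was disabled.
        bucket = totals[post.get("publicationId")]
        bucket["posts"] += 1
        # Missing or null counts are treated as zero.
        bucket["reactions"] += int(post.get("reactions") or 0)
        bucket["restacks"] += int(post.get("restacks") or 0)
    return dict(totals)
```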

---

### 🧪 Example inputs

#### 1. Top 25 tech newsletters and their latest 50 posts

```json
{
  "categorySlugs": ["technology"],
  "categoryRanking": "leaderboard",
  "maxPublicationsPerCategory": 25,
  "maxPostsPerPublication": 50
}
````

#### 2. Free posts from top AI/crypto newsletters in the last month

```json
{
  "categorySlugs": ["technology", "crypto", "finance"],
  "categoryRanking": "leaderboard",
  "maxPublicationsPerCategory": 50,
  "maxPostsPerPublication": 20,
  "audienceFilter": "free",
  "minPostDate": "2026-04-15",
  "keywordFilter": ["ai", "gpt", "llm", "crypto"]
}
```

#### 3. One specific newsletter's full archive

```json
{
  "newsletterUrls": ["lennysnewsletter"],
  "maxPostsPerPublication": 1000
}
```

#### 4. Build a newsletter directory (publication records only)

```json
{
  "categorySlugs": ["technology", "business", "finance", "crypto", "news", "science"],
  "categoryRanking": "leaderboard",
  "maxPublicationsPerCategory": 100,
  "maxPostsPerPublication": 0,
  "alsoPushPublicationRecord": true
}
```

#### 5. Mix categories + direct URLs in one run

```json
{
  "categorySlugs": ["technology"],
  "categoryRanking": "paid",
  "maxPublicationsPerCategory": 30,
  "newsletterUrls": ["www.semianalysis.com", "stratechery.substack.com"],
  "maxPostsPerPublication": 30,
  "audienceFilter": "all"
}
```

#### 6. English-only top-paid tech newsletters with podcast feeds

```json
{
  "categorySlugs": ["technology"],
  "categoryRanking": "paid",
  "language": "en",
  "alsoPushPublicationRecord": true,
  "maxPublicationsPerCategory": 100,
  "maxPostsPerPublication": 5
}
```

***

### 🧠 How it works

1. **Categories** → `GET https://substack.com/api/v1/categories` returns every active category and subcategory with numeric IDs.
2. **Discovery** → `GET https://substack.com/api/v1/category/public/{id}/{leaderboard|paid|all}?page=N` paginates 25 publications per page.
3. **Posts** → `GET https://{subdomain}.substack.com/api/v1/archive?sort=new&offset=O&limit=12` (or custom-domain equivalent) paginates 12 posts per page in reverse-chronological order.
4. **Direct URLs** → if you pass a custom domain, the actor probes the `/api/v1/archive` endpoint on multiple host candidates to find a working base URL.
5. **Deduplication** → publications are keyed by `publicationId`; cross-category enumeration never double-fetches the same newsletter.

No authentication. No proxy. Substack publishes all of this data on its public web.
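
The pagination described in steps 2 and 3 can be sketched as plain URL planning, with no network calls. The function names are hypothetical, and starting `page` at 0 is an assumption, since the README does not say whether Substack's page index begins at 0 or 1. A real crawler would also stop early once a page returns fewer items than the page size.

```python
from typing import List

CATEGORY_PAGE_SIZE = 25  # publications per discovery page, per the README
ARCHIVE_PAGE_SIZE = 12   # posts per archive page, per the README


def category_page_urls(
    category_id: int, ranking: str, max_publications: int, first_page: int = 0
) -> List[str]:
    """List the discovery URLs needed to cover `max_publications` entries."""
    # Ceiling division: 30 publications need 2 pages of 25.
    pages = -(-max_publications // CATEGORY_PAGE_SIZE)
    return [
        f"https://substack.com/api/v1/category/public/{category_id}/{ranking}?page={page}"
        for page in range(first_page, first_page + pages)
    ]


def archive_page_urls(base_url: str, max_posts: int) -> List[str]:
    """List the archive URLs needed to cover `max_posts` posts, newest first."""
    return [
        f"{base_url}/api/v1/archive?sort=new&offset={offset}&limit={ARCHIVE_PAGE_SIZE}"
        for offset in range(0, max_posts, ARCHIVE_PAGE_SIZE)
    ]
```

With the default `maxPostsPerPublication` of 50, this plans five archive requests per publication (offsets 0, 12, 24, 36 and 48).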

***

### 🛑 Limits & notes

- **Word count, full post body, and subscriber counts** are not exposed in the archive endpoint. For the full post HTML/body, the per-post detail endpoint `https://{base}/api/v1/posts/{slug}` can be added — open an issue if you need it.
- **Subscriber counts are private** — Substack only shows them to publication owners. The actor returns proxies (reactions, restacks, leaderboard rank).
- **Paid post bodies** are paywalled — you only get the public preview unless the actor is run with a paid-subscriber cookie (out of scope here).
- **Rate limits** — Substack does not publish explicit limits but throttles aggressive callers. The actor uses exponential backoff and realistic browser headers; in practice, runs of 25k+ posts complete cleanly.
- **Non-Substack platforms** (Beehiiv, Ghost, ConvertKit, Stratechery's custom platform) will be skipped with a warning when passed in `newsletterUrls`.
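
For readers unfamiliar with the exponential backoff mentioned under rate limits, here is a generic sketch of how such a delay schedule is commonly computed. The base delay, cap and jitter values are illustrative assumptions; the Actor's actual retry parameters are not documented here.

```python
import random
from typing import List, Optional


def backoff_delays(
    retries: int,
    base: float = 1.0,
    cap: float = 60.0,
    jitter: float = 0.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return the wait (in seconds) before each retry: base * 2^attempt, capped."""
    rng = rng or random.Random()
    delays: List[float] = []
    for attempt in range(retries):
        delay = min(cap, base * (2 ** attempt))
        if jitter:
            # Optional random extra wait, up to `jitter` times the delay,
            # so parallel workers do not retry in lockstep.
            delay += rng.uniform(0, jitter * delay)
        delays.append(delay)
    return delays
```

With the defaults and no jitter, five retries wait 1, 2, 4, 8 and 16 seconds.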

***

### 💰 Pricing

Monetized via **pay-per-event** on Apify — pay per post or publication record saved. Substack's public API is free.
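
A rough upper bound on the cost of a category run can be worked out from the input caps. The sketch below assumes the listed starting price of $2.00 per 1,000 results applies uniformly to post and publication records; the function name is hypothetical and actual event prices may differ. Filters, deduplication and short archives all reduce the real record count.

```python
from typing import Tuple


def estimate_max_cost(
    num_categories: int,
    max_publications_per_category: int,
    max_posts_per_publication: int,
    push_publication_records: bool = False,
    price_per_1000: float = 2.00,  # assumption: listed "from" price
) -> Tuple[int, float]:
    """Return (maximum records, maximum cost in USD) for a category-only run."""
    publications = num_categories * max_publications_per_category
    records = publications * max_posts_per_publication
    if push_publication_records:
        records += publications  # one extra record per publication
    return records, records * price_per_1000 / 1000
```

For example input 1 below (one category, 25 publications, 50 posts each), this gives at most 1,250 post records, or about $2.50 at the listed starting rate.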

***

### ❓ FAQ

**Can I get subscriber counts?**
No — Substack treats subscriber counts as private. The leaderboard rank, reactions, and restack count are the public proxy metrics.

**Can I get the full post body?**
Not yet. The actor currently returns metadata and the canonical URL only. For free posts, the rendered HTML body is available from the per-post detail endpoint, which can be added on request; paid post bodies stay behind the paywall.

**Does this work with Beehiiv / Ghost?**
No — Substack-only. Beehiiv has its own public API (separate actor).

**How is this different from existing Substack actors?**
Most existing actors require a list of URLs upfront. This one does **discovery first** (by category + leaderboard rank), which is the hard part for outreach / intelligence use cases.

**Can I export to CSV / Excel?**
Yes — every Apify dataset can be exported in CSV, Excel, XML, JSONL or RSS straight from the run page.

**Will Substack block this?**
The endpoints used are the same ones the Substack website itself calls. No authentication is required and no rate limit has been hit at typical use. Use respectfully.

***

### 🔗 Related actors

- `logiover/apple-podcasts-episode-scraper` — feed the `podcastFeedUrl` from each publication into the podcast scraper for full episode lists
- `logiover/website-contact-scraper` — enrich each publication's `customDomain` with author contact emails for sponsorship outreach
- `logiover/google-news-scraper` — track press mentions of the newsletters you scraped
- `logiover/sitemap-to-url-crawler` — crawl the custom domain of each publication for landing pages and partnerships

***

### 🆘 Support

Need a specific Substack-related feature (full post body, comments, subscriber recommendations graph)? Open an issue on the actor's Apify page.

# Actor input Schema

## `categorySlugs` (type: `array`):

Substack categories to enumerate. Slugs: 'technology', 'business', 'finance', 'crypto', 'news', 'us-politics', 'world-politics', 'culture', 'science', 'health', 'design', 'travel', 'parenting', 'literature', 'fiction', 'philosophy', 'history', 'climate', 'art', 'music', 'sports', 'food', 'film-and-tv', 'comics', 'humor', 'fashionandbeauty', 'education', 'faith', 'international', 'home-garden'. Leave empty if you supply newsletter URLs directly.

## `categoryRanking` (type: `string`):

How to sort publications within each category. 'leaderboard' = top newsletters (most subscribers + engagement), 'paid' = top paid newsletters, 'all' = all publications sorted by Substack's default ranking.

## `newsletterUrls` (type: `array`):

Specific Substack publications to scrape, regardless of category. Accepts subdomains ('lennysnewsletter.substack.com'), bare names ('lennysnewsletter'), or custom domains ('www.lennysnewsletter.com').
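
To illustrate the three accepted forms, here is a sketch of how an entry might be mapped to base URLs worth probing. The helper `host_candidates` is hypothetical, and the choice to try both the `www` and non-`www` variants of a custom domain is an assumption; the Actor's own candidate list is not documented.

```python
from typing import List
from urllib.parse import urlparse


def host_candidates(entry: str) -> List[str]:
    """Hypothetical helper: map a `newsletterUrls` entry to candidate base URLs."""
    raw = entry.strip()
    if "://" not in raw:
        raw = "https://" + raw
    host = (urlparse(raw).hostname or "").lower()
    if not host:
        return []
    if "." not in host:
        # Bare name, e.g. "lennysnewsletter"
        return [f"https://{host}.substack.com"]
    if host.endswith(".substack.com"):
        # Already a Substack subdomain
        return [f"https://{host}"]
    # Custom domain: try it as given, then the www / non-www variant (assumption).
    alternate = host[4:] if host.startswith("www.") else f"www.{host}"
    return [f"https://{host}", f"https://{alternate}"]
```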

## `maxPublicationsPerCategory` (type: `integer`):

Hard cap on publications enumerated per category. Substack returns 25 per page, the actor paginates automatically up to this cap.

## `maxPostsPerPublication` (type: `integer`):

Hard cap on posts pulled per publication via the archive API (paginates 12 per page). Set to 0 to skip post fetching entirely (publication-only mode).

## `audienceFilter` (type: `string`):

Filter posts by audience. 'all' = both free and paid, 'free' = free-only posts (the type any subscriber can read), 'paid' = paid-only posts (premium).

## `minPostDate` (type: `string`):

Filter posts to those published on or after this date (YYYY-MM-DD). Leave empty for no lower bound.

## `maxPostDate` (type: `string`):

Filter posts to those published on or before this date (YYYY-MM-DD). Leave empty for no upper bound.

## `keywordFilter` (type: `array`):

Only save posts whose title or subtitle contains any of these terms (case-insensitive substring match, applied client-side after fetch). Examples: `['ai', 'gpt']`, `['layoffs']`, `['startup']`. Leave empty for no filter.
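
The keyword and date filters can be reproduced on an exported dataset. The sketch below follows the documented semantics (case-insensitive substring match on title or subtitle, inclusive date bounds). The function names are hypothetical, and comparing on the first ten characters of `postDate` (the UTC calendar date) is an assumption about how the Actor handles time zones.

```python
from datetime import date
from typing import Any, Dict, List, Optional


def matches_keywords(post: Dict[str, Any], keywords: List[str]) -> bool:
    """True if any keyword is a substring of the title or subtitle."""
    if not keywords:
        return True  # empty filter keeps everything
    title = (post.get("postTitle") or "").lower()
    subtitle = (post.get("postSubtitle") or "").lower()
    return any(k.lower() in title or k.lower() in subtitle for k in keywords)


def within_dates(
    post: Dict[str, Any], min_date: Optional[str], max_date: Optional[str]
) -> bool:
    """True if the post date falls inside the inclusive YYYY-MM-DD bounds."""
    # Assumption: use the UTC calendar date from the ISO timestamp.
    published = date.fromisoformat(post["postDate"][:10])
    if min_date and published < date.fromisoformat(min_date):
        return False
    if max_date and published > date.fromisoformat(max_date):
        return False
    return True
```

Because the match is a plain substring test, a short term such as `ai` also matches words like "email", so longer or more specific terms give cleaner results.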

## `includePublicationMetadata` (type: `boolean`):

When enabled, every post record is enriched with parent-publication fields (subdomain, custom domain, author name, hero description, category, language, free / paid subscription benefits, founded date). Disable for smaller per-row records.

## `alsoPushPublicationRecord` (type: `boolean`):

When enabled, the actor also pushes one record per discovered publication (with `recordType: 'publication'`) in addition to per-post records. Useful when you want a denormalized list of newsletters in the same dataset.

## `language` (type: `string`):

Restrict to publications in a specific language (ISO 639-1). 'en' for English-only. Leave empty for all languages.

## Actor input object example

```json
{
  "categorySlugs": [
    "technology"
  ],
  "categoryRanking": "leaderboard",
  "newsletterUrls": [],
  "maxPublicationsPerCategory": 25,
  "maxPostsPerPublication": 50,
  "audienceFilter": "all",
  "minPostDate": null,
  "maxPostDate": null,
  "keywordFilter": [],
  "includePublicationMetadata": true,
  "alsoPushPublicationRecord": false,
  "language": ""
}
```

# Actor output Schema

## `recordType` (type: `string`):

post | publication

## `publicationName` (type: `string`):

Newsletter name

## `subdomain` (type: `string`):

Substack subdomain

## `customDomain` (type: `string`):

Custom domain

## `authorName` (type: `string`):

Author display name

## `postTitle` (type: `string`):

Post title

## `postSubtitle` (type: `string`):

Post subtitle

## `audience` (type: `string`):

`everyone` / `only_paid` / `founding`

## `postDate` (type: `string`):

Publication date

## `reactions` (type: `string`):

Reaction count

## `restacks` (type: `string`):

Restack count

## `canonicalUrl` (type: `string`):

Post canonical URL

## `category` (type: `string`):

Publication category

## `scrapedAt` (type: `string`):

Scrape timestamp
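
The output schema above lists `reactions` and `restacks` as strings, while the README examples show them as numbers (`313`, `10`). Until that is confirmed either way, consumers may want to coerce defensively. This is a sketch with hypothetical helper names, not part of the Actor.

```python
from typing import Any, Dict, Optional


def to_int(value: Any) -> Optional[int]:
    """Coerce a count that may arrive as int, float or numeric string; else None."""
    if value is None or isinstance(value, bool):
        return None
    if isinstance(value, int):
        return value
    if isinstance(value, float):
        return int(value) if value.is_integer() else None
    if isinstance(value, str):
        try:
            return int(value.strip().replace(",", ""))
        except ValueError:
            return None
    return None


def normalize_counts(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the record with `reactions` and `restacks` as int or None."""
    out = dict(record)
    for field in ("reactions", "restacks"):
        if field in out:
            out[field] = to_int(out[field])
    return out
```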

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "categorySlugs": [
        "technology"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("logiover/substack-newsletter-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "categorySlugs": ["technology"] }

# Run the Actor and wait for it to finish
run = client.actor("logiover/substack-newsletter-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "categorySlugs": [
    "technology"
  ]
}' |
apify call logiover/substack-newsletter-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=logiover/substack-newsletter-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Substack Scraper — Newsletters, Posts, Authors & Subscribers",
        "description": "Discover Substack newsletters by category & leaderboard rank, then pull every post, author and publication. 30+ categories or direct subdomain. Per post: title, audience (free/paid), reactions, restacks. Per pub: subdomain, custom domain, author, subscription tiers. Public Substack API — no auth.",
        "version": "1.0",
        "x-build-id": "5FycbOg5Y4R6yJ9RG"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/logiover~substack-newsletter-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-logiover-substack-newsletter-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/logiover~substack-newsletter-scraper/runs": {
            "post": {
                "operationId": "runs-sync-logiover-substack-newsletter-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/logiover~substack-newsletter-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-logiover-substack-newsletter-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "categorySlugs": {
                        "title": "Category Slugs to Discover",
                        "type": "array",
                        "description": "Substack categories to enumerate. Slugs: 'technology', 'business', 'finance', 'crypto', 'news', 'us-politics', 'world-politics', 'culture', 'science', 'health', 'design', 'travel', 'parenting', 'literature', 'fiction', 'philosophy', 'history', 'climate', 'art', 'music', 'sports', 'food', 'film-and-tv', 'comics', 'humor', 'fashionandbeauty', 'education', 'faith', 'international', 'home-garden'. Leave empty if you supply newsletter URLs directly.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "categoryRanking": {
                        "title": "Category Ranking Mode",
                        "enum": [
                            "leaderboard",
                            "paid",
                            "all"
                        ],
                        "type": "string",
                        "description": "How to sort publications within each category. 'leaderboard' = top newsletters (most subscribers + engagement), 'paid' = top paid newsletters, 'all' = all publications sorted by Substack's default ranking.",
                        "default": "leaderboard"
                    },
                    "newsletterUrls": {
                        "title": "Direct Newsletter URLs",
                        "type": "array",
                        "description": "Specific Substack publications to scrape, regardless of category. Accepts subdomains ('lennysnewsletter.substack.com'), bare names ('lennysnewsletter'), or custom domains ('www.lennysnewsletter.com').",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxPublicationsPerCategory": {
                        "title": "Max Publications Per Category",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Hard cap on publications enumerated per category. Substack returns 25 per page, the actor paginates automatically up to this cap.",
                        "default": 25
                    },
                    "maxPostsPerPublication": {
                        "title": "Max Posts Per Publication",
                        "minimum": 0,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Hard cap on posts pulled per publication via the archive API (paginates 12 per page). Set to 0 to skip post fetching entirely (publication-only mode).",
                        "default": 50
                    },
                    "audienceFilter": {
                        "title": "Audience Filter",
                        "enum": [
                            "all",
                            "free",
                            "paid"
                        ],
                        "type": "string",
                        "description": "Filter posts by audience. 'all' = both free and paid, 'free' = free-only posts (the type any subscriber can read), 'paid' = paid-only posts (premium).",
                        "default": "all"
                    },
                    "minPostDate": {
                        "title": "Minimum Post Date (ISO)",
                        "type": "string",
                        "description": "Filter posts to those published on or after this date (YYYY-MM-DD). Leave empty for no lower bound.",
                        "default": null
                    },
                    "maxPostDate": {
                        "title": "Maximum Post Date (ISO)",
                        "type": "string",
                        "description": "Filter posts to those published on or before this date (YYYY-MM-DD). Leave empty for no upper bound.",
                        "default": null
                    },
                    "keywordFilter": {
                        "title": "Keyword Filter (post title / subtitle)",
                        "type": "array",
                        "description": "Only save posts whose title or subtitle contains any of these terms (case-insensitive substring match, applied client-side after fetch). Examples: ['ai', 'gpt'], ['layoffs'], ['startup']. Leave empty for no filter.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "includePublicationMetadata": {
                        "title": "Include Publication Metadata in Each Record",
                        "type": "boolean",
                        "description": "When enabled, every post record is enriched with parent-publication fields (subdomain, custom domain, author name, hero description, category, language, free / paid subscription benefits, founded date). Disable for smaller per-row records.",
                        "default": true
                    },
                    "alsoPushPublicationRecord": {
                        "title": "Also Push Publication-Level Records",
                        "type": "boolean",
                        "description": "When enabled, the Actor also pushes one record per discovered publication (with `recordType: 'publication'`) in addition to per-post records. Useful when you want a denormalized list of newsletters in the same dataset.",
                        "default": false
                    },
                    "language": {
                        "title": "Publication Language Filter",
                        "type": "string",
                        "description": "Restrict to publications in a specific language (ISO 639-1). 'en' for English-only. Leave empty for all languages.",
                        "default": ""
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
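
The filter properties in the input schema above (`audienceFilter`, `minPostDate`, `maxPostDate`, `keywordFilter`, `language`) can be checked locally before a run is started. The following is a minimal sketch, not the Actor's own code: the helper names `parse_day`, `build_filter_input` and `matches_keywords` are hypothetical, and `matches_keywords` only mirrors the case-insensitive substring match that the `keywordFilter` description states.

```python
from datetime import datetime

# Values taken from the "enum" of audienceFilter in the schema above.
ALLOWED_AUDIENCES = ("all", "free", "paid")


def parse_day(value):
    """Parse a YYYY-MM-DD string into a date; return None for an empty value."""
    if not value:
        return None
    return datetime.strptime(value, "%Y-%m-%d").date()


def build_filter_input(audience_filter="all", min_post_date=None,
                       max_post_date=None, keyword_filter=None, language=""):
    """Hypothetical helper: validate filter values and return an input fragment."""
    if audience_filter not in ALLOWED_AUDIENCES:
        raise ValueError(f"audienceFilter must be one of {ALLOWED_AUDIENCES}")
    lower, upper = parse_day(min_post_date), parse_day(max_post_date)
    if lower and upper and lower > upper:
        raise ValueError("minPostDate must not be later than maxPostDate")
    return {
        "audienceFilter": audience_filter,
        "minPostDate": min_post_date,
        "maxPostDate": max_post_date,
        "keywordFilter": list(keyword_filter or []),
        "language": language,
    }


def matches_keywords(title, subtitle, keywords):
    """Assumed semantics: True if any term is a substring of title or subtitle."""
    if not keywords:
        return True  # an empty keyword list means no filtering
    fields = [(title or "").lower(), (subtitle or "").lower()]
    return any(term.lower() in field for term in keywords for field in fields)
```

Under these substring semantics a short term such as `ai` also matches a title like "Email tips", so longer or more specific terms are likely to give cleaner results.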

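The `runsResponseSchema` above nests the run details under a top-level `data` object. As a rough sketch of how a caller might pull out the fields needed to follow up on a run, the hypothetical helper `summarize_run` below reads only keys that appear in that schema; the sample payload is built from the schema's example values plus placeholder IDs, not from a real run.

```python
from datetime import datetime


def _parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as 2025-01-08T00:00:00.000Z, or return None."""
    if not value:
        return None
    # Replace the trailing "Z" so older Python versions can parse the offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_run(response):
    """Hypothetical helper: flatten the fields most callers need from a run response."""
    data = response.get("data") or {}
    started = _parse_timestamp(data.get("startedAt"))
    finished = _parse_timestamp(data.get("finishedAt"))
    duration = (finished - started).total_seconds() if started and finished else None
    return {
        "run_id": data.get("id"),
        "status": data.get("status"),
        "dataset_id": data.get("defaultDatasetId"),
        "cost_usd": data.get("usageTotalUsd"),
        "duration_secs": duration,
    }


# Illustrative payload only; the IDs are placeholders.
sample = {
    "data": {
        "id": "run-placeholder",
        "status": "READY",
        "startedAt": "2025-01-08T00:00:00.000Z",
        "finishedAt": "2025-01-08T00:00:05.000Z",
        "defaultDatasetId": "dataset-placeholder",
        "usageTotalUsd": 0.00005,
    }
}
summary = summarize_run(sample)
```

The `dataset_id` value is the one to pass on when fetching the scraped post records, since results are stored in the run's default dataset.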