# Substack Scraper — Newsletters, Posts & Creator Leads (`scrapesage/substack-scraper`) Actor

Scrape Substack: search newsletters by keyword, browse category leaderboards, pull full publication profiles (subscribers, paid pricing, podcast), posts, authors and the recommendation network. Turn creators into leads with contact emails. Monitoring mode. No API key, no browser.

- **URL**: https://apify.com/scrapesage/substack-scraper.md
- **Developed by:** [Scrape Sage](https://apify.com/scrapesage) (community)
- **Categories:** Lead generation, Social media, News
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

From $4.00 / 1,000 publications scraped

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
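
As a rough illustration of the pay-per-event arithmetic, the sketch below estimates a lower bound from the listed "from $4.00 / 1,000" rate. The function name is hypothetical, and it assumes only publication events are counted; any other events (posts, enrichment, and so on) and tiered rates are not listed in this document and are ignored here.

```python
# Listed "from" price for scraped publications; other event prices are not known here.
PRICE_PER_1000_PUBLICATIONS_USD = 4.00


def estimate_publication_cost(publication_count, price_per_1000=PRICE_PER_1000_PUBLICATIONS_USD):
    """Return a lower-bound cost in USD for the publication events alone."""
    if publication_count < 0:
        raise ValueError("publication_count must not be negative")
    return publication_count * price_per_1000 / 1000


# Estimate for a run capped at 200 publications.
print(f"${estimate_publication_cost(200):.2f}")
```

At the listed rate, 200 publications come to about $0.80 before any other events.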

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) and as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Substack Scraper — Newsletters, Posts & Creator Leads (Subscribers, Pricing, Emails)

Extract **complete Substack data** — search newsletters by keyword, browse **category leaderboards**, and pull the fields other scrapers miss: **free-subscriber counts, paid-subscriber tiers, real paid pricing (monthly / yearly / founding), podcast details, the recommendation network, and full author profiles**. Optionally turn every creator into a **ready-to-contact lead** by crawling their own website for **contact emails, phone, and socials**.

No login, no cookies, no browser — fast first-party JSON extraction with 99%+ reliability.

### Why this Substack scraper?

Most Substack scrapers return a thin slice: a title, a date, maybe a subscriber number. This Actor reads Substack's own public API and ships the **richest dataset in the category**, across newsletters, posts and authors in one run:

| Data | Typical scrapers | This Actor |
|---|---|---|
| Search by keyword + category leaderboards | partial | ✅ both |
| Free subscriber count | partial | ✅ |
| Paid-subscriber tier (e.g. "Thousands of paid subscribers") | ❌ | ✅ |
| Real paid pricing — monthly / yearly / founding + currency | ❌ | ✅ |
| Accepts sponsorships (ad-sales signal) | ❌ | ✅ |
| Podcast title / description / flags | ❌ | ✅ |
| Recommendation network (who recommends whom) | ❌ | ✅ opt-in |
| Posts — reactions, restacks, comments, word count | partial | ✅ opt-in |
| Full post content (HTML + plain text) | ❌ | ✅ opt-in |
| Author profiles — followers, bio, external links, all publications | ❌ | ✅ opt-in |
| Creator **contact emails** (from their website) | ❌ | ✅ opt-in |
| Lead score (0–100) per newsletter | ❌ | ✅ |
| **No start fee** | ❌ many charge per run | ✅ pay per result only |

### Use cases

- **Creator & newsletter lead generation** — Substack creators are active buyers and sellers: they want tools, sponsors, cross-promotion, and ghostwriters. Score them by audience (`freeSubscriberCount`, `paidSubscriberTier`) and reach them directly (`supportEmail`, `contactEmails`).
- **Sponsorship & ad-sales prospecting** — find paid newsletters that `acceptsSponsorships`, ranked by subscriber tier and niche, with contact data attached.
- **Market & competitor research** — track category leaderboards, paid pricing, posting cadence, and engagement (reactions, restacks, comments) across any topic.
- **Content & trend analysis** — pull posts with full content for summarization, RAG, sentiment, and topic modeling.
- **Influencer / partnership discovery** — map the recommendation network to find who the top newsletters endorse.

### How to use

1. [Sign up for Apify](https://console.apify.com/sign-up). The free plan is enough to try this Actor.
2. Open the **Substack Scraper**, enter search queries and/or categories (or paste Substack URLs), and click **Start**.
3. Watch results stream into the dataset table.
4. **Export** as JSON, CSV, Excel, XML, or RSS — or pull results programmatically via the [Apify API](https://docs.apify.com/api/v2).

### Input

```json
{
    "searchQueries": ["artificial intelligence"],
    "categories": ["Technology", "Business"],
    "maxPublications": 200,
    "includePosts": true,
    "maxPostsPerPublication": 20,
    "includeRecommendations": true,
    "includeAuthorProfiles": true,
    "enrichContactEmails": true,
    "onlyPaidPublications": false,
    "minFreeSubscribers": 1000
}
````

- **searchQueries** — keywords to search publications (each returns full newsletter profiles).
- **categories** — category leaderboards by name (`Technology`, `Business`, `Finance`, `Culture`, `U.S. Politics`, `Food & Drink`, `Sports`, …) or numeric id.
- **startUrls** — direct publication URLs (`https://newsletter.substack.com` or a custom domain), post URLs (`.../p/the-slug`), or author profiles (`https://substack.com/@handle`).
- **maxPublications** *(default 100)* — cap on unique publications from search + categories.
- **includePosts** / **maxPostsPerPublication** / **includePostContent** — add recent posts, and optionally their full HTML + plain text.
- **includeRecommendations** *(default false)* — add each newsletter's recommendation network as a `recommends` array.
- **includeAuthorProfiles** *(default false)* — emit one author record per unique creator (followers, bio, links, all publications).
- **enrichContactEmails** *(default false)* — crawl the publication's own website (home + about/contact, max 3 pages) for emails, phone, and extra socials. Substack does not publish creators' contact emails (at most a `supportEmail` for some newsletters), so this is the only way to get them.
- **onlyPaidPublications** / **minFreeSubscribers** — filters.
- **monitorMode** *(default false)* — emit only publications/posts not seen in previous runs (see below).
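
A minimal sketch of assembling this input in Python before passing it to the client. `build_input` and `LIMITS` are hypothetical names; the integer bounds are copied from the input schema in this document, and the "at least one source" check is an assumption about what makes a run useful, not a documented requirement.

```python
from typing import Any

# Bounds taken from the Actor input schema (minimum, maximum).
LIMITS = {
    "maxPublications": (1, 5000),
    "maxPostsPerPublication": (1, 500),
    "maxConcurrency": (1, 20),
}


def build_input(**options: Any) -> dict[str, Any]:
    """Return an Actor input dict, rejecting out-of-range integers before the run starts."""
    for name, (low, high) in LIMITS.items():
        if name in options and not low <= options[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {options[name]}")
    # Assumption: a run needs at least one source of publications to do anything.
    if not any(options.get(key) for key in ("searchQueries", "categories", "startUrls")):
        raise ValueError("provide at least one of searchQueries, categories or startUrls")
    return dict(options)


run_input = build_input(
    searchQueries=["artificial intelligence"],
    categories=["Technology"],
    maxPublications=200,
    includePosts=True,
    maxPostsPerPublication=20,
)
```

The resulting `run_input` can be passed as `run_input=` to the Python client's `call` method shown in the API section.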

### Output

One record per newsletter (`type: "publication"`), plus optional post records (`type: "post"`) and author records (`type: "author"`):

```json
{
    "type": "publication",
    "id": 89120,
    "name": "Astral Codex Ten",
    "subdomain": "astralcodexten",
    "url": "https://www.astralcodexten.com",
    "customDomain": "www.astralcodexten.com",
    "publicationType": "newsletter",
    "tagline": "P(A|B) = [P(A)*P(B|A)]/P(B), all the rest is commentary",
    "authorName": "Scott Alexander",
    "authorHandle": "astralcodexten",
    "authorBio": "Psychiatrist, blogger…",
    "freeSubscriberCount": 91000,
    "paidSubscriberTier": "Thousands of paid subscribers",
    "bestsellerTier": 1000,
    "isPaid": true,
    "currency": "USD",
    "monthlyPrice": 10,
    "yearlyPrice": 100,
    "foundingPrice": 300,
    "acceptsSponsorships": false,
    "hasPodcast": true,
    "supportEmail": "astralcodexten@substack.com",
    "website": "https://www.astralcodexten.com",
    "contactEmails": ["scott@slatestarcodex.com"],
    "contactSocials": { "twitter": "https://twitter.com/slatestarcodex" },
    "recommends": [
        { "name": "Slow Boring", "subdomain": "slowboring", "url": "https://www.slowboring.com" }
    ],
    "leadScore": 86,
    "category": "Technology",
    "searchQuery": "artificial intelligence",
    "scrapedAt": "2026-06-14T12:00:00.000Z"
}
```
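
Because one dataset mixes publication, post and author records, a first processing step is usually to split them by `type`. The sketch below does that and ranks publications by `leadScore`; the helper names are hypothetical, and the second and third sample records are made up for illustration.

```python
from collections import defaultdict


def split_by_type(items):
    """Group dataset items by their `type` field ("publication", "post", "author")."""
    groups = defaultdict(list)
    for item in items:
        groups[item.get("type", "unknown")].append(item)
    return dict(groups)


def top_leads(publications, limit=10):
    """Sort publications by `leadScore`, highest first; a missing or null score counts as 0."""
    return sorted(publications, key=lambda p: p.get("leadScore") or 0, reverse=True)[:limit]


items = [
    {"type": "publication", "name": "Astral Codex Ten", "leadScore": 86},
    {"type": "publication", "name": "Example Letter", "leadScore": None},  # made-up record
    {"type": "post", "title": "Example post"},  # made-up record
]
groups = split_by_type(items)
best = top_leads(groups["publication"], limit=1)
print(best[0]["name"])
```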

### Monitoring mode

Turn on **monitorMode** to make the Actor remember every publication and post it has already returned (in a named key-value store) and emit **only new ones** on the next run. Combine it with [Apify Schedules](https://docs.apify.com/platform/schedules) to:

- watch a category or keyword for **newly launched newsletters**,
- alert on **new posts** from a set of newsletters you track,
- keep a CRM topped up with **fresh creator leads**.

Monitoring mode is independent of the scheduler: Schedules decide *when* a run starts; monitoring decides *what counts as new*. Use a distinct `monitorStoreName` per tracked target to keep histories separate.
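
The idea of "what counts as new" can be sketched as a seen-set check. This is an illustration of the concept only, not the Actor's implementation: the `type:id` key format is an assumption, and a plain Python `set` stands in for the named key-value store.

```python
def emit_new(records, seen_ids):
    """Return records whose type:id key is not yet in seen_ids, and mark them as seen."""
    fresh = []
    for record in records:
        key = f'{record["type"]}:{record["id"]}'  # assumed key format
        if key not in seen_ids:
            seen_ids.add(key)
            fresh.append(record)
    return fresh


seen = set()  # stands in for the store named by `monitorStoreName`
first_run = emit_new([{"type": "publication", "id": 89120}], seen)
second_run = emit_new(
    [{"type": "publication", "id": 89120}, {"type": "post", "id": 1}],
    seen,
)
print(len(first_run), len(second_run))
```

The second call returns only the post, because the publication was already recorded during the first call. Keeping a separate set per tracked target mirrors the advice to use a distinct `monitorStoreName`.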

### Automate & schedule

Run this Actor on autopilot and pull results into your own stack:

- **[Apify API](https://docs.apify.com/api/v2)** — start runs, fetch datasets, and manage schedules over REST.
- **[apify-client for JavaScript](https://docs.apify.com/api/client/js/)** and **[apify-client for Python](https://docs.apify.com/api/client/python/)** — official SDKs.
- **[Schedules](https://docs.apify.com/platform/schedules)** — run it hourly/daily/weekly to monitor new newsletters, posts, or leads.
- **[Webhooks](https://docs.apify.com/platform/integrations/webhooks)** — trigger downstream actions (CRM import, Slack alert, email sequence) the moment a run finishes.

```js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'MY_APIFY_TOKEN' });

const run = await client.actor('scrapesage/substack-scraper').call({
    searchQueries: ['fintech'],
    categories: ['Finance'],
    maxPublications: 200,
    enrichContactEmails: true,
    onlyPaidPublications: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Got ${items.length} newsletters & creator leads`);
```

### Integrate with any app

Connect the dataset to 5,000+ apps — no code required:

- **[Make](https://docs.apify.com/platform/integrations/make)** — multi-step automation scenarios.
- **[Zapier](https://docs.apify.com/platform/integrations/zapier)** — push new creator leads straight into your CRM.
- **[Slack](https://docs.apify.com/platform/integrations/slack)** — get notified when a monitored search finds new newsletters.
- **[Google Drive / Sheets](https://docs.apify.com/platform/integrations/drive)** — auto-export every run to a spreadsheet.
- **[Airbyte](https://docs.apify.com/platform/integrations/airbyte)** — pipe results into your data warehouse.
- **[GitHub](https://docs.apify.com/platform/integrations/github)** — trigger runs from commits or releases.

### Use with AI assistants (MCP)

The output is clean, LLM-ready JSON. You can call this Actor from Claude, ChatGPT, or any agent framework through the **[Apify MCP server](https://docs.apify.com/platform/integrations/mcp)**: ask your assistant to "find the top AI newsletters on Substack and list their contact emails" and let it run this scraper for you.

### More scrapers from scrapesage

Build a complete **creator & event lead-gen stack**:

- **[Eventbrite Scraper](https://apify.com/scrapesage/eventbrite-scraper)** — events + organizer leads (prices, emails, socials).
- **[Sched Conference Scraper](https://apify.com/scrapesage/sched-conference-scraper)** — sessions, speakers & sponsors from Sched event sites.
- **[Whova Event Scraper](https://apify.com/scrapesage/whova-event-scraper)** — attendees, agendas, and sponsors from Whova event apps.
- **[Swapcard Exhibitor Scraper](https://apify.com/scrapesage/swapcard-exhibitor-scraper)** — exhibitor lists and booth data from Swapcard trade shows.
- **[Facebook Ad Library Scraper](https://apify.com/scrapesage/facebook-ad-library-scraper)** — competitor ad intelligence (Meta + Instagram).
- **[Google Ads Transparency Scraper](https://apify.com/scrapesage/google-ads-transparency-scraper)** — who's advertising what on Google.
- **[LinkedIn Jobs Scraper](https://apify.com/scrapesage/linkedin-jobs-scraper)** — job postings as hiring-intent signals.
- **[Bark Listing Scraper](https://apify.com/scrapesage/bark-listing-scraper)** — service-provider leads from Bark.
- **[Airbnb Scraper](https://apify.com/scrapesage/airbnb-scraper)** — listings, prices, and availability.

### Tips

- **Exhaust a niche**: combine `searchQueries` (keywords) with `categories` (leaderboards) to cover both long-tail and top newsletters; raise `maxPublications`.
- **Best leads**: set `onlyPaidPublications: true` + `minFreeSubscribers` + `enrichContactEmails: true` to get monetizing creators with real contact data and a high `leadScore`.
- **Cost control**: posts, recommendations, author profiles and email enrichment are all opt-in, so you only pay for what you turn on; email enrichment only runs for publications that actually have a website.
- **Monitoring**: combine `monitorMode` with [Schedules](https://docs.apify.com/platform/schedules) to track only new newsletters/posts.

### FAQ

**How do I scrape the top newsletters in a topic?** Put the category name in `categories` (e.g. `Technology`, `Finance`) to pull its leaderboard, and/or add keywords to `searchQueries`.

**Where do the emails come from?** Never from Substack (they don't publish creator emails). With `enrichContactEmails` on, the Actor visits the newsletter's own public website and extracts publicly listed contact emails, the same thing a human visitor would see. Many newsletters also expose a `supportEmail` directly.

**Does it expose exact paid-subscriber counts?** Substack hides exact paid counts, but publishes a tier band (e.g. "Hundreds/Thousands of paid subscribers") which this Actor returns as `paidSubscriberTier`, plus the exact `freeSubscriberCount` for most newsletters.

**Can I export to Google Sheets, CSV, or Excel?** Yes — one click in the dataset view, or automatically on every run via the [Google Drive integration](https://docs.apify.com/platform/integrations/drive).

**Is scraping Substack legal?** This Actor collects publicly available data only. You are responsible for using the data in compliance with applicable laws (GDPR/CCPA for personal data) and Substack's terms.

**A field is null — why?** Some newsletters genuinely don't publish a price (free-only), a website, or a podcast. Fields are `null` only when the data doesn't exist, not because the scraper skipped them.
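
Since contact fields can be `null`, downstream code should read them defensively. A minimal sketch, assuming the `supportEmail` and `contactEmails` fields shown in the output example; `contact_emails` is a hypothetical helper.

```python
def contact_emails(publication):
    """Collect de-duplicated emails from `supportEmail` and `contactEmails`, skipping nulls."""
    candidates = [publication.get("supportEmail")] + list(publication.get("contactEmails") or [])
    unique = []
    seen_lower = set()
    for email in candidates:
        # Skip null/empty values and case-insensitive duplicates.
        if email and email.lower() not in seen_lower:
            seen_lower.add(email.lower())
            unique.append(email)
    return unique


sample = {
    "supportEmail": "astralcodexten@substack.com",
    "contactEmails": ["scott@slatestarcodex.com"],
}
print(contact_emails(sample))
print(contact_emails({"supportEmail": None, "contactEmails": None}))
```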

### Need help?

Open an issue on the Actor's **Issues** tab, or visit the [Apify help center](https://help.apify.com/). Feature requests are welcome; this Actor is actively maintained.

# Actor input Schema

## `searchQueries` (type: `array`):

Keywords to search Substack publications, e.g. <code>artificial intelligence</code>, <code>marketing</code>, <code>crypto</code>. Each query returns matching newsletters with full profiles. Combine with categories and start URLs.

## `categories` (type: `array`):

Browse top-ranked publications in these categories (the Substack leaderboard). Use names like <code>Technology</code>, <code>Business</code>, <code>Finance</code>, <code>Culture</code>, <code>U.S. Politics</code>, <code>Food & Drink</code>, <code>Sports</code>, or a numeric category id.

## `startUrls` (type: `array`):

Direct Substack URLs: a publication (<code>https://newsletter.substack.com</code> or a custom domain), a post (<code>.../p/the-slug</code>), or an author profile (<code>https://substack.com/@handle</code>). Mixed lists are fine.

## `maxPublications` (type: `integer`):

Cap the number of unique publications collected across all search queries and categories. Start URLs are always processed.

## `includePosts` (type: `boolean`):

For each publication, also emit its recent posts (title, date, audience, reactions, restacks, comments, word count, podcast info).

## `maxPostsPerPublication` (type: `integer`):

How many recent posts to return per publication when 'Include posts' is on.

## `includePostContent` (type: `boolean`):

Add the full post body (HTML and plain text) to each post record. Increases dataset size; leave off for metadata-only.

## `includeRecommendations` (type: `boolean`):

Add the publications each newsletter recommends (the Substack growth/recommendation graph) as a 'recommends' array on the publication record.

## `includeAuthorProfiles` (type: `boolean`):

Emit one author record per unique creator (bio, follower count, subscriber band, bestseller tier, external links, all their publications). Also used for author start URLs.

## `enrichContactEmails` (type: `boolean`):

Crawl the publication's own website (custom domain home + about/contact, max 3 pages) for contact emails, phone and extra social links. Substack does not expose emails — this is the only way to get them. Adds a lead score.

## `onlyPaidPublications` (type: `boolean`):

Keep only publications that offer a paid subscription (they have monetization budget and sponsorship intent).

## `minFreeSubscribers` (type: `integer`):

Keep only publications with at least this many free subscribers (Substack publishes a free-subscriber count for most newsletters). 0 = no filter.

## `monitorMode` (type: `boolean`):

Remember publications and posts already returned and emit ONLY ones not seen in previous runs — ideal for scheduled runs that track new newsletters or new posts. Works alongside Apify Schedules.

## `monitorStoreName` (type: `string`):

Named key-value store that holds the 'already seen' ids for monitoring mode. Use a different name per tracked target to keep histories separate.

## `maxConcurrency` (type: `integer`):

Maximum parallel requests. Substack's public endpoints are tolerant; 5 is a safe, fast default.

## `proxyConfiguration` (type: `object`):

Proxy settings. Substack serves clean JSON from datacenter IPs, so the default Apify proxy is plenty.

## Actor input object example

```json
{
  "searchQueries": [
    "artificial intelligence"
  ],
  "maxPublications": 100,
  "includePosts": false,
  "maxPostsPerPublication": 20,
  "includePostContent": false,
  "includeRecommendations": false,
  "includeAuthorProfiles": false,
  "enrichContactEmails": false,
  "onlyPaidPublications": false,
  "minFreeSubscribers": 0,
  "monitorMode": false,
  "monitorStoreName": "substack-monitor",
  "maxConcurrency": 5,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```

# Actor output Schema

## `results` (type: `string`):

All scraped records in the default dataset. Publication rows carry the full profile, subscriber counts, paid pricing, lead score and contact fields; post rows carry engagement and optional content; author rows carry creator profiles and external links.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "searchQueries": [
        "artificial intelligence"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("scrapesage/substack-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "searchQueries": ["artificial intelligence"] }

# Run the Actor and wait for it to finish
run = client.actor("scrapesage/substack-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "searchQueries": [
    "artificial intelligence"
  ]
}' |
apify call scrapesage/substack-scraper --silent --output-dataset

```
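
## REST example

For projects without an Apify client, the same run can be started over plain HTTP. The sketch below only builds the request, using the `run-sync-get-dataset-items` path and the `token` query parameter from the OpenAPI specification in this document; `build_request` is a hypothetical helper, and sending the request is left as a comment.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/scrapesage~substack-scraper/run-sync-get-dataset-items"


def build_request(token, actor_input):
    """Build (but do not send) the POST request that runs the Actor synchronously."""
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request("<YOUR_API_TOKEN>", {"searchQueries": ["artificial intelligence"]})

# To send it (assumes the response body is the JSON list of dataset items):
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

Synchronous runs hold the HTTP connection open until the Actor finishes, so for long runs the asynchronous `/runs` endpoint may be the better fit.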

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=scrapesage/substack-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Substack Scraper — Newsletters, Posts & Creator Leads",
        "description": "Scrape Substack: search newsletters by keyword, browse category leaderboards, pull full publication profiles (subscribers, paid pricing, podcast), posts, authors and the recommendation network. Turn creators into leads with contact emails. Monitoring mode. No API key, no browser.",
        "version": "0.1",
        "x-build-id": "QIdwooYHoFn5xL7Nb"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/scrapesage~substack-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-scrapesage-substack-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/scrapesage~substack-scraper/runs": {
            "post": {
                "operationId": "runs-sync-scrapesage-substack-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/scrapesage~substack-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-scrapesage-substack-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "searchQueries": {
                        "title": "Search queries (keywords)",
                        "type": "array",
                        "description": "Keywords to search Substack publications, e.g. <code>artificial intelligence</code>, <code>marketing</code>, <code>crypto</code>. Each query returns matching newsletters with full profiles. Combine with categories and start URLs.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "categories": {
                        "title": "Category leaderboards",
                        "type": "array",
                        "description": "Browse top-ranked publications in these categories (the Substack leaderboard). Use names like <code>Technology</code>, <code>Business</code>, <code>Finance</code>, <code>Culture</code>, <code>U.S. Politics</code>, <code>Food &amp; Drink</code>, <code>Sports</code>, or a numeric category id.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "startUrls": {
                        "title": "Start URLs (publications, posts or authors)",
                        "type": "array",
                        "description": "Direct Substack URLs: a publication (<code>https://newsletter.substack.com</code> or a custom domain), a post (<code>.../p/the-slug</code>), or an author profile (<code>https://substack.com/@handle</code>). Mixed lists are fine.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxPublications": {
                        "title": "Max publications",
                        "minimum": 1,
                        "maximum": 5000,
                        "type": "integer",
                        "description": "Cap the number of unique publications collected across all search queries and categories. Start URLs are always processed.",
                        "default": 100
                    },
                    "includePosts": {
                        "title": "Include posts",
                        "type": "boolean",
                        "description": "For each publication, also emit its recent posts (title, date, audience, reactions, restacks, comments, word count, podcast info).",
                        "default": false
                    },
                    "maxPostsPerPublication": {
                        "title": "Max posts per publication",
                        "minimum": 1,
                        "maximum": 500,
                        "type": "integer",
                        "description": "How many recent posts to return per publication when 'Include posts' is on.",
                        "default": 20
                    },
                    "includePostContent": {
                        "title": "Include full post content",
                        "type": "boolean",
                        "description": "Add the full post body (HTML and plain text) to each post record. Increases dataset size; leave off for metadata-only.",
                        "default": false
                    },
                    "includeRecommendations": {
                        "title": "Include recommendation network",
                        "type": "boolean",
                        "description": "Add the publications each newsletter recommends (the Substack growth/recommendation graph) as a 'recommends' array on the publication record.",
                        "default": false
                    },
                    "includeAuthorProfiles": {
                        "title": "Include author profiles",
                        "type": "boolean",
                        "description": "Emit one author record per unique creator (bio, follower count, subscriber band, bestseller tier, external links, all their publications). Also used for author start URLs.",
                        "default": false
                    },
                    "enrichContactEmails": {
                        "title": "Enrich creator leads (contact emails)",
                        "type": "boolean",
                        "description": "Crawl the publication's own website (custom domain home + about/contact, max 3 pages) for contact emails, phone and extra social links. Substack does not expose emails — this is the only way to get them. Adds a lead score.",
                        "default": false
                    },
                    "onlyPaidPublications": {
                        "title": "Only paid publications",
                        "type": "boolean",
                        "description": "Keep only publications that offer a paid subscription (they have monetization budget and sponsorship intent).",
                        "default": false
                    },
                    "minFreeSubscribers": {
                        "title": "Minimum free subscribers",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Keep only publications with at least this many free subscribers (Substack publishes a free-subscriber count for most newsletters). 0 = no filter.",
                        "default": 0
                    },
                    "monitorMode": {
                        "title": "Monitoring mode — only new records",
                        "type": "boolean",
                        "description": "Remember publications and posts already returned and emit ONLY ones not seen in previous runs — ideal for scheduled runs that track new newsletters or new posts. Works alongside Apify Schedules.",
                        "default": false
                    },
                    "monitorStoreName": {
                        "title": "Monitor store name",
                        "type": "string",
                        "description": "Named key-value store that holds the 'already seen' ids for monitoring mode. Use a different name per tracked target to keep histories separate.",
                        "default": "substack-monitor"
                    },
                    "maxConcurrency": {
                        "title": "Max concurrency",
                        "minimum": 1,
                        "maximum": 20,
                        "type": "integer",
                        "description": "Maximum parallel requests. Substack's public endpoints are tolerant; 5 is a safe, fast default.",
                        "default": 5
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Proxy settings. Substack serves clean JSON from datacenter IPs, so the default Apify proxy is plenty.",
                        "default": {
                            "useApifyProxy": true
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "number",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "number",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
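
The optional flags at the end of the input schema above can be assembled and checked locally before a run is started. The following is a minimal sketch, not part of the Actor itself: `build_run_input` and `run_actor` are hypothetical helper names, the defaults and limits are copied from the schema, and fields defined earlier in the schema (search terms, start URLs and so on) are passed through untouched via the `extra` argument. `run_actor` assumes the official `apify-client` package is installed and that you supply your own API token.

```python
import copy
from typing import Any, Dict, List, Optional

# Defaults copied from the input schema above (only the fields shown there).
DEFAULTS: Dict[str, Any] = {
    "includeRecommendations": False,
    "includeAuthorProfiles": False,
    "enrichContactEmails": False,
    "onlyPaidPublications": False,
    "minFreeSubscribers": 0,
    "monitorMode": False,
    "monitorStoreName": "substack-monitor",
    "maxConcurrency": 5,
    "proxyConfiguration": {"useApifyProxy": True},
}
BOOLEAN_FIELDS = {k for k, v in DEFAULTS.items() if isinstance(v, bool)}


def _check_int(name: str, value: Any, minimum: int, maximum: Optional[int] = None) -> None:
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"{name} must be an integer")
    if value < minimum or (maximum is not None and value > maximum):
        raise ValueError(f"{name}={value} is outside the range allowed by the schema")


def build_run_input(extra: Optional[Dict[str, Any]] = None, **overrides: Any) -> Dict[str, Any]:
    """Merge overrides onto the schema defaults and validate them (hypothetical helper)."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")

    options = copy.deepcopy(DEFAULTS)
    options.update(overrides)

    for name in BOOLEAN_FIELDS:
        if not isinstance(options[name], bool):
            raise ValueError(f"{name} must be a boolean")
    _check_int("minFreeSubscribers", options["minFreeSubscribers"], 0)
    _check_int("maxConcurrency", options["maxConcurrency"], 1, 20)
    store = options["monitorStoreName"]
    if not isinstance(store, str) or not store:
        raise ValueError("monitorStoreName must be a non-empty string")

    # Fields defined earlier in the schema are passed through unvalidated.
    run_input = dict(extra or {})
    run_input.update(options)
    return run_input


def run_actor(token: str, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Start the Actor and return its dataset items (requires `pip install apify-client`)."""
    from apify_client import ApifyClient  # imported lazily so the builder works without it

    client = ApifyClient(token)
    run = client.actor("scrapesage/substack-scraper").call(run_input=run_input)
    if run is None:
        raise RuntimeError("the Actor run did not return a run object")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

For a scheduled monitoring run, one might call `build_run_input(monitorMode=True, monitorStoreName="ai-newsletters", minFreeSubscribers=1000)` and hand the result to `run_actor`. Using a separate store name per tracked target keeps the "already seen" histories apart, as the `monitorStoreName` description recommends.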

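The `runsResponseSchema` above describes the object returned when a run is started over the HTTP API. As an illustration only, the sketch below pulls out the fields most integrations need: the run id, status, default dataset id, wall-clock duration and total cost. `summarize_run` is a hypothetical helper, and it expects the raw response envelope with the top-level `data` key shown in the schema. The Python client returns the inner object directly, so wrap it as `{"data": run}` in that case.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def _parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    # The schema's date-time examples end in "Z"; older Python versions
    # only accept an explicit UTC offset in fromisoformat().
    if not value:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_run(response: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a run response to a few commonly used fields (hypothetical helper)."""
    data = response.get("data", {})
    started = _parse_timestamp(data.get("startedAt"))
    finished = _parse_timestamp(data.get("finishedAt"))
    # finishedAt may be absent while the run is still in progress.
    duration = (finished - started).total_seconds() if started and finished else None
    return {
        "runId": data.get("id"),
        "status": data.get("status"),
        "datasetId": data.get("defaultDatasetId"),
        "durationSecs": duration,
        "usageTotalUsd": data.get("usageTotalUsd"),
    }
```

The `datasetId` value is what you would pass to the dataset endpoints or client to fetch the scraped publication, post and author records once the run has finished.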