# Substack Sponsor Intelligence (`dramatic_jonquil/substack-sponsor-intelligence`) Actor

Analyze Substack newsletters to extract sponsor data, detect monetization patterns, and generate structured intelligence using pattern matching and AI.

- **URL**: https://apify.com/dramatic_jonquil/substack-sponsor-intelligence.md
- **Developed by:** [Alex Mercer](https://apify.com/dramatic_jonquil) (community)
- **Categories:** Automation, AI, Developer tools
- **Stats:** 1 total users, 0 monthly users, 0.0% runs succeeded, NaN bookmarks
- **User rating**: No ratings yet

## Pricing

From $70.00 / 1,000 newsletters analyzed

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, built for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Substack Sponsor Intelligence

Extract sponsor and advertising data from Substack newsletters. Find out who is sponsoring which newsletters, how often, and in what format. The only tool that turns newsletter archives into structured sponsor intelligence.

### What This Actor Does

This Actor analyzes Substack newsletter archives and identifies sponsored content, paid partnerships, and advertising placements. It outputs structured data about each sponsor detection including the brand name, URL, placement type, and confidence level.

Think of it as competitive intelligence for the newsletter advertising market. Brands spend thousands per newsletter placement, but until now there's been no way to systematically track who's advertising where.

### Who This Is For

- **Media buyers and agencies** placing newsletter ads who need to see the competitive landscape
- **Newsletter operators** benchmarking their sponsorship rates against comparable publications
- **Brands** monitoring where competitors are advertising
- **Market researchers** tracking the newsletter advertising economy
- **Content strategists** analyzing sponsorship trends across niches

### How It Works

1. You provide one or more Substack newsletter URLs
2. The Actor fetches the newsletter's post archive via Substack's public API
3. Each post is analyzed for sponsor mentions using pattern matching, AI classification, or both
4. Structured sponsor data is output to the Apify dataset, ready for export

The Actor uses a multi-layer content resolution system. It tries the archive API content first, then falls back to fetching the actual post page when needed, so you get maximum detection coverage.
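The archive-first resolution order described above can be sketched roughly as follows. This is an illustration only, not the Actor's actual code: the function `resolve_content`, the injected `fetch_canonical` callable, the `min_chars` threshold, and the `"archive_api"` and `"none"` labels are all assumptions (only `canonical_fetch` appears in the output example).

```python
from typing import Callable, Optional, Tuple


def resolve_content(
    archive_body: Optional[str],
    post_url: str,
    fetch_canonical: Callable[[str], Optional[str]],
    enable_canonical_fallback: bool = True,
    min_chars: int = 200,  # hypothetical cutoff for "insufficient" archive content
) -> Tuple[Optional[str], str]:
    """Return (content, content_source), preferring the archive body."""
    body = (archive_body or "").strip()
    if len(body) >= min_chars:
        return body, "archive_api"  # assumed label, not confirmed by the docs

    if enable_canonical_fallback:
        # Fall back to the post page itself when the archive body is too short.
        page = fetch_canonical(post_url)
        if page and page.strip():
            return page.strip(), "canonical_fetch"

    # Keep whatever short text the archive returned, if any.
    return (body or None), ("archive_api" if body else "none")
```

Passing the page fetcher in as a callable keeps the sketch independent of any particular HTTP library and makes the fallback easy to exercise with a stub.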

### Detection Modes

#### Pattern Match (default, no extra cost)

Uses regex-based detection to find explicit sponsor language like "brought to you by," "today's sponsor," "this issue is sponsored by," and similar phrases. Fast, cheap, and catches the majority of traditional newsletter sponsorships.

Patterns are classified into strong (high confidence, like "sponsored by") and weak (requires supporting evidence, like "in partnership with") to minimize false positives.
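To make the strong/weak split concrete, here is a minimal sketch of how such a detector could look. The regexes, the `detect_sponsors` function, the "supporting evidence" rule (a link or promo wording somewhere in the text), and the `"medium"` confidence label are assumptions for illustration; the Actor's real pattern list is not published here.

```python
import re
from typing import Dict, List

# Capitalized brand token, allowing inner dots such as "Notion.so".
NAME = r"(?P<name>[A-Z][\w&-]*(?:\.[\w&-]+)*)"

# Strong phrases: explicit sponsor language, accepted on their own.
STRONG = [re.compile(p + r"\s+" + NAME) for p in (
    r"(?i:sponsored by)",
    r"(?i:brought to you by)",
    r"(?i:today['’]s sponsor(?: is|:)?)",
)]
# Weak phrases: only accepted when supporting evidence is present.
WEAK = [re.compile(r"(?i:in partnership with)\s+" + NAME)]
# Hypothetical supporting evidence: a link or promotional wording.
EVIDENCE = re.compile(r"https?://|promo code|advertisement", re.IGNORECASE)


def detect_sponsors(text: str) -> List[Dict[str, str]]:
    hits: List[Dict[str, str]] = []
    seen = set()
    has_evidence = bool(EVIDENCE.search(text))
    for patterns, confidence, needs_evidence in (
        (STRONG, "high", False),
        (WEAK, "medium", True),
    ):
        if needs_evidence and not has_evidence:
            continue
        for pattern in patterns:
            for match in pattern.finditer(text):
                name = match.group("name")
                if name.lower() in seen:
                    continue  # one row per sponsor per post
                seen.add(name.lower())
                hits.append({
                    "sponsor_name": name,
                    "detection_confidence": confidence,
                    "sponsor_text_snippet": text[match.start():match.start() + 80],
                })
    return hits
```

For example, `detect_sponsors("Today's Not Boring is brought to you by Ramp...")` yields a single high-confidence row for `Ramp`, while a bare "in partnership with Acme." with no link or promo wording yields nothing.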

#### AI Enhanced (requires API key)

Runs each post through an LLM for deeper analysis. Catches native sponsorships, subtle paid placements, and sponsored deep-dives that don't use standard sponsor language. Pattern matching runs first as a baseline, then AI results are merged and deduplicated.

To use AI mode, set the `OPENAI_API_KEY` environment variable in your Actor settings. Supports any OpenAI-compatible API endpoint (OpenAI, OpenRouter, etc.).
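The "pattern baseline first, then merge and deduplicate AI results" step might look something like the sketch below. The dedup key (sponsor domain, falling back to the lowercased name) and the helper names `sponsor_key` and `merge_detections` are assumptions; the Actor may use a different rule.

```python
from typing import Dict, Iterable, List
from urllib.parse import urlparse


def sponsor_key(row: Dict[str, str]) -> str:
    """Dedup key: sponsor domain without 'www.', else the lowercased name."""
    host = urlparse(row.get("sponsor_url") or "").netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host or (row.get("sponsor_name") or "").strip().lower()


def merge_detections(
    pattern_rows: Iterable[Dict[str, str]],
    ai_rows: Iterable[Dict[str, str]],
) -> List[Dict[str, str]]:
    merged: Dict[str, Dict[str, str]] = {}
    # Pattern matches are the baseline and win on conflicts.
    for row in pattern_rows:
        merged.setdefault(sponsor_key(row), dict(row))
    for row in ai_rows:
        key = sponsor_key(row)
        if key in merged:
            # Same sponsor: only fill in fields the baseline row lacks.
            for field, value in row.items():
                if not merged[key].get(field):
                    merged[key][field] = value
        else:
            merged[key] = dict(row)
    return list(merged.values())
```

Keying on the domain means `https://ramp.com` and `https://www.ramp.com/` collapse into one row, which is the kind of duplicate two detection passes over the same post tend to produce.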

### Input Example

```json
{
  "newsletterUrls": [
    "https://www.lennysnewsletter.com",
    "https://www.notboring.co",
    "https://thehustle.co"
  ],
  "maxPostsPerNewsletter": 50,
  "detectionMode": "pattern_match",
  "enableCanonicalFallback": true
}
````

### Output Example

Each sponsor detection produces a row like this:

```json
{
  "record_type": "sponsor_detection",
  "newsletter_name": "Not Boring",
  "newsletter_url": "https://www.notboring.co",
  "newsletter_domain": "www.notboring.co",
  "post_title": "The Future of Finance",
  "post_url": "https://www.notboring.co/p/the-future-of-finance",
  "post_date": "2026-03-10T00:00:00.000Z",
  "post_audience": "free",
  "sponsor_name": "Ramp",
  "sponsor_url": "https://ramp.com",
  "sponsor_domain": "ramp.com",
  "placement_type": "top_banner",
  "detection_method": "pattern_match",
  "detection_confidence": "high",
  "content_source": "canonical_fetch",
  "sponsor_text_snippet": "Today's Not Boring is brought to you by Ramp..."
}
```

The Actor also generates a summary with aggregate stats, sponsor frequency tables, and newsletter sponsor density metrics. The summary is stored in the key-value store under the key `SUMMARY` and optionally included in the dataset.
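If you prefer to compute aggregates yourself from the detection rows, a sketch like the following works on the fields shown in the output example. It is not the format of the Actor's own `SUMMARY` record, which is not documented here; the `summarize` function and its output keys are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Any, Dict, Iterable


def summarize(rows: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Aggregate sponsor_detection rows into simple frequency stats."""
    detections = [r for r in rows if r.get("record_type") == "sponsor_detection"]

    # How often each sponsor appears across all analyzed posts.
    sponsor_counts = Counter(r["sponsor_name"] for r in detections)

    # Distinct sponsored posts per newsletter (a rough density measure).
    posts_by_newsletter = defaultdict(set)
    for r in detections:
        posts_by_newsletter[r["newsletter_name"]].add(r["post_url"])

    return {
        "total_detections": len(detections),
        "sponsor_frequency": sponsor_counts.most_common(),
        "sponsored_posts_per_newsletter": {
            name: len(posts) for name, posts in posts_by_newsletter.items()
        },
    }
```

Rows with any other `record_type` (for instance a summary row included in the dataset) are skipped, so the function can be fed the dataset items as-is.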

### Input Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| newsletterUrls | `string[]` | required | Substack newsletter URLs to analyze |
| maxPostsPerNewsletter | integer | 50 | Posts to analyze per newsletter (0 = all) |
| sinceDate | string | null | Only analyze posts after this date (YYYY-MM-DD) |
| detectionMode | string | `pattern_match` | `pattern_match` or `ai_enhanced` |
| enableCanonicalFallback | boolean | true | Fetch post pages when archive content is insufficient |
| maxAiCharactersPerPost | integer | 12000 | Max content length sent to AI per post |
| requestDelayMs | integer | 800 | Delay between requests in milliseconds |
| includeSummaryRecordInDataset | boolean | true | Include summary record in dataset output |
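When building the input programmatically, it can help to validate it before starting a run. The helper below is a sketch: `build_input` and its checks are hypothetical and run client-side only, while the parameter names and defaults are taken from the table above.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional

# Defaults as listed in the input parameters table.
DEFAULTS: Dict[str, Any] = {
    "maxPostsPerNewsletter": 50,
    "detectionMode": "pattern_match",
    "enableCanonicalFallback": True,
    "maxAiCharactersPerPost": 12000,
    "requestDelayMs": 800,
    "includeSummaryRecordInDataset": True,
}


def build_input(
    newsletter_urls: List[str],
    since_date: Optional[str] = None,
    **overrides: Any,
) -> Dict[str, Any]:
    if not newsletter_urls:
        raise ValueError("newsletterUrls is required and must not be empty")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    run_input = {"newsletterUrls": list(newsletter_urls), **DEFAULTS, **overrides}
    if run_input["detectionMode"] not in ("pattern_match", "ai_enhanced"):
        raise ValueError("detectionMode must be pattern_match or ai_enhanced")
    if since_date is not None:
        # Raises ValueError if the date is not in YYYY-MM-DD form.
        datetime.strptime(since_date, "%Y-%m-%d")
        run_input["sinceDate"] = since_date
    return run_input
```

The returned dictionary can be passed directly as the run input in the Python client example in the API section.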

### Important Notes

- This Actor extracts data from Substack's public, unofficial API endpoints. These endpoints are not officially documented and may change without notice. The Actor includes defensive handling for API changes, but some newsletters may occasionally return incomplete data.
- Only publicly accessible content is analyzed. Paywalled posts return metadata but not full content.
- Custom domain newsletters (e.g., `www.lennysnewsletter.com` instead of `lenny.substack.com`) are fully supported.
- AI enhanced mode requires an `OPENAI_API_KEY` environment variable set in your Actor configuration.
- The Actor processes newsletters sequentially to respect rate limits.

### Use Cases

**Sponsor Intelligence:** "Which brands are sponsoring the top 50 fintech newsletters right now?" Run the Actor against a list of newsletter URLs and get a structured dataset of every sponsor mention.

**Competitive Analysis:** "Are my competitors advertising in newsletters I should know about?" Track specific brand names across newsletter archives.

**Media Buying Research:** "What's the going rate for sponsoring a newsletter with 50K subscribers?" Cross-reference sponsor frequency with subscriber estimates to understand market pricing patterns.

**Brand Monitoring:** "How often does Brand X sponsor newsletters, and which ones?" Set up scheduled runs to track sponsorship activity over time.
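For the competitive analysis and brand monitoring cases, the dataset rows can be filtered after the run. The following sketch assumes rows shaped like the output example; `newsletters_by_brand` is a hypothetical helper, and matching is by exact sponsor name, ignoring case.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def newsletters_by_brand(
    rows: Iterable[Dict[str, Any]],
    brands: Iterable[str],
) -> Dict[str, List[str]]:
    """Map each tracked brand to the newsletters where it was detected."""
    wanted = {b.lower(): b for b in brands}
    found = defaultdict(set)
    for row in rows:
        name = (row.get("sponsor_name") or "").lower()
        if name in wanted:
            # Report under the caller's spelling of the brand name.
            found[wanted[name]].add(row["newsletter_name"])
    return {brand: sorted(names) for brand, names in found.items()}
```

Running this over the results of scheduled runs gives a simple view of where a brand shows up over time.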

### Pricing

This Actor supports pay-per-event pricing on the Apify platform. See the Pricing tab for current rates.

### Feedback

If you find a newsletter that should be working but isn't, or if you have feature requests, please open an issue on the Actor's Issues tab.

# Actor input Schema

## `newsletterUrls` (type: `array`):

List of Substack newsletter URLs to analyze

## `maxPostsPerNewsletter` (type: `integer`):

How many recent posts to analyze per newsletter (0 = all)

## `sinceDate` (type: `string`):

Only analyze posts published after this date (YYYY-MM-DD)

## `detectionMode` (type: `string`):

`pattern_match` is cheaper; `ai_enhanced` uses an LLM for better detection

## `enableCanonicalFallback` (type: `boolean`):

When archive API returns no post content, fetch the post's canonical URL to extract content. Slightly slower but dramatically improves detection coverage.

## `maxAiCharactersPerPost` (type: `integer`):

Maximum characters of cleaned post content to send to AI in `ai_enhanced` mode

## `requestDelayMs` (type: `integer`):

Delay between archive requests per newsletter, in milliseconds

## `chargePerNewsletter` (type: `boolean`):

If true, charge an Apify pay-per-event (PPE) event for each newsletter analyzed

## `chargePerAiPost` (type: `boolean`):

If true, charge an Apify pay-per-event (PPE) event for each post analyzed in `ai_enhanced` mode

## `includeSummaryRecordInDataset` (type: `boolean`):

If true, push a final summary record into the dataset

## `includePostsWithNoSponsors` (type: `boolean`):

If true, include a row for every analyzed post, even those with no sponsors

## Actor input object example

```json
{
  "newsletterUrls": [
    "https://www.lennysnewsletter.com"
  ],
  "maxPostsPerNewsletter": 50,
  "detectionMode": "pattern_match",
  "enableCanonicalFallback": true,
  "maxAiCharactersPerPost": 12000,
  "requestDelayMs": 800,
  "chargePerNewsletter": false,
  "chargePerAiPost": false,
  "includeSummaryRecordInDataset": true,
  "includePostsWithNoSponsors": false
}
```

# Actor output Schema

## `results` (type: `string`):

Sponsor detection rows and optional summary row stored in the default dataset.

## `summary` (type: `string`):

Run summary stored in the default key-value store under SUMMARY.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

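// Prepare Actor input (newsletterUrls is required)
const input = {
    newsletterUrls: ['https://www.lennysnewsletter.com'],
    maxPostsPerNewsletter: 50,
    detectionMode: 'pattern_match',
};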

// Run the Actor and wait for it to finish
const run = await client.actor("dramatic_jonquil/substack-sponsor-intelligence").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

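# Prepare the Actor input (newsletterUrls is required)
run_input = {
    "newsletterUrls": ["https://www.lennysnewsletter.com"],
    "maxPostsPerNewsletter": 50,
    "detectionMode": "pattern_match",
}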

# Run the Actor and wait for it to finish
run = client.actor("dramatic_jonquil/substack-sponsor-intelligence").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{"newsletterUrls": ["https://www.lennysnewsletter.com"]}' |
apify call dramatic_jonquil/substack-sponsor-intelligence --silent --output-dataset

```
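
## REST API example

For projects that cannot use a client library, the Actor can be called over plain HTTP. The sketch below uses only the Python standard library and the `run-sync-get-dataset-items` endpoint listed in the OpenAPI specification of this document; the helper names `build_run_request` and `run_actor` and the `timeout` value are assumptions.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, List

ACTOR_ID = "dramatic_jonquil~substack-sponsor-intelligence"


def build_run_request(token: str, run_input: Dict[str, Any]) -> urllib.request.Request:
    """Build the POST request; the token goes in the query string per the spec."""
    query = urllib.parse.urlencode({"token": token})
    url = f"https://api.apify.com/v2/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token: str, run_input: Dict[str, Any], timeout: float = 300) -> List[Dict[str, Any]]:
    """Run the Actor synchronously and return its dataset items."""
    request = build_run_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `run_actor("<YOUR_API_TOKEN>", {"newsletterUrls": ["https://www.lennysnewsletter.com"]})` blocks until the run finishes, so for large archives the asynchronous `/runs` endpoint may be a better fit.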

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=dramatic_jonquil/substack-sponsor-intelligence",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Substack Sponsor Intelligence",
        "description": "Analyze Substack newsletters to extract sponsor data, detect monetization patterns, and generate structured intelligence using pattern matching and AI.",
        "version": "1.0",
        "x-build-id": "3NSLyi2EiuUENVfYm"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/dramatic_jonquil~substack-sponsor-intelligence/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-dramatic_jonquil-substack-sponsor-intelligence",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/dramatic_jonquil~substack-sponsor-intelligence/runs": {
            "post": {
                "operationId": "runs-sync-dramatic_jonquil-substack-sponsor-intelligence",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/dramatic_jonquil~substack-sponsor-intelligence/run-sync": {
            "post": {
                "operationId": "run-sync-dramatic_jonquil-substack-sponsor-intelligence",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "newsletterUrls"
                ],
                "properties": {
                    "newsletterUrls": {
                        "title": "Newsletter URLs",
                        "type": "array",
                        "description": "List of Substack newsletter URLs to analyze",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxPostsPerNewsletter": {
                        "title": "Max Posts Per Newsletter",
                        "type": "integer",
                        "description": "How many recent posts to analyze per newsletter (0 = all)",
                        "default": 50
                    },
                    "sinceDate": {
                        "title": "Analyze Posts Since",
                        "type": "string",
                        "description": "Only analyze posts published after this date (YYYY-MM-DD)"
                    },
                    "detectionMode": {
                        "title": "Detection Mode",
                        "enum": [
                            "pattern_match",
                            "ai_enhanced"
                        ],
                        "type": "string",
                        "description": "pattern_match is cheaper; ai_enhanced uses LLM for better detection",
                        "default": "pattern_match"
                    },
                    "enableCanonicalFallback": {
                        "title": "Enable Canonical URL Fallback",
                        "type": "boolean",
                        "description": "When archive API returns no post content, fetch the post's canonical URL to extract content. Slightly slower but dramatically improves detection coverage.",
                        "default": true
                    },
                    "maxAiCharactersPerPost": {
                        "title": "Max AI Characters Per Post",
                        "type": "integer",
                        "description": "Maximum characters of cleaned post content to send to AI in ai_enhanced mode",
                        "default": 12000
                    },
                    "requestDelayMs": {
                        "title": "Request Delay (ms)",
                        "type": "integer",
                        "description": "Delay between archive requests per newsletter",
                        "default": 800
                    },
                    "chargePerNewsletter": {
                        "title": "Charge Per Newsletter",
                        "type": "boolean",
                        "description": "If true, charge Apify PPE event for each newsletter analyzed",
                        "default": false
                    },
                    "chargePerAiPost": {
                        "title": "Charge Per AI Post",
                        "type": "boolean",
                        "description": "If true, charge Apify PPE event for each post analyzed in ai_enhanced mode",
                        "default": false
                    },
                    "includeSummaryRecordInDataset": {
                        "title": "Include Summary Record In Dataset",
                        "type": "boolean",
                        "description": "If true, push a final summary record into the dataset",
                        "default": true
                    },
                    "includePostsWithNoSponsors": {
                        "title": "Include Posts With No Sponsors",
                        "type": "boolean",
                        "description": "If true, include a row for every analyzed post, even those with no sponsors",
                        "default": false
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
