Substack Leaderboard Scraper

Pricing

Pay per event

📊 Scrape public Substack leaderboards for ranked newsletters, author details, subscriber labels, and publication URLs.

Developer

Stas Persiianenko

Maintained by Community

Last modified: 3 days ago

Find ranked Substack publications from public category leaderboards. Export bestseller and rising newsletters with publication URLs, subscriber labels, author details, descriptions, and ranking context.

Use this actor when you need a clean dataset for newsletter sponsorship research, creator discovery, competitive intelligence, media lists, or partnership prospecting.

What does Substack Leaderboard Scraper do?

Substack Leaderboard Scraper collects public rows from Substack category leaderboards such as Technology, Business, Culture, Finance, Food & Drink, News, and more.

It uses public Substack leaderboard data and saves one dataset row per ranked publication.

Typical results include:

  • 🏆 leaderboard rank
  • 🗂️ category name and slug
  • 📈 ranking tab: Top Bestsellers or Rising
  • 📰 publication name and URL
  • 👤 author name and profile URL
  • 👥 subscriber labels such as "Thousands of paid subscribers"
  • 🔗 Substack hostname and subdomain
  • 🧭 source leaderboard URL

Who is it for?

Sponsorship and growth teams

Use the dataset to discover newsletters that already have audience traction in a niche.

Creator partnership teams

Find creators by category and collect publication metadata before outreach.

Newsletter operators

Monitor adjacent categories to understand who is rising and how top publications position themselves.

Market researchers

Build a structured view of the Substack creator market by category.

Agencies and media buyers

Export publication URLs, authors, subscriber labels, and descriptions for campaign planning.

Why use this actor?

Substack leaderboards are useful, but they are built for browsing, not analysis. This actor turns those public pages into structured rows that can be filtered, joined, deduplicated, and exported.
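
As a small illustration of the deduplication step, the sketch below keeps the best-ranked row per publication. It assumes the rows are plain dicts carrying the `publicationUrl` and `rank` fields shown in the example output; `dedupe_by_publication` is a hypothetical helper for post-processing, not something the actor provides.

```python
def dedupe_by_publication(rows):
    """Keep one row per publicationUrl, preferring the lowest (best) rank."""
    best = {}
    for row in rows:
        url = row.get("publicationUrl")
        if url is None:
            continue  # rows without a URL cannot be matched reliably
        kept = best.get(url)
        if kept is None or row["rank"] < kept["rank"]:
            best[url] = row
    return list(best.values())


# Made-up sample rows: the same publication appears on two leaderboards.
rows = [
    {"publicationUrl": "https://a.example.com", "rank": 3, "rankingType": "paid"},
    {"publicationUrl": "https://a.example.com", "rank": 1, "rankingType": "rising"},
    {"publicationUrl": "https://b.example.com", "rank": 2, "rankingType": "paid"},
]
unique = dedupe_by_publication(rows)
```

Keying on the URL rather than the publication name avoids merging two different newsletters that happen to share a title.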

Benefits:

  • ⚡ HTTP-only scraping for fast low-cost runs
  • 🎯 category slug input instead of internal category IDs
  • 📊 bestseller and rising ranking tabs
  • 🧾 dataset rows ready for CSV, JSON, Excel, Airtable, or CRM imports
  • 🔁 repeatable monitoring of the same categories over time

What data can you extract?

Field                 Description
categoryName          Human-readable leaderboard category
rankingLabel          Top Bestsellers or Rising
rank                  Rank within that category/ranking page
publicationName       Substack publication name
publicationUrl        Public publication URL
description           Public publication description or hero text
authorName            Public author name when available
authorUrl             Public Substack profile URL
paidSubscriberLabel   Paid subscriber range label from Substack
subscriberLabel       Broader subscriber label when available
freeSubscriberCount   Free subscriber count text when exposed
hasPodcast            Whether the publication has podcast support
twitterScreenName     Twitter/X screen name when exposed
sourceUrl             Leaderboard URL that produced the row

How much does it cost to scrape Substack leaderboard rows?

Pricing is pay per event:

  • Start event: $0.005 per run
  • Leaderboard row event: starts at about $0.00018 per saved row on the BRONZE tier, with lower per-row prices on higher Apify tiers

At that rate, 1,000 saved leaderboard rows cost about $0.18 on the BRONZE tier, plus the $0.005 start fee (about $0.185 per run), before any Apify platform charges or plan-specific adjustments.
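
The arithmetic can be sketched as a small estimator. The default prices are the BRONZE-tier figures quoted in this section; `estimate_cost` is a hypothetical helper, and actual billing on other tiers or plans may differ.

```python
def estimate_cost(rows, per_row_usd=0.00018, start_fee_usd=0.005):
    """Estimate the pay-per-event cost of one run in USD.

    Defaults are the BRONZE-tier prices quoted above (an approximation);
    Apify platform charges are not included.
    """
    if rows < 0:
        raise ValueError("rows must be non-negative")
    return start_fee_usd + rows * per_row_usd


cost_1000 = estimate_cost(1000)  # start fee plus 1,000 row events
```

For 1,000 rows this gives roughly $0.185, matching the figure above.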

Quick start

  1. Open the actor on Apify.
  2. Enter one or more category slugs, for example technology and business.
  3. Choose paid, rising, or both ranking tabs.
  4. Set a small maxItems for your first run.
  5. Start the actor.
  6. Export the dataset as CSV, JSON, or Excel.

Input options

categorySlugs

List of Substack leaderboard category slugs.

Examples:

  • technology
  • business
  • culture
  • finance
  • news
  • food

startUrls

Optional direct leaderboard URLs.

Examples:

  • https://substack.com/leaderboard/technology
  • https://substack.com/leaderboard/technology/rising
  • https://substack.com/leaderboard/business/paid

rankings

Choose one or both:

  • paid for Top Bestsellers
  • rising for Rising publications
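
To show how category slugs and ranking tabs relate to the direct URLs listed under startUrls, here is a minimal sketch. The URL pattern is inferred from the examples in this section and is an assumption, not a documented contract; `build_leaderboard_urls` is a hypothetical helper.

```python
def build_leaderboard_urls(category_slugs, rankings=("paid", "rising")):
    """Build one leaderboard URL per (slug, ranking) pair.

    The pattern https://substack.com/leaderboard/<slug>/<ranking> is inferred
    from the example URLs above and may change on Substack's side.
    """
    return [
        f"https://substack.com/leaderboard/{slug}/{ranking}"
        for slug in category_slugs
        for ranking in rankings
    ]


urls = build_leaderboard_urls(["technology", "business"])
```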

maxItems

Maximum rows saved across all selected categories and ranking tabs.

includeAllCategories

Set this to true to scrape every public category returned by Substack's leaderboard category API. Keep maxItems modest for the first run.

Example input

{
  "categorySlugs": ["technology", "business"],
  "rankings": ["paid", "rising"],
  "maxItems": 100,
  "includeAllCategories": false
}
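
If you assemble the input in code, a few sanity checks before starting a run can catch typos early. The sketch below is illustrative only: `build_run_input` is a hypothetical helper, and its checks (for example, requiring slugs unless `includeAllCategories` is set) are assumptions rather than the actor's own validation rules.

```python
ALLOWED_RANKINGS = {"paid", "rising"}


def build_run_input(category_slugs, rankings=("paid", "rising"),
                    max_items=100, include_all_categories=False):
    """Return an input dict shaped like the example above, with basic checks."""
    unknown = set(rankings) - ALLOWED_RANKINGS
    if unknown:
        raise ValueError(f"Unknown rankings: {sorted(unknown)}")
    if not isinstance(max_items, int) or max_items < 1:
        raise ValueError("max_items must be a positive integer")
    # Assumption: this sketch does not cover runs driven only by startUrls.
    if not category_slugs and not include_all_categories:
        raise ValueError("Provide category slugs or set include_all_categories")
    return {
        "categorySlugs": list(category_slugs),
        "rankings": list(rankings),
        "maxItems": max_items,
        "includeAllCategories": include_all_categories,
    }


run_input = build_run_input(["technology", "business"])
```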

Example output

{
  "category": "technology",
  "categoryName": "Technology",
  "categoryId": 4,
  "rankingType": "paid",
  "rankingLabel": "Top Bestsellers",
  "rank": 1,
  "publicationId": 6349492,
  "publicationName": "SemiAnalysis",
  "publicationUrl": "https://newsletter.semianalysis.com",
  "description": "Bridging the gap between the world's most important industry, semiconductors, and business.",
  "authorName": "Dylan Patel",
  "authorHandle": "semianalysis",
  "authorUrl": "https://substack.com/@semianalysis",
  "paidSubscriberLabel": "Thousands of paid subscribers",
  "subscriberLabel": "Hundreds of thousands of subscribers",
  "freeSubscriberCount": "287,000",
  "hasPodcast": false,
  "sourceUrl": "https://substack.com/leaderboard/technology/paid"
}
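
Note that `freeSubscriberCount` arrives as text ("287,000"), so sorting or filtering on it needs a conversion first. A minimal sketch, assuming the value is either missing or a comma-grouped integer string as in the example; `parse_count` is a hypothetical helper and returns None for any other format rather than guessing.

```python
def parse_count(text):
    """Convert a comma-grouped count such as "287,000" to an int, else None."""
    if text is None:
        return None
    digits = text.replace(",", "").strip()
    return int(digits) if digits.isdigit() else None


free_subscribers = parse_count("287,000")
```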

Tips for better results

  • Start with one or two categories.
  • Use both paid and rising when you want mature and emerging publications.
  • Use maxItems to control cost and dataset size.
  • Run the same input weekly to monitor ranking changes.
  • Combine with your CRM or spreadsheet to track outreach status.
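
For the weekly monitoring tip, one way to compare two runs is to match rows on category, ranking type, and publication URL and then diff the ranks. This is a sketch under the assumption that both snapshots are lists of dicts with the fields from the example output; `rank_changes` is a hypothetical helper.

```python
def rank_changes(previous_rows, current_rows):
    """Compare two leaderboard snapshots and report rank movement per row."""
    def key(row):
        return (row["category"], row["rankingType"], row["publicationUrl"])

    previous = {key(row): row["rank"] for row in previous_rows}
    changes = []
    for row in current_rows:
        old_rank = previous.get(key(row))
        changes.append({
            "publicationUrl": row["publicationUrl"],
            "category": row["category"],
            "rankingType": row["rankingType"],
            "rank": row["rank"],
            "previousRank": old_rank,
            # Positive delta means the publication moved up; None means new entry.
            "delta": None if old_rank is None else old_rank - row["rank"],
        })
    return changes


# Made-up snapshots for illustration.
last_week = [
    {"category": "technology", "rankingType": "paid",
     "publicationUrl": "https://a.example.com", "rank": 5},
]
this_week = [
    {"category": "technology", "rankingType": "paid",
     "publicationUrl": "https://a.example.com", "rank": 2},
    {"category": "technology", "rankingType": "paid",
     "publicationUrl": "https://b.example.com", "rank": 7},
]
changes = rank_changes(last_week, this_week)
```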

Integrations

Google Sheets

Export the dataset as CSV and import it into Google Sheets for review and tagging.
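
If you already hold the rows in memory (for example after fetching them through the API), a CSV limited to the columns you care about can also be produced locally with the standard library. A minimal sketch; `rows_to_csv` and the chosen column list are illustrative, not part of the actor.

```python
import csv
import io


def rows_to_csv(rows, fields):
    """Serialize selected fields of each row to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    for row in rows:
        # Missing keys become empty cells; fields not listed are dropped.
        writer.writerow({name: row.get(name, "") for name in fields})
    return buffer.getvalue()


sample_rows = [
    {"rank": 1, "publicationName": "SemiAnalysis",
     "publicationUrl": "https://newsletter.semianalysis.com", "hasPodcast": False},
]
csv_text = rows_to_csv(sample_rows, ["rank", "publicationName", "publicationUrl"])
```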

Airtable

Use the Apify integration to sync publication rows into an Airtable base.

CRM systems

Use publication URLs, author names, and profile URLs as enrichment inputs for sponsorship outreach.

BI dashboards

Track category rank, subscriber labels, and rising publications over time.

API usage

Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
const run = await client.actor('automation-lab/substack-leaderboard-scraper').call({
  categorySlugs: ['technology'],
  rankings: ['paid', 'rising'],
  maxItems: 50,
});
console.log(run.defaultDatasetId);

Python

import os

from apify_client import ApifyClient

client = ApifyClient(os.environ['APIFY_TOKEN'])
run = client.actor('automation-lab/substack-leaderboard-scraper').call(run_input={
    'categorySlugs': ['technology'],
    'rankings': ['paid', 'rising'],
    'maxItems': 50,
})
print(run['defaultDatasetId'])
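
The run object only carries the dataset ID, so reading the rows takes one more call. The sketch below uses the `dataset(...).iterate_items()` method of the `apify-client` package; `fetch_rows` is a hypothetical wrapper, and it accepts the client as an argument so the snippet itself makes no network calls.

```python
def fetch_rows(client, dataset_id, limit=None):
    """Collect dataset items into a list.

    `client` is expected to be an apify_client.ApifyClient instance;
    `limit=None` reads every item in the dataset.
    """
    items = client.dataset(dataset_id).iterate_items(limit=limit)
    return list(items)


# Usage, assuming `client` and `run` from the Python example above:
# rows = fetch_rows(client, run['defaultDatasetId'])
# print(rows[0]['publicationName'], rows[0]['rank'])
```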

cURL

curl -X POST "https://api.apify.com/v2/acts/automation-lab~substack-leaderboard-scraper/runs?token=$APIFY_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"categorySlugs":["technology"],"rankings":["paid"],"maxItems":25}'

MCP usage

You can use this actor through Apify MCP tools in Claude Desktop or Claude Code.

MCP URL:

https://mcp.apify.com/?tools=automation-lab/substack-leaderboard-scraper

Claude Code quick add:

claude mcp add apify-substack-leaderboard "https://mcp.apify.com/?tools=automation-lab/substack-leaderboard-scraper"

Claude Desktop / JSON MCP config:

{
  "mcpServers": {
    "apify-substack-leaderboard": {
      "url": "https://mcp.apify.com/?tools=automation-lab/substack-leaderboard-scraper"
    }
  }
}

Example prompts:

  • "Scrape the Technology and Business Substack leaderboards and summarize top sponsorship targets."
  • "Find rising Substack newsletters in Finance and return publication URLs with subscriber labels."
  • "Export top Culture newsletters and group them by author details."

Data quality notes

Substack exposes subscriber counts as labels and rounded text, not always exact numbers. The actor preserves those public labels and adds magnitude fields when Substack provides them.

Some publications may not expose a Twitter/X handle, author bio, or podcast flag. Those fields are returned as null when unavailable.
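
Because the labels are text, ranking or filtering by audience size needs a rough numeric stand-in. The sketch below maps a label to a conservative lower bound. Only "Thousands of ..." and "Hundreds of thousands of ..." appear in this document; the other phrases in the table are assumptions about how Substack words its labels, and `label_lower_bound` is a hypothetical helper that also tolerates null values.

```python
# Longer phrases come first so "hundreds of thousands" is not read as "hundreds".
_MAGNITUDES = [
    ("hundreds of thousands", 100_000),
    ("tens of thousands", 10_000),  # assumed wording
    ("thousands", 1_000),
    ("hundreds", 100),              # assumed wording
]


def label_lower_bound(label):
    """Return a conservative lower bound for a subscriber label, or None."""
    if not label:
        return None  # field was null or empty
    text = label.lower()
    for phrase, floor in _MAGNITUDES:
        if text.startswith(phrase):
            return floor
    return None  # unrecognized wording: do not guess


paid_floor = label_lower_bound("Thousands of paid subscribers")
total_floor = label_lower_bound("Hundreds of thousands of subscribers")
```

A lower bound is deliberately coarse: it supports ordering and thresholds without implying exact subscriber totals.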

FAQ

Troubleshooting

Why did I get fewer rows than maxItems?

The selected category/ranking combination may have fewer public rows than your limit, or the actor reached the end of available leaderboard pages.

Why are subscriber counts rounded?

Substack leaderboards typically show public ranges or rounded counts. The actor does not infer private exact subscriber totals.

Why was a category skipped?

Use the category slug from the public leaderboard URL. If Substack does not return that slug in its leaderboard category API, the actor skips it and logs a warning.
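
To pull the slug out of a leaderboard URL you copied from the browser, a small parser is enough. This sketch assumes the `/leaderboard/<slug>[/<ranking>]` path layout seen in the startUrls examples; `parse_leaderboard_url` is a hypothetical helper.

```python
from urllib.parse import urlparse


def parse_leaderboard_url(url):
    """Return (slug, ranking) from a leaderboard URL, or None if it does not match.

    `ranking` is None when the URL has no ranking segment.
    """
    parts = [part for part in urlparse(url).path.split("/") if part]
    if len(parts) < 2 or parts[0] != "leaderboard":
        return None
    ranking = parts[2] if len(parts) > 2 else None
    return parts[1], ranking


slug, ranking = parse_leaderboard_url("https://substack.com/leaderboard/technology/rising")
```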

Legality

This actor collects publicly available information from public Substack leaderboard endpoints. You are responsible for using the data lawfully, respecting applicable terms, privacy rules, and outreach regulations.

The actor is designed for public leaderboard data only. It does not access private dashboards, subscriber lists, paid posts, or account-only content.

Changelog

0.1

Initial version with public Substack category leaderboards, bestseller and rising ranking tabs, subscriber labels, author details, and publication URLs.

Limitations

The actor focuses on leaderboard rows. It does not scrape individual posts, paid content, private subscriber lists, or account-only dashboards.

Support

If a public Substack leaderboard category stops working, include the category slug, input JSON, run ID, and expected output when reporting the issue.