Beehiiv Newsletter Scraper - Posts & Authors

Pricing

from $5.00 / 1,000 results

Scrape public Beehiiv newsletters by publication URL, custom domain, sitemap, or post URL. Extract posts, authors, full text, HTML, markdown, images, outbound links, sponsor links, and publication metadata.

Developer

Elliot Padfield

Maintained by Community

Scrape public Beehiiv newsletters by publication URL, custom domain, sitemap, or direct post URL. This Actor extracts structured Beehiiv post data including titles, descriptions, publish dates, authors, publication metadata, full article text, HTML, markdown-like text, images, outbound links, likely sponsor links, word count, and reading time.

Use it to monitor newsletters, research newsletter sponsorships, build content datasets, track competitor publishing, analyze Beehiiv creators, collect newsletter archives, and export Beehiiv post data to CSV, JSON, Excel, Google Sheets, Make, Zapier, or your own API workflow.

What can this Beehiiv scraper do?

  • Scrape any public Beehiiv publication URL or custom domain
  • Discover posts from Beehiiv publication sitemaps
  • Scrape direct Beehiiv post URLs
  • Extract full post content as HTML, clean text, and markdown-like text
  • Extract author name, author URL, bio, image, and social links when available
  • Extract publication name, URL, ID, logo, image, and social links when available
  • Extract primary image and body images
  • Extract outbound links from the article body
  • Flag URLs that look like sponsor, affiliate, referral, or campaign links
  • Filter saved posts by keyword
  • Filter saved posts by published date
  • Deduplicate posts across publications, sitemaps, and direct URLs
  • Export structured post data to Apify datasets
  • Run on schedules for newsletter monitoring
  • Use Apify residential proxies on every run for production reliability

What data can you extract from Beehiiv?

| Field | Description |
| --- | --- |
| publicationName | Beehiiv publication name |
| publicationUrl | Publication URL from structured metadata |
| publicationId | Beehiiv publication identifier when available |
| publicationLogoUrl | Publication logo URL |
| publicationSocialUrls | Publication social profile URLs |
| postId | Beehiiv post identifier when available |
| postUrl | URL fetched by the Actor |
| canonicalUrl | Canonical Beehiiv post URL |
| slug | Post slug |
| title | Post title |
| description | Post description or excerpt |
| authorName | Author name |
| authorUrl | Author URL |
| authorDescription | Author bio or description |
| authorImageUrl | Author image URL |
| datePublished | Published timestamp |
| dateModified | Last modified timestamp |
| isAccessibleForFree | Public/free flag from structured metadata |
| imageUrl | Primary post image |
| tags | Tags detected from Beehiiv tag links |
| html | Full article body HTML |
| text | Clean article body text |
| markdown | Markdown-like article text for AI and analysis workflows |
| imageUrls | Primary and embedded image URLs |
| outboundUrls | Links found in the article body |
| sponsorUrls | Links that look like sponsor, affiliate, referral, or campaign links |
| wordCount | Article word count |
| readingTimeMinutes | Estimated reading time in minutes |
| matchedKeywords | Keywords that matched the saved post |
| contentFetched | Whether the Actor found and extracted a full article body |
| scrapedAt | Timestamp when the row was saved |

How to scrape Beehiiv newsletters

  1. Add one or more Beehiiv publication URLs, archive URLs, custom domains, sitemap URLs, or direct post URLs.
  2. Set maxPosts to control how many posts to save.
  3. Add keywords, dateFrom, or dateTo if you only want matching posts.
  4. Keep includeFullContent, includeImages, and includeLinks enabled for the richest dataset.
  5. Run the Actor and export the dataset in JSON, CSV, Excel, XML, RSS, or HTML from Apify.
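
The steps above can also be scripted. The following is a minimal sketch, assuming the official `apify-client` Python package is installed and an API token is available in an `APIFY_TOKEN` environment variable. The helper names `build_run_input` and `run_actor` are hypothetical, and the Actor ID is a placeholder because this page does not state it.

```python
import os


def build_run_input(publication_urls, max_posts=100, keywords=None,
                    date_from=None, date_to=None):
    """Assemble an Actor input dict from the fields named in the steps above."""
    run_input = {
        "publicationUrls": list(publication_urls),
        "maxPosts": max_posts,
        "includeFullContent": True,
        "includeImages": True,
        "includeLinks": True,
    }
    # Optional filters are only included when they are set.
    if keywords:
        run_input["keywords"] = list(keywords)
    if date_from:
        run_input["dateFrom"] = date_from
    if date_to:
        run_input["dateTo"] = date_to
    return run_input


def run_actor(actor_id, run_input):
    """Start the Actor, wait for it to finish, and return the saved rows."""
    # Imported lazily so build_run_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Example call (placeholder Actor ID, replace with the real one from the Store):
# rows = run_actor("<username>/<actor-name>",
#                  build_run_input(["https://product.beehiiv.com"], max_posts=10))
```

Starting with a small `maxPosts` keeps a first test run cheap, since each saved post is billed as one result.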

Input examples

Scrape a Beehiiv publication

{
  "publicationUrls": ["https://product.beehiiv.com"],
  "maxPosts": 100,
  "includeFullContent": true,
  "includeImages": true,
  "includeLinks": true
}

Scrape a custom domain and filter by keyword

{
  "publicationUrls": ["https://www.example-newsletter.com"],
  "keywords": ["AI", "funding", "sponsor"],
  "dateFrom": "2026-01-01",
  "maxPosts": 250
}

Enrich specific Beehiiv post URLs

{
  "postUrls": [
    "https://product.beehiiv.com/p/beehiiv-mcp-v2"
  ],
  "includeFullContent": true
}

Output example

{
  "sourceType": "publication",
  "sourceValue": "https://product.beehiiv.com",
  "publicationName": "beehiiv Product Updates",
  "publicationUrl": "https://product.beehiiv.com/",
  "postId": "c5f6f5e5-...",
  "postUrl": "https://product.beehiiv.com/p/beehiiv-mcp-v2",
  "canonicalUrl": "https://product.beehiiv.com/p/beehiiv-mcp-v2",
  "slug": "beehiiv-mcp-v2",
  "title": "Introducing beehiiv MCP v2",
  "description": "A product update from beehiiv.",
  "authorName": "beehiiv",
  "datePublished": "2026-05-20T12:00:00.000Z",
  "imageUrl": "https://media.beehiiv.com/...",
  "text": "Full article text...",
  "markdown": "Full article text...",
  "outboundUrls": ["https://www.beehiiv.com/..."],
  "sponsorUrls": [],
  "wordCount": 742,
  "readingTimeMinutes": 4,
  "scrapedAt": "2026-05-28T10:15:00.000Z"
}
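
For sponsorship research, rows shaped like the output above can be summarized by sponsor domain. This is a rough sketch that relies only on the `sponsorUrls` field; the function name `sponsor_domains` is hypothetical, and it counts each domain at most once per post.

```python
from collections import Counter
from urllib.parse import urlsplit


def sponsor_domains(rows):
    """Count how many posts link to each sponsor-looking domain."""
    counts = Counter()
    for row in rows:
        # A set ensures a domain repeated within one post is counted once.
        domains = {urlsplit(url).hostname for url in row.get("sponsorUrls") or []}
        # URLs without a parseable host yield None, so drop that entry.
        domains.discard(None)
        counts.update(domains)
    return counts
```

Calling `sponsor_domains(rows).most_common(10)` would then list the ten domains that appear in the most posts.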

Search methods and filters

| Capability | Supported |
| --- | --- |
| Publication URL discovery | Yes |
| Beehiiv custom domains | Yes |
| Direct sitemap URL scraping | Yes |
| Direct post URL enrichment | Yes |
| Keyword filtering | Yes |
| Date range filtering | Yes |
| Full HTML extraction | Yes |
| Clean text extraction | Yes |
| Markdown-like text extraction | Yes |
| Author metadata | Yes |
| Publication metadata | Yes |
| Image extraction | Yes |
| Outbound link extraction | Yes |
| Sponsor or affiliate link detection | Yes |
| Word count and reading time | Yes |
| Deduplication across inputs | Yes |
| Forced Apify Residential Proxy | Yes |

Pricing

This Actor is designed for pay-per-result pricing. Each saved Beehiiv post is one billable result.

| Result type | What counts as one result |
| --- | --- |
| Beehiiv post | One saved post row after deduplication, keyword filtering, and date filtering |

A typical run can scrape the latest 100 Beehiiv posts from a publication in a few minutes. Failed post fetches, duplicate URLs, and posts filtered out by keyword/date are not saved as dataset items. The Actor stops saving new rows when the Apify pay-per-result charge limit is reached.
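
As a rough budgeting aid, the pay-per-result arithmetic can be sketched as below. It assumes the listed starting rate of $5.00 per 1,000 results applies to your plan; the actual rate may be higher, and platform costs outside the per-result charge are not modeled. Both function names are hypothetical.

```python
def estimate_cost_usd(saved_rows, price_per_1000=5.00):
    """Estimated charge for a run that saves `saved_rows` posts."""
    return saved_rows * price_per_1000 / 1000


def max_rows_for_budget(budget_usd, price_per_1000=5.00):
    """Largest number of saved rows that fits within `budget_usd`."""
    return int(budget_usd * 1000 // price_per_1000)
```

At the assumed rate, `estimate_cost_usd(100)` gives 0.5 (about $0.50 for 100 saved posts), and `max_rows_for_budget(2.00)` gives 400.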

The Actor always uses Apify residential proxies. For small tests, lower maxPosts to 10 or 25. For scheduled monitoring, run daily or weekly with the same publication inputs and deduplicate by postId or canonicalUrl in your downstream workflow.
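
A minimal sketch of that downstream deduplication, assuming each run's rows are loaded as dicts and that you persist the `seen` set between runs yourself (for example in a file or database). The function name `dedupe_posts` is hypothetical.

```python
def dedupe_posts(rows, seen=None):
    """Return rows whose postId and canonicalUrl have not been seen before.

    `seen` is updated in place so it can be reused across scheduled runs.
    """
    seen = set() if seen is None else seen
    fresh = []
    for row in rows:
        # Collect whichever identifiers this row actually has.
        keys = {row[field] for field in ("postId", "canonicalUrl") if row.get(field)}
        if keys & seen:
            continue  # Already stored under at least one identifier.
        seen |= keys
        fresh.append(row)
    return fresh
```

Tracking both identifiers means a post is still recognized if one run returns a `postId` and a later run returns only the `canonicalUrl`. Rows with neither identifier are kept, since they cannot be compared.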

Reliability notes

The Actor is fault-tolerant across sources and posts. If one sitemap or post URL is blocked, deleted, or returns a Cloudflare challenge, the run logs the failure and continues with the remaining inputs. Article body extraction tries Beehiiv's common content container first, then falls back to broader article and main-content selectors. Rows where metadata is available but the full article text is not are saved with contentFetched: false.
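
One way to follow up on incomplete rows is to collect their URLs into a new `postUrls` input and run the Actor again. This is only a sketch: the helper name `build_retry_input` is hypothetical, and a retry that saves rows would presumably be billed like any other run.

```python
def build_retry_input(rows):
    """Build a postUrls input for rows saved with contentFetched: false.

    Returns None when there is nothing to retry.
    """
    urls = []
    for row in rows:
        url = row.get("postUrl")
        # Only rows explicitly marked as missing their article body are retried.
        if row.get("contentFetched") is False and url and url not in urls:
            urls.append(url)
    if not urls:
        return None
    return {"postUrls": urls, "includeFullContent": True}
```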

Why use this Actor?

Beehiiv newsletters are useful for content research, sponsorship intelligence, creator discovery, competitor monitoring, and AI-ready content analysis. This scraper helps answer questions like:

  • What has a Beehiiv publication published recently?
  • Which authors write for a newsletter?
  • Which outbound links, sponsors, and affiliate campaigns appear in posts?
  • Which newsletters mention a brand, topic, product, or competitor?
  • How long are posts, and how frequently does a publication publish?
  • Which Beehiiv posts are best suited for content analysis or lead research?
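
As an illustration of the post length and publishing frequency question, the sketch below computes simple statistics from exported rows. It assumes `datePublished` is an ISO 8601 timestamp like the one in the output example and that `wordCount` is numeric; the function names are hypothetical.

```python
from datetime import datetime
from statistics import mean


def parse_ts(value):
    """Parse an ISO 8601 timestamp such as 2026-05-20T12:00:00.000Z."""
    # Replace the trailing Z so older Python versions can parse it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def publishing_stats(rows):
    """Average word count and average days between consecutive posts."""
    dated = sorted(parse_ts(r["datePublished"]) for r in rows if r.get("datePublished"))
    gaps = [(b - a).total_seconds() / 86400 for a, b in zip(dated, dated[1:])]
    words = [r["wordCount"] for r in rows if r.get("wordCount") is not None]
    return {
        "posts": len(rows),
        "avgWordCount": mean(words) if words else None,
        "avgDaysBetweenPosts": mean(gaps) if gaps else None,
    }
```

The averages are `None` when there is not enough data, for example when fewer than two rows carry a publish date.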

Because it runs on Apify, you also get scheduling, API access, datasets, webhooks, proxy rotation, and integrations without maintaining your own server.