Beehiiv Newsletter Scraper - Posts & Authors
Pricing
from $5.00 / 1,000 results
Beehiiv Newsletter Scraper - Posts & Authors
Scrape public Beehiiv newsletters by publication URL, custom domain, sitemap, or post URL. Extract posts, authors, full text, HTML, markdown, images, outbound links, sponsor links, and publication metadata.
Pricing
from $5.00 / 1,000 results
Rating
0.0
(0)
Developer
Elliot Padfield
Maintained by CommunityActor stats
0
Bookmarked
1
Total users
0
Monthly active users
18 hours ago
Last modified
Categories
Share
Scrape public Beehiiv newsletters by publication URL, custom domain, sitemap, or direct post URL. This Actor extracts structured Beehiiv post data including titles, descriptions, publish dates, authors, publication metadata, full article text, HTML, markdown-like text, images, outbound links, sponsor-looking links, word count, and reading time.
Use it to monitor newsletters, research newsletter sponsorships, build content datasets, track competitor publishing, analyze Beehiiv creators, collect newsletter archives, and export Beehiiv post data to CSV, JSON, Excel, Google Sheets, Make, Zapier, or your own API workflow.
What can this Beehiiv scraper do?
- Scrape any public Beehiiv publication URL or custom domain
- Discover posts from Beehiiv publication sitemaps
- Scrape direct Beehiiv post URLs
- Extract full post content as HTML, clean text, and markdown-like text
- Extract author name, author URL, bio, image, and social links when available
- Extract publication name, URL, ID, logo, image, and social links when available
- Extract primary image and body images
- Extract outbound links from the article body
- Flag sponsor, affiliate, referral, and campaign-looking URLs
- Filter saved posts by keyword
- Filter saved posts by published date
- Deduplicate posts across publications, sitemaps, and direct URLs
- Export structured post data to Apify datasets
- Run on schedules for newsletter monitoring
- Use Apify residential proxies on every run for production reliability
What data can you extract from Beehiiv?
| Field | Description |
|---|---|
publicationName | Beehiiv publication name |
publicationUrl | Publication URL from structured metadata |
publicationId | Beehiiv publication identifier when available |
publicationLogoUrl | Publication logo URL |
publicationSocialUrls | Publication social profile URLs |
postId | Beehiiv post identifier when available |
postUrl | URL fetched by the Actor |
canonicalUrl | Canonical Beehiiv post URL |
slug | Post slug |
title | Post title |
description | Post description or excerpt |
authorName | Author name |
authorUrl | Author URL |
authorDescription | Author bio or description |
authorImageUrl | Author image URL |
datePublished | Published timestamp |
dateModified | Last modified timestamp |
isAccessibleForFree | Public/free flag from structured metadata |
imageUrl | Primary post image |
tags | Tags detected from Beehiiv tag links |
html | Full article body HTML |
text | Clean article body text |
markdown | Markdown-like article text for AI and analysis workflows |
imageUrls | Primary and embedded image URLs |
outboundUrls | Links found in the article body |
sponsorUrls | Sponsor, affiliate, referral, or campaign-looking links |
wordCount | Article word count |
readingTimeMinutes | Estimated reading time |
matchedKeywords | Keywords that matched the saved post |
contentFetched | Whether the Actor found and extracted a full article body |
scrapedAt | Timestamp when the row was saved |
How to scrape Beehiiv newsletters
- Add one or more Beehiiv publication URLs, archive URLs, custom domains, sitemap URLs, or direct post URLs.
- Set
maxPoststo control how many posts to save. - Add
keywords,dateFrom, ordateToif you only want matching posts. - Keep
includeFullContent,includeImages, andincludeLinksenabled for the richest dataset. - Run the Actor and export the dataset in JSON, CSV, Excel, XML, RSS, or HTML from Apify.
Input examples
Scrape a Beehiiv publication
{"publicationUrls": ["https://product.beehiiv.com"],"maxPosts": 100,"includeFullContent": true,"includeImages": true,"includeLinks": true}
Scrape a custom domain and filter by keyword
{"publicationUrls": ["https://www.example-newsletter.com"],"keywords": ["AI", "funding", "sponsor"],"dateFrom": "2026-01-01","maxPosts": 250}
Enrich specific Beehiiv post URLs
{"postUrls": ["https://product.beehiiv.com/p/beehiiv-mcp-v2"],"includeFullContent": true}
Output example
{"sourceType": "publication","sourceValue": "https://product.beehiiv.com","publicationName": "beehiiv Product Updates","publicationUrl": "https://product.beehiiv.com/","postId": "c5f6f5e5-...","postUrl": "https://product.beehiiv.com/p/beehiiv-mcp-v2","canonicalUrl": "https://product.beehiiv.com/p/beehiiv-mcp-v2","slug": "beehiiv-mcp-v2","title": "Introducing beehiiv MCP v2","description": "A product update from beehiiv.","authorName": "beehiiv","datePublished": "2026-05-20T12:00:00.000Z","imageUrl": "https://media.beehiiv.com/...","text": "Full article text...","markdown": "Full article text...","outboundUrls": ["https://www.beehiiv.com/..."],"sponsorUrls": [],"wordCount": 742,"readingTimeMinutes": 4,"scrapedAt": "2026-05-28T10:15:00.000Z"}
Search methods and filters
| Capability | Supported |
|---|---|
| Publication URL discovery | Yes |
| Beehiiv custom domains | Yes |
| Direct sitemap URL scraping | Yes |
| Direct post URL enrichment | Yes |
| Keyword filtering | Yes |
| Date range filtering | Yes |
| Full HTML extraction | Yes |
| Clean text extraction | Yes |
| Markdown-like text extraction | Yes |
| Author metadata | Yes |
| Publication metadata | Yes |
| Image extraction | Yes |
| Outbound link extraction | Yes |
| Sponsor or affiliate link detection | Yes |
| Word count and reading time | Yes |
| Deduplication across inputs | Yes |
| Forced Apify Residential Proxy | Yes |
Pricing
This Actor is designed for pay-per-result pricing. Each saved Beehiiv post is one billable result.
| Result type | What counts as one result |
|---|---|
| Beehiiv post | One saved post row after deduplication, keyword filtering, and date filtering |
A typical run can scrape the latest 100 Beehiiv posts from a publication in a few minutes. Failed post fetches, duplicate URLs, and posts filtered out by keyword/date are not saved as dataset items. The Actor stops saving new rows when the Apify pay-per-result charge limit is reached.
The Actor always uses Apify residential proxies. For small tests, lower maxPosts to 10 or 25. For scheduled monitoring, run daily or weekly with the same publication inputs and deduplicate by postId or canonicalUrl in your downstream workflow.
Reliability notes
The Actor is fault tolerant across sources and posts. If one sitemap or post URL is blocked, deleted, or returns a Cloudflare challenge, the run logs the failure and continues with the remaining inputs. Article body extraction uses Beehiiv's common content container first, then falls back to broader article and main-content selectors; rows where metadata is available but full article text is not will have contentFetched: false.
Why use this Actor?
Beehiiv newsletters are useful for content research, sponsorship intelligence, creator discovery, competitor monitoring, and AI-ready content analysis. This scraper helps answer questions like:
- What has a Beehiiv publication published recently?
- Which authors write for a newsletter?
- Which outbound links, sponsors, and affiliate campaigns appear in posts?
- Which newsletters mention a brand, topic, product, or competitor?
- How long are posts, and how frequently does a publication publish?
- Which Beehiiv posts are best suited for content analysis or lead research?
Because it runs on Apify, you also get scheduling, API access, datasets, webhooks, proxy rotation, and integrations without maintaining your own server.