Medium Scraper
Pricing: Pay per usage
Developer: Ricardo Akiyoshi
Medium Article Scraper
Scrape Medium articles by search term, tag, or author profile. This actor extracts comprehensive article data including title, subtitle, full text content, author information, publication details, claps, responses, reading time, tags, and member-only status.
Built with Crawlee and CheerioCrawler for fast, low-resource HTML scraping. Uses 4 independent extraction strategies (JSON-LD, Apollo state, DOM parsing, and meta tags) to maximize data reliability across Medium's dynamic layouts.
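The fallback idea behind the multiple extraction strategies can be sketched in Python. This is a toy illustration only — the actor itself is built on Crawlee and CheerioCrawler in JavaScript, and `from_json_ld`, `from_meta`, and the sample `HTML` string are hypothetical, not the actor's code. The point is the merge: each strategy contributes whatever fields it can find, and later strategies override earlier ones.

```python
import json
import re

# Minimal stand-in for a scraped Medium page (hypothetical markup).
HTML = """
<html><head>
<script type="application/ld+json">{"@type": "Article", "headline": "Sample Post", "datePublished": "2024-03-15"}</script>
<meta property="og:title" content="Sample Post">
<meta property="og:description" content="A short subtitle">
</head></html>
"""

def from_json_ld(html):
    """Strategy 1: pull structured data from the JSON-LD <script> block."""
    m = re.search(r'<script type="application/ld\+json">(.*?)</script>', html, re.S)
    if not m:
        return {}
    data = json.loads(m.group(1))
    return {"title": data.get("headline"), "publishDate": data.get("datePublished")}

def from_meta(html):
    """Strategy 2: fall back to Open Graph <meta> tags."""
    out = {}
    for prop, key in [("og:title", "title"), ("og:description", "subtitle")]:
        m = re.search(rf'<meta property="{prop}" content="([^"]*)"', html)
        if m:
            out[key] = m.group(1)
    return out

# Run all strategies; later (higher-priority) strategies override earlier ones,
# and empty values never clobber data already found.
article = {}
for strategy in (from_meta, from_json_ld):
    article.update({k: v for k, v in strategy(HTML).items() if v})

print(article)
```

Merging per-field rather than per-strategy is what lets one strategy fill gaps another leaves, which is why the output's `extractionMethods` field can list several entries for a single article.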
Features
- Search by keyword — Find articles matching any search term
- Filter by tag — Browse Medium tags like `javascript`, `data-science`, or `startup`
- Scrape author profiles — Extract all articles from a specific Medium author
- Full article text — Extracts the complete article body (or excerpt for member-only content)
- Member-only detection — Identifies paywalled articles automatically
- 4 extraction strategies — JSON-LD, Apollo/inline scripts, DOM, and meta tags
- Deduplication — Automatically removes duplicate articles by URL and title
- Proxy support — Rotate IPs to avoid rate limiting on large scrapes
- Pay-per-event pricing — Only pay for articles successfully scraped ($0.003/article)
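The deduplication step above can be sketched as follows. This is an illustrative guess at the approach, not the actor's actual code: `dedupe` is a hypothetical helper, and the field names (`articleUrl`, `title`) mirror the output schema documented below.

```python
def dedupe(articles):
    """Keep the first occurrence of each URL and each title (case-insensitive)."""
    seen_urls, seen_titles, unique = set(), set(), []
    for a in articles:
        url = (a.get("articleUrl") or "").rstrip("/").lower()
        title = (a.get("title") or "").strip().lower()
        if url in seen_urls or title in seen_titles:
            continue  # duplicate by URL or by title — skip it
        seen_urls.add(url)
        seen_titles.add(title)
        unique.append(a)
    return unique

sample = [
    {"articleUrl": "https://medium.com/@a/post-1", "title": "Post One"},
    {"articleUrl": "https://medium.com/@a/post-1/", "title": "Post One"},   # same URL, trailing slash
    {"articleUrl": "https://medium.com/@b/post-2", "title": "post one"},    # same title, different case
    {"articleUrl": "https://medium.com/@c/post-3", "title": "Post Three"},
]
print(len(dedupe(sample)))  # 2 unique articles survive
```

Checking both URL and title catches the common case where Medium serves the same article under slightly different URLs (trailing slashes, tracking parameters).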
Use Cases
Content Research & Ideation
Find trending articles in your niche to discover what topics resonate with readers. Analyze titles, subtitles, and tags to inform your content strategy and identify gaps in existing coverage.
Trend Analysis & Market Intelligence
Track how specific topics evolve over time on Medium. Monitor clap counts and response rates to gauge audience engagement. Identify emerging trends before they go mainstream.
Competitive Analysis
Study what competitors and industry leaders are publishing. Analyze their posting frequency, popular topics, engagement metrics, and which publications they write for. Benchmark your content performance.
Academic & Journalistic Research
Collect articles on a specific subject for literature reviews, background research, or sourcing. Extract full text and metadata for structured analysis.
SEO & Content Marketing
Discover high-performing content formats and topics. Analyze which tags drive the most engagement. Find potential collaboration opportunities with popular Medium authors.
Data Science & NLP Training
Build datasets of categorized, tagged articles for natural language processing, sentiment analysis, topic modeling, or text classification projects.
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| searchTerm | string | Yes | — | Keyword or phrase to search for on Medium |
| tag | string | No | — | Medium tag to filter by (e.g., `programming`, `data-science`) |
| author | string | No | — | Author username without `@` (e.g., `elonmusk`) |
| maxResults | integer | No | 50 | Maximum articles to scrape (1–5,000) |
| sortBy | enum | No | relevance | Sort order: `relevance`, `recent`, or `popular` |
| proxyConfiguration | object | No | Apify Proxy | Proxy settings for IP rotation |

Note: At least one of `searchTerm`, `tag`, or `author` must be provided. You can combine all three for more targeted results.
Output Schema
Each scraped article produces a JSON object with these fields:
{"title": "How I Built a Million-Dollar SaaS in 12 Months","subtitle": "A step-by-step breakdown of the strategy that worked","author": "Jane Developer","authorUrl": "https://medium.com/@janedev","publication": "Better Programming","publishDate": "2024-03-15T00:00:00.000Z","readingTime": "8 min read","claps": 4200,"responses": 87,"content": "Full article text extracted from the page...","tags": ["startup", "saas", "entrepreneurship"],"imageUrl": "https://miro.medium.com/v2/resize:fit:1200/image.jpeg","articleUrl": "https://medium.com/@janedev/how-i-built-a-million-dollar-saas-abc123def456","memberOnly": false,"extractionMethods": ["json-ld", "dom", "meta-tags"],"scrapedAt": "2024-03-20T14:30:00.000Z"}
Field Reference
| Field | Type | Description |
|---|---|---|
| title | string | Article headline |
| subtitle | string | Article subtitle or description |
| author | string | Author display name |
| authorUrl | string | Link to author's Medium profile |
| publication | string | Publication name (e.g., "Towards Data Science") |
| publishDate | string | ISO 8601 publication date |
| readingTime | string | Estimated reading time (e.g., "5 min read") |
| claps | integer | Number of claps (likes) |
| responses | integer | Number of comments/responses |
| content | string | Full article text (may be truncated for member-only articles) |
| tags | array | List of topic tags |
| imageUrl | string | Featured/header image URL |
| articleUrl | string | Canonical article URL |
| memberOnly | boolean | Whether the article is behind Medium's paywall |
| extractionMethods | array | Which extraction strategies produced data |
| scrapedAt | string | ISO 8601 timestamp of when the article was scraped |
Code Examples
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run_input = {
    "searchTerm": "machine learning",
    "tag": "artificial-intelligence",
    "maxResults": 100,
    "sortBy": "popular",
    "proxyConfiguration": {"useApifyProxy": True},
}

run = client.actor("sovereigntaylor/medium-scraper").call(run_input=run_input)
print(f"Scraping complete. Dataset ID: {run['defaultDatasetId']}")

# Iterate over results
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['title']} — {item['claps']} claps — {item['readingTime']}")
```
Python — Export to Pandas DataFrame
```python
from apify_client import ApifyClient
import pandas as pd

client = ApifyClient("YOUR_API_TOKEN")

run_input = {
    "searchTerm": "startup advice",
    "maxResults": 200,
    "sortBy": "popular",
}

run = client.actor("sovereigntaylor/medium-scraper").call(run_input=run_input)
items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
df = pd.DataFrame(items)

# Analyze engagement
print(f"Total articles: {len(df)}")
print(f"Average claps: {df['claps'].mean():.0f}")
print(f"Top tags: {df['tags'].explode().value_counts().head(10)}")

# Export
df.to_csv("medium_articles.csv", index=False)
```
JavaScript (Node.js)
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const input = {
    searchTerm: 'web development',
    tag: 'javascript',
    maxResults: 50,
    sortBy: 'recent',
    proxyConfiguration: { useApifyProxy: true },
};

const run = await client.actor('sovereigntaylor/medium-scraper').call(input);
console.log(`Dataset ID: ${run.defaultDatasetId}`);

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    console.log(`${item.title} by ${item.author} (${item.claps} claps)`);
}
```
JavaScript — Filter Member-Only Articles
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('sovereigntaylor/medium-scraper').call({
    searchTerm: 'product management',
    maxResults: 100,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
const freeArticles = items.filter(item => !item.memberOnly);
const paidArticles = items.filter(item => item.memberOnly);

console.log(`Free articles: ${freeArticles.length}`);
console.log(`Member-only articles: ${paidArticles.length}`);

// Top free articles by claps
freeArticles
    .sort((a, b) => b.claps - a.claps)
    .slice(0, 10)
    .forEach((a, i) => console.log(`${i + 1}. ${a.title} (${a.claps} claps)`));
```
Tips & Best Practices
- Use proxies for large scrapes. Medium may rate-limit or block datacenter IPs. Enable Apify Proxy with residential IPs for scrapes over 100 articles.
- Combine sources for broader coverage. Use `searchTerm` + `tag` + `author` together to cast a wider net and get more diverse results.
- Start small. Test with `maxResults: 10` first to verify the output format meets your needs before running large scrapes.
- Member-only content. The scraper can detect member-only articles but may only extract partial content (subtitle/excerpt) for paywalled articles.
- Rate limiting. The scraper automatically throttles requests to avoid being blocked. Increasing `maxResults` will increase run time proportionally.
- Tag formatting. Tags should be lowercase with hyphens (e.g., `data-science`, not `Data Science`). The scraper normalizes tags automatically.
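Tag normalization of the kind described in the last tip can be sketched in a few lines. This is an illustrative guess, not the actor's actual implementation; `normalize_tag` is a hypothetical helper.

```python
import re

def normalize_tag(tag):
    """Lowercase, trim, and hyphenate a tag, e.g. 'Data Science' -> 'data-science'."""
    tag = tag.strip().lower()
    tag = re.sub(r"[^a-z0-9]+", "-", tag)  # collapse spaces/punctuation into hyphens
    return tag.strip("-")

print(normalize_tag("Data Science"))   # data-science
print(normalize_tag("  JavaScript "))  # javascript
```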
FAQ
Q: Can this scraper extract full text from member-only articles?
A: The scraper extracts whatever content is available in the HTML. For member-only articles, this is typically the subtitle and first few paragraphs. Full text requires a Medium membership session, which this scraper does not support.

Q: How many articles can I scrape per run?
A: Up to 5,000 articles per run. For larger datasets, run the actor multiple times with different search terms or time ranges.

Q: Why are some fields empty?
A: Medium's page structure varies across different article templates, publications, and A/B tests. The scraper uses 4 extraction strategies to maximize coverage, but some fields may not be available on every article.

Q: How much does it cost?
A: The actor uses pay-per-event pricing at $0.003 per article scraped. A 100-article scrape costs $0.30. You also pay standard Apify platform compute costs (typically $0.01–0.05 per run depending on duration).
Q: Can I scrape articles from custom domain publications?
A: Yes, if the publication uses a custom domain (e.g., blog.company.com) but is hosted on Medium, the scraper can extract articles from those pages as well when they appear in search results.
Q: How often can I run this scraper?
A: You can schedule it to run as often as needed. For monitoring use cases, daily or weekly runs are common. Use Apify's scheduling feature to automate recurring scrapes.

Q: Does this work with Medium's API?
A: No. Medium's official API is very limited and does not support search. This scraper works by parsing the public HTML pages, which provides much richer data.
Pricing
This actor uses Pay Per Event pricing.
| Event | Price |
|---|---|
| Article scraped | $0.003 |
You only pay for articles that are successfully extracted and saved to your dataset. Failed or skipped articles are not charged.
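The per-run event cost is simple arithmetic. A quick sketch (the `estimate_cost` helper is illustrative, and it covers only the per-article event fee, not platform compute):

```python
PRICE_PER_ARTICLE = 0.003  # pay-per-event price from the table above

def estimate_cost(articles):
    """Event cost for a run that successfully scrapes `articles` articles."""
    return articles * PRICE_PER_ARTICLE

for n in (10, 100, 1000, 5000):
    print(f"{n:>5} articles ≈ ${estimate_cost(n):.2f}")
```

For example, a maximum-size run of 5,000 articles costs $15.00 in event fees, plus the usual compute charge.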
Integration — Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("sovereigntaylor/medium-scraper").call(
    run_input={"searchTerm": "medium", "maxResults": 50}
)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item.get('title', item.get('name', 'N/A'))}")
```
Integration — JavaScript
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('sovereigntaylor/medium-scraper').call({
    searchTerm: 'medium',
    maxResults: 50,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(item => console.log(item.title || item.name || 'N/A'));
```
Related Actors
- Hacker News Scraper — Scrape HN stories, scores, and comments
- Reddit Scraper — Extract Reddit posts and comments from any subreddit
- Google Search Scraper — Scrape Google search results for any query
- Website to Markdown — Convert any webpage to clean Markdown for AI/RAG
Changelog
v1.0 (2026-03-02)
- Initial release
- Search, tag, and author scraping
- 4 extraction strategies (JSON-LD, Apollo, DOM, meta tags)
- Member-only detection
- Deduplication by URL and title
- Pay-per-event billing
License
MIT License. Built by Sovereign AI.