# Medium Article Scraper (`crawlerbros/medium-scraper`) Actor

Scrape Medium articles by tag/topic, user, publication, or search query. Extracts title, author, tags, preview text, reading time, publish date, and paywall status, all via public RSS feeds and metadata.

- **URL**: https://apify.com/crawlerbros/medium-scraper.md
- **Developed by:** [Crawler Bros](https://apify.com/crawlerbros) (community)
- **Categories:** News, Developer tools, Other
- **Stats:** 1 total users, 0 monthly users, 100.0% runs succeeded, 11 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $5.00 / 1,000 results

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.
Since this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Medium Article Scraper

Scrape articles from [Medium](https://medium.com) by tag/topic, author, publication, keyword search, or a single article URL — all without authentication.

### Features

- **By Tag** — fetch latest articles for any Medium topic (`programming`, `data-science`, `artificial-intelligence`, and 17 other built-in tags, or any custom slug)
- **By User** — fetch all articles published by a Medium username
- **By Publication** — fetch articles from any Medium publication (e.g. `better-programming`, `towards-data-science`)
- **Search** — search Medium by keyword; supplements with tag-feed results for broader coverage
- **By URL** — extract metadata from a single Medium article URL
- Filter results by publish date, keyword match, and paywall status
- Detects member-only (paywalled) articles — never bypasses the paywall

### Output Fields

| Field | Type | Description |
|-------|------|-------------|
| `articleId` | string | Unique Medium article ID extracted from URL |
| `title` | string | Article title |
| `previewText` | string | First 300 chars of article preview/description |
| `authorName` | string | Author's display name |
| `authorUsername` | string | Medium username (without @) |
| `publicationName` | string | Publication name (if published under one) |
| `publicationSlug` | string | Publication URL slug |
| `publishedDate` | string | ISO-8601 publish date |
| `tags` | array | Article tags/categories |
| `readingTimeMinutes` | integer | Estimated reading time in minutes |
| `articleUrl` | string | Full article URL |
| `canonicalUrl` | string | Canonical URL (if different from articleUrl) |
| `isPaywalled` | boolean | Whether the article is member-only |
| `recordType` | string | Always `"article"` |
| `siteName` | string | Always `"Medium"` |
| `scrapedAt` | string | ISO-8601 scrape timestamp |
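
For typed downstream code, the table above maps naturally onto a `TypedDict`. The following is a minimal sketch rather than anything shipped with the Actor: the names `MediumArticle` and `reading_minutes_by_author` are hypothetical, and it assumes any field may be missing from a given record.

```python
from collections import defaultdict
from typing import TypedDict


class MediumArticle(TypedDict, total=False):
    """Shape of one dataset item, mirroring the Output Fields table."""

    articleId: str
    title: str
    previewText: str
    authorName: str
    authorUsername: str
    publicationName: str
    publicationSlug: str
    publishedDate: str
    tags: list[str]
    readingTimeMinutes: int
    articleUrl: str
    canonicalUrl: str
    isPaywalled: bool
    recordType: str
    siteName: str
    scrapedAt: str


def reading_minutes_by_author(items: list[MediumArticle]) -> dict[str, int]:
    """Sum the estimated reading time per author username."""
    totals: dict[str, int] = defaultdict(int)
    for item in items:
        # Records without an author are grouped under "unknown" (an assumption).
        author = item.get("authorUsername") or "unknown"
        totals[author] += item.get("readingTimeMinutes") or 0
    return dict(totals)
```

Declaring the class with `total=False` keeps the type honest about optional fields such as `publicationName`, which is only present for articles published under a publication.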

### Input Options

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `mode` | select | `byTag` | What to scrape: `byTag`, `byUser`, `byPublication`, `search`, `byUrl` |
| `tag` | select | `programming` | Tag slug for `byTag` mode (choose from 20 built-in tags or type a custom slug) |
| `username` | string | — | Medium username for `byUser` mode |
| `publication` | string | — | Publication slug for `byPublication` mode |
| `query` | string | — | Search terms for `search` mode |
| `articleUrl` | string | — | Article URL for `byUrl` mode |
| `fromDate` | string | — | Only articles published on/after this date (ISO-8601) |
| `containsKeyword` | string | — | Case-insensitive keyword filter on title/preview |
| `excludePaywalled` | boolean | `false` | Skip member-only articles |
| `maxItems` | integer | `20` | Maximum articles to return (1–1000) |
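
Each mode reads one mode-specific field from the table above, so a small helper can catch mismatched input before a run is started. This is a sketch under assumptions: `build_input` and `REQUIRED_FIELD_BY_MODE` are hypothetical names, and the pairing of modes to fields is inferred from the descriptions above (the input schema itself only marks `mode` as required).

```python
# Mode-to-field pairing inferred from the Input Options table (an assumption).
REQUIRED_FIELD_BY_MODE = {
    "byTag": "tag",
    "byUser": "username",
    "byPublication": "publication",
    "search": "query",
    "byUrl": "articleUrl",
}


def build_input(mode: str, value: str, **options) -> dict:
    """Return an Actor input dict for `mode`, validating the basics locally."""
    if mode not in REQUIRED_FIELD_BY_MODE:
        raise ValueError(f"Unknown mode: {mode!r}")

    value = value.strip()
    if mode == "byUser":
        # The username is documented as given without the @ symbol.
        value = value.lstrip("@")
    if not value:
        raise ValueError(f"Mode {mode!r} needs a non-empty value")

    max_items = options.get("maxItems")
    if max_items is not None and not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")

    return {"mode": mode, REQUIRED_FIELD_BY_MODE[mode]: value, **options}
```

For example, `build_input("byTag", "python", excludePaywalled=True)` produces a dict that can be passed as the run input in the client examples in the API section.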

### Example Use Cases

**Get latest programming articles:**
```json
{"mode": "byTag", "tag": "programming", "maxItems": 20}
````

**Get articles by a specific author:**

```json
{"mode": "byUser", "username": "towardsdatascience", "maxItems": 50}
```

**Get free articles about Python published on or after January 1, 2025:**

```json
{"mode": "byTag", "tag": "python", "fromDate": "2025-01-01", "excludePaywalled": true}
```

**Search for AI articles:**

```json
{"mode": "search", "query": "artificial intelligence", "maxItems": 30}
```

### FAQs

**Does this scrape paywalled article content?**
No. This Actor only collects publicly available metadata (title, author, tags, preview text). Full article content for member-only posts is never extracted. Use `isPaywalled` to identify and filter such articles.

**How many articles are returned per tag/user?**
Medium RSS feeds typically return only the latest 10–25 items. For more results, use the `search` mode, which supplements its results with tag feed data.
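
Another way to widen coverage is to run the Actor several times (for example, once per related tag) and merge the datasets on your side. The sketch below assumes that `articleId` identifies an article across runs and falls back to `articleUrl`; the helper name `merge_unique` is hypothetical.

```python
def merge_unique(*batches: list[dict]) -> list[dict]:
    """Merge several result lists, keeping the first copy of each article."""
    seen: set[str] = set()
    merged: list[dict] = []
    for batch in batches:
        for item in batch:
            key = item.get("articleId") or item.get("articleUrl")
            if key is not None:
                if key in seen:
                    continue  # already collected from an earlier batch
                seen.add(key)
            # Items with no usable key are kept, since they cannot be compared.
            merged.append(item)
    return merged
```

Order is preserved, so placing the most important run first decides which copy of a duplicate survives.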

**Why does search mode return fewer results than expected?**
Medium's public search does not expose a paginated JSON API. The Actor combines the search page HTML with supplementary tag feed results to maximize coverage within Medium's public data surface.

**Is a Medium account required?**
No authentication or cookies are required.

**What tags are available in the dropdown?**
The dropdown offers 20 popular tags (`programming`, `data-science`, `artificial-intelligence`, `python`, `javascript`, `startup`, etc.). You can also type any custom tag slug directly.

# Actor input Schema

## `mode` (type: `string`):

What to scrape from Medium.

## `tag` (type: `string`):

Tag slug to fetch articles for. Choose from the dropdown or type a custom slug (e.g. `python`, `startups`, `ux`).

## `username` (type: `string`):

Medium username without the @ symbol (e.g. `towardsdatascience`).

## `publication` (type: `string`):

Medium publication slug as it appears in the URL (e.g. `better-programming`, `towards-data-science`).

## `query` (type: `string`):

Keywords to search for on Medium.

## `articleUrl` (type: `string`):

Full URL of a single Medium article (e.g. `https://medium.com/@username/article-title-abc123`).
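
To match a URL against the `articleId` of previously scraped records, the ID can often be read off the URL itself. The sketch below rests on an assumption that this document does not confirm: that the ID is the segment after the last hyphen of the final path component, as the `abc123` in the example URL suggests. The function name `guess_article_id` is hypothetical, and the Actor may derive the ID differently.

```python
from typing import Optional
from urllib.parse import urlparse


def guess_article_id(article_url: str) -> Optional[str]:
    """Guess the article ID from a Medium article URL (heuristic)."""
    path = urlparse(article_url).path.rstrip("/")
    slug = path.rsplit("/", 1)[-1]  # last path component, e.g. "article-title-abc123"
    if not slug:
        return None
    # Assumption: the ID is whatever follows the last hyphen in the slug.
    return slug.rsplit("-", 1)[-1]
```

Query strings and fragments are ignored because only the parsed path is inspected.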

## `fromDate` (type: `string`):

Only include articles published on or after this date (e.g. `2024-01-01`). Applied after fetching.

## `containsKeyword` (type: `string`):

Only include articles whose title or preview text contains this keyword (case-insensitive).

## `excludePaywalled` (type: `boolean`):

If true, skip articles that appear to be member-only (paywalled).

## `maxItems` (type: `integer`):

Maximum number of articles to return.
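
The filter parameters above can also be reproduced on the client side, for instance to re-filter a dataset without running the Actor again. The following is a sketch of the documented semantics, not the Actor's actual code: `keep_article` and `apply_filters` are hypothetical names, and the date comparison assumes `publishedDate` starts with `YYYY-MM-DD` and ignores time zones.

```python
from datetime import date
from typing import Optional


def keep_article(
    item: dict,
    from_date: Optional[str] = None,
    contains_keyword: Optional[str] = None,
    exclude_paywalled: bool = False,
) -> bool:
    """Return True if `item` passes all of the requested filters."""
    if exclude_paywalled and item.get("isPaywalled"):
        return False

    if contains_keyword:
        # Case-insensitive match on title or preview text.
        haystack = f"{item.get('title') or ''} {item.get('previewText') or ''}".lower()
        if contains_keyword.lower() not in haystack:
            return False

    if from_date:
        published = item.get("publishedDate")
        if not published:
            return False  # undated items cannot satisfy a date filter
        # Compare calendar days only (first 10 characters of the ISO-8601 string).
        if date.fromisoformat(published[:10]) < date.fromisoformat(from_date):
            return False

    return True


def apply_filters(items: list[dict], max_items: int = 20, **filters) -> list[dict]:
    """Filter after fetching, then cap the result at `max_items`."""
    return [item for item in items if keep_article(item, **filters)][:max_items]
```

Because the date filter is applied after fetching, a strict `fromDate` can only shrink what the feeds already returned; it does not cause older or additional articles to be requested.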

## Actor input object example

```json
{
  "mode": "byTag",
  "tag": "programming",
  "query": "artificial intelligence",
  "excludePaywalled": false,
  "maxItems": 5
}
```

# Actor output Schema

## `records` (type: `string`):

Dataset containing all scraped Medium article records.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "mode": "byTag",
    "tag": "programming",
    "query": "artificial intelligence",
    "excludePaywalled": false,
    "maxItems": 5
};

// Run the Actor and wait for it to finish
const run = await client.actor("crawlerbros/medium-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "mode": "byTag",
    "tag": "programming",
    "query": "artificial intelligence",
    "excludePaywalled": False,
    "maxItems": 5,
}

# Run the Actor and wait for it to finish
run = client.actor("crawlerbros/medium-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "mode": "byTag",
  "tag": "programming",
  "query": "artificial intelligence",
  "excludePaywalled": false,
  "maxItems": 5
}' |
apify call crawlerbros/medium-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=crawlerbros/medium-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Medium Article Scraper",
        "description": "Scrape Medium articles by tag/topic, user, publication, or search query. Extracts title, author, tags, preview text, reading time, publish date, and paywall status all via public RSS feeds and metadata.",
        "version": "1.0",
        "x-build-id": "nhegtXdwxmHJvECfv"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/crawlerbros~medium-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-crawlerbros-medium-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/crawlerbros~medium-scraper/runs": {
            "post": {
                "operationId": "runs-sync-crawlerbros-medium-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/crawlerbros~medium-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-crawlerbros-medium-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "mode"
                ],
                "properties": {
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "byTag",
                            "byUser",
                            "byPublication",
                            "search",
                            "byUrl"
                        ],
                        "type": "string",
                        "description": "What to scrape from Medium.",
                        "default": "byTag"
                    },
                    "tag": {
                        "title": "Tag / Topic (mode=byTag)",
                        "enum": [
                            "programming",
                            "data-science",
                            "machine-learning",
                            "artificial-intelligence",
                            "python",
                            "javascript",
                            "startup",
                            "entrepreneurship",
                            "productivity",
                            "design",
                            "health",
                            "finance",
                            "technology",
                            "writing",
                            "science",
                            "self-improvement",
                            "business",
                            "education",
                            "travel",
                            "food"
                        ],
                        "type": "string",
                        "description": "Tag slug to fetch articles for. Choose from the dropdown or type a custom slug (e.g. `python`, `startups`, `ux`).",
                        "default": "programming"
                    },
                    "username": {
                        "title": "Medium Username (mode=byUser)",
                        "type": "string",
                        "description": "Medium username without the @ symbol (e.g. `towardsdatascience`)."
                    },
                    "publication": {
                        "title": "Publication Slug (mode=byPublication)",
                        "type": "string",
                        "description": "Medium publication slug as it appears in the URL (e.g. `better-programming`, `towards-data-science`)."
                    },
                    "query": {
                        "title": "Search Query (mode=search)",
                        "type": "string",
                        "description": "Keywords to search for on Medium.",
                        "default": "artificial intelligence"
                    },
                    "articleUrl": {
                        "title": "Article URL (mode=byUrl)",
                        "type": "string",
                        "description": "Full URL of a single Medium article (e.g. `https://medium.com/@username/article-title-abc123`)."
                    },
                    "fromDate": {
                        "title": "Published After (ISO-8601 date)",
                        "type": "string",
                        "description": "Only include articles published on or after this date (e.g. `2024-01-01`). Applied after fetching."
                    },
                    "containsKeyword": {
                        "title": "Contains Keyword",
                        "type": "string",
                        "description": "Only include articles whose title or preview text contains this keyword (case-insensitive)."
                    },
                    "excludePaywalled": {
                        "title": "Exclude Paywalled Articles",
                        "type": "boolean",
                        "description": "If true, skip articles that appear to be member-only (paywalled).",
                        "default": false
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of articles to return.",
                        "default": 5
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
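
For projects without the Apify client installed, the `run-sync-get-dataset-items` path from the specification above can be called with the Python standard library alone. This is a minimal sketch: `build_request` and `run_sync` are hypothetical helper names, and it assumes the response body is a JSON array of dataset items, which the specification above does not spell out.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/crawlerbros~medium-scraper/run-sync-get-dataset-items"


def build_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build the POST request; the token goes in the query string per the spec."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, actor_input: dict, timeout: float = 300.0) -> list:
    """Run the Actor and return the parsed response (assumed to be a list of items)."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `run_sync("<YOUR_API_TOKEN>", {"mode": "byTag", "tag": "programming", "maxItems": 5})` blocks until the run finishes, so the `timeout` should be generous for larger `maxItems` values.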
