PyPI Scraper
Extract Python package metadata from PyPI — names, versions, authors, licenses, dependencies, and release history.
Pricing: Pay per event
Developer: Stas Persiianenko
Extract Python package metadata from PyPI — the Python Package Index. Get versions, dependencies, download stats, descriptions, classifiers, and project URLs for any package.
What does PyPI Scraper do?
PyPI Scraper looks up packages on PyPI by name and extracts comprehensive metadata for each one. It fetches data directly from the PyPI JSON API and pypistats.org for download statistics. No browser needed — the actor uses fast HTTP requests for reliable, efficient extraction.
For each package you get: current version, summary, author info, license, Python version requirements, all dependencies, classifiers, keywords, release count, monthly download numbers, and direct URLs.
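Since the actor reads from the public PyPI JSON API (`https://pypi.org/pypi/<name>/json`), the mapping from the API response to the output fields can be sketched roughly as follows. The sample response below is a trimmed, hypothetical body, and `to_output` is an illustrative helper, not the actor's actual code:

```python
# Trimmed, hypothetical PyPI JSON API response body for illustration.
sample_response = {
    "info": {
        "name": "requests",
        "version": "2.32.5",
        "summary": "Python HTTP for Humans.",
        "license": "Apache-2.0",
        "requires_python": ">=3.9",
        "requires_dist": ["idna<4,>=2.5", "urllib3<3,>=1.21.1"],
        "classifiers": ["Programming Language :: Python :: 3"],
    },
    "releases": {"2.32.4": [], "2.32.5": []},
}

def to_output(data: dict) -> dict:
    """Map a PyPI JSON API response onto the actor-style output shape."""
    info = data["info"]
    return {
        "name": info["name"],
        "version": info["version"],
        "summary": info["summary"],
        "license": info["license"],
        "requiresPython": info["requires_python"],
        "dependencies": info["requires_dist"] or [],
        "classifiers": info["classifiers"],
        "releaseCount": len(data["releases"]),  # one key per released version
    }

record = to_output(sample_response)
print(record["name"], record["version"], record["releaseCount"])
```

Download statistics are not part of this response; the actor fetches those separately from pypistats.org.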
Why use PyPI Scraper?
- Fast and reliable — uses official PyPI JSON API, no HTML scraping or browser rendering
- Download statistics — includes monthly download counts from pypistats.org
- Complete metadata — versions, dependencies, classifiers, license, Python requirements
- Bulk lookup — process hundreds of packages in a single run
- Structured output — clean JSON ready for analysis, dashboards, or integration
Use cases
- Dependency auditing — check versions, licenses, and Python compatibility across your stack
- Package popularity tracking — monitor download trends for competing libraries
- Supply chain analysis — map dependency trees and identify widely-used packages
- Market research — analyze the Python ecosystem for trends and opportunities
- License compliance — verify licenses across all packages in your organization
- Developer tooling — feed package metadata into internal tools, dashboards, or reports
How to scrape PyPI packages
- Go to PyPI Scraper on Apify Store
- Add package names to the Package names list
- Click Start and wait for the run to finish
- Review package metadata, versions, and download stats
- Download your data in JSON, CSV, or Excel format
Input parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| packageNames | array | Yes | List of PyPI package names to look up (e.g., requests, flask, django) |
Example input
{"packageNames": ["requests", "flask", "django", "numpy", "pandas"]}
Output example
Each package returns a structured object with full metadata:
{"name": "requests","version": "2.32.5","summary": "Python HTTP for Humans.","author": "Kenneth Reitz","authorEmail": "me@kennethreitz.org","license": "Apache-2.0","homePage": "https://requests.readthedocs.io","projectUrl": "https://pypi.org/project/requests/","requiresPython": ">=3.9","dependencies": ["charset_normalizer<4,>=2","idna<4,>=2.5","urllib3<3,>=1.21.1","certifi>=2017.4.17"],"classifiers": ["Development Status :: 5 - Production/Stable","Programming Language :: Python :: 3"],"keywords": [],"releaseCount": 157,"downloadsLastMonth": 1141725900,"packageUrl": "https://pypi.org/project/requests/","scrapedAt": "2026-03-03T05:21:06.098Z"}
Output fields
| Field | Type | Description |
|---|---|---|
| name | string | Official package name on PyPI |
| version | string | Latest release version |
| summary | string | Short package description |
| author | string | Package author name |
| authorEmail | string | Author contact email |
| license | string | License identifier |
| homePage | string | Project home page URL |
| projectUrl | string | PyPI project page URL |
| requiresPython | string | Python version requirement specifier |
| dependencies | array | List of package dependencies (requires_dist) |
| classifiers | array | PyPI trove classifiers |
| keywords | array | Package keywords |
| releaseCount | number | Total number of releases on PyPI |
| downloadsLastMonth | number | Download count in the last 30 days |
| packageUrl | string | Direct link to the PyPI project page |
| scrapedAt | string | ISO 8601 timestamp of extraction |
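Once the dataset is downloaded, the output fields above are easy to post-process. As a small sketch, here is a license tally across a batch, using hypothetical items shaped like the output table (only the fields relevant to the example are shown):

```python
from collections import Counter

# Hypothetical dataset items shaped like the actor's output fields.
items = [
    {"name": "requests", "license": "Apache-2.0", "requiresPython": ">=3.9"},
    {"name": "flask", "license": "BSD-3-Clause", "requiresPython": ">=3.9"},
    {"name": "django", "license": "BSD-3-Clause", "requiresPython": ">=3.10"},
]

# Tally licenses across the batch for a quick compliance overview.
license_counts = Counter(item["license"] for item in items)
print(license_counts.most_common())  # → [('BSD-3-Clause', 2), ('Apache-2.0', 1)]
```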
How much does it cost to scrape PyPI?
PyPI Scraper uses pay-per-event pricing:
| Event | Price |
|---|---|
| Run started | $0.001 |
| Package extracted | $0.001 per package |
Cost examples
| Packages | Cost |
|---|---|
| 10 packages | $0.011 |
| 100 packages | $0.101 |
| 1,000 packages | $1.001 |
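The cost examples above follow directly from the two event prices: one flat run-started fee plus a per-package fee. A minimal calculator, assuming the prices in the table stay current:

```python
RUN_STARTED = 0.001   # flat fee charged once per run
PER_PACKAGE = 0.001   # fee charged per extracted package

def estimate_cost(num_packages: int) -> float:
    """Estimate the pay-per-event cost of a single run in USD."""
    return round(RUN_STARTED + num_packages * PER_PACKAGE, 3)

for n in (10, 100, 1000):
    print(f"{n} packages -> ${estimate_cost(n)}")
```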
API usage
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("automation-lab/pypi-scraper").call(
    run_input={"packageNames": ["requests", "flask", "django"]}
)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['name']} v{item['version']} — {item['downloadsLastMonth']:,} downloads/month")
```
Node.js
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('automation-lab/pypi-scraper').call({
    packageNames: ['requests', 'flask', 'django'],
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(item => {
    console.log(`${item.name} v${item.version} — ${item.downloadsLastMonth.toLocaleString()} downloads/month`);
});
```
Integrations
Connect PyPI Scraper to your workflow with Apify integrations:
- Webhooks — trigger actions when a run finishes
- Google Sheets — export package data to spreadsheets automatically
- Slack — get notifications about new package versions
- Zapier / Make — connect to 5,000+ apps and services
- REST API — call the actor programmatically from any language
Tips and best practices
- Use exact package names as they appear on PyPI (e.g., scikit-learn, not sklearn)
- Download statistics come from pypistats.org and may be slightly delayed
- The actor handles packages that don't exist gracefully — they're skipped with a warning
- For very large batches (1,000+ packages), consider splitting into multiple runs
- Keywords may be empty for many packages — check classifiers for categorization instead
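To avoid lookups failing on name variants, package names can be pre-normalized the way PyPI itself does (PEP 503: lowercase, with runs of `-`, `_`, and `.` collapsed to a single `-`). A minimal sketch:

```python
import re

def normalize(name: str) -> str:
    """Normalize a package name per PEP 503 (lowercase; runs of -, _, . become -)."""
    return re.sub(r"[-_.]+", "-", name).lower()

print(normalize("Scikit_Learn"))  # → scikit-learn
```

Note that normalization fixes spelling variants of the same name, not wrong names: `sklearn` still won't resolve to `scikit-learn`.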
Use PyPI Scraper with Claude AI (MCP)
You can integrate PyPI Scraper as a tool in Claude AI or any MCP-compatible client. This lets you ask Claude to fetch PyPI data in natural language.
Setup
CLI:
```shell
claude mcp add pypi-scraper -- npx -y @anthropic-ai/apify-mcp-server@latest --actors=automation-lab/pypi-scraper
```
JSON config (Claude Desktop, Cline, etc.):
{"mcpServers": {"pypi-scraper": {"command": "npx","args": ["-y", "@anthropic-ai/apify-mcp-server@latest", "--actors=automation-lab/pypi-scraper"]}}}
Set your APIFY_TOKEN as an environment variable or pass it via --token.
Example prompts
- "Search PyPI for web scraping packages"
- "Get metadata for these Python packages"
- "Compare download stats for requests, httpx, and aiohttp"
cURL
curl "https://api.apify.com/v2/acts/automation-lab~pypi-scraper/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \-X POST -H "Content-Type: application/json" \-d '{"packageNames": ["requests", "flask", "django"]}'
FAQ
A package I entered was skipped with no output.
The scraper skips packages that don't exist on PyPI. Double-check the exact package name — PyPI names are case-insensitive but must match (e.g., scikit-learn, not sklearn).
Download counts seem unusually high or low.
Download stats come from pypistats.org and include CI/CD pipeline installs, mirrors, and automated downloads. They may be slightly delayed (up to 24 hours).
Other developer tools
- Pub.dev Scraper — scrape Dart and Flutter package metadata from pub.dev
- npm Scraper — scrape npm package metadata and download stats
- Crates Scraper — scrape Rust crate metadata from crates.io
- Homebrew Scraper — scrape Homebrew formula metadata
- Docker Hub Scraper — scrape Docker image metadata from Docker Hub
- PPE Cost Estimator — estimate per-result costs for any Apify actor
- Apify Store Analyzer — analyze actors and trends on the Apify Store
Changelog
- v0.1 — Initial release with package metadata and download stats extraction