PyPI Scraper
Extract Python package metadata from PyPI — names, versions, authors, licenses, dependencies, and release history.
Developer: Stas Persiianenko
Pricing: Pay per event
Extract Python package metadata from PyPI — the Python Package Index. Get versions, dependencies, download stats, descriptions, classifiers, and project URLs for any package.
What does PyPI Scraper do?
PyPI Scraper looks up packages on PyPI by name and extracts comprehensive metadata for each one. It fetches data directly from the PyPI JSON API and pypistats.org for download statistics. No browser needed — the actor uses fast HTTP requests for reliable, efficient extraction.
For each package you get: current version, summary, author info, license, Python version requirements, all dependencies, classifiers, keywords, release count, monthly download numbers, and direct URLs.
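Most of these fields correspond directly to what the PyPI JSON API returns at `https://pypi.org/pypi/<name>/json`. As an illustration only (this is not the actor's source code, and the field mapping is an assumption), extracting them from a response dict could look like this:

```python
# Hypothetical sketch of mapping a PyPI JSON API response onto a flat
# metadata record. The structure of the "info" and "releases" keys follows
# the public PyPI JSON API; the output field names mirror this actor's schema.

def extract_metadata(api_response: dict) -> dict:
    """Map a PyPI JSON API response onto a flat metadata record."""
    info = api_response.get("info", {})
    return {
        "name": info.get("name"),
        "version": info.get("version"),
        "summary": info.get("summary"),
        "license": info.get("license"),
        "requiresPython": info.get("requires_python"),
        "dependencies": info.get("requires_dist") or [],
        "releaseCount": len(api_response.get("releases", {})),
    }

# Minimal fake response for illustration (not a real API call)
sample = {
    "info": {
        "name": "requests",
        "version": "2.32.5",
        "summary": "Python HTTP for Humans.",
        "license": "Apache-2.0",
        "requires_python": ">=3.9",
        "requires_dist": ["idna<4,>=2.5"],
    },
    "releases": {"2.32.4": [], "2.32.5": []},
}

record = extract_metadata(sample)
print(record["name"], record["version"], record["releaseCount"])
```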
Why use PyPI Scraper?
- Fast and reliable — uses official PyPI JSON API, no HTML scraping or browser rendering
- Download statistics — includes monthly download counts from pypistats.org
- Complete metadata — versions, dependencies, classifiers, license, Python requirements
- Bulk lookup — process hundreds of packages in a single run
- Structured output — clean JSON ready for analysis, dashboards, or integration
Use cases
- Dependency auditing — check versions, licenses, and Python compatibility across your stack
- Package popularity tracking — monitor download trends for competing libraries
- Supply chain analysis — map dependency trees and identify widely-used packages
- Market research — analyze the Python ecosystem for trends and opportunities
- License compliance — verify licenses across all packages in your organization
- Developer tooling — feed package metadata into internal tools, dashboards, or reports
How to use PyPI Scraper
- Go to the PyPI Scraper input page.
- Add package names to the Package names list.
- Click Start and wait for the run to finish.
- Download your data in JSON, CSV, or Excel format.
Input parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| packageNames | array | Yes | List of PyPI package names to look up (e.g., requests, flask, django) |
Example input
```json
{
  "packageNames": ["requests", "flask", "django", "numpy", "pandas"]
}
```
Output example
Each package returns a structured object with full metadata:
```json
{
  "name": "requests",
  "version": "2.32.5",
  "summary": "Python HTTP for Humans.",
  "author": "Kenneth Reitz",
  "authorEmail": "me@kennethreitz.org",
  "license": "Apache-2.0",
  "homePage": "https://requests.readthedocs.io",
  "projectUrl": "https://pypi.org/project/requests/",
  "requiresPython": ">=3.9",
  "dependencies": [
    "charset_normalizer<4,>=2",
    "idna<4,>=2.5",
    "urllib3<3,>=1.21.1",
    "certifi>=2017.4.17"
  ],
  "classifiers": [
    "Development Status :: 5 - Production/Stable",
    "Programming Language :: Python :: 3"
  ],
  "keywords": [],
  "releaseCount": 157,
  "downloadsLastMonth": 1141725900,
  "packageUrl": "https://pypi.org/project/requests/",
  "scrapedAt": "2026-03-03T05:21:06.098Z"
}
```
Output fields
| Field | Type | Description |
|---|---|---|
| name | string | Official package name on PyPI |
| version | string | Latest release version |
| summary | string | Short package description |
| author | string | Package author name |
| authorEmail | string | Author contact email |
| license | string | License identifier |
| homePage | string | Project home page URL |
| projectUrl | string | PyPI project page URL |
| requiresPython | string | Minimum Python version required |
| dependencies | array | List of package dependencies (requires_dist) |
| classifiers | array | PyPI trove classifiers |
| keywords | array | Package keywords |
| releaseCount | number | Total number of releases on PyPI |
| downloadsLastMonth | number | Download count in the last 30 days |
| packageUrl | string | Direct link to the PyPI project page |
| scrapedAt | string | ISO 8601 timestamp of extraction |
Pricing
PyPI Scraper uses pay-per-event pricing:
| Event | Price |
|---|---|
| Run started | $0.001 |
| Package extracted | $0.001 per package |
Cost examples
| Packages | Cost |
|---|---|
| 10 packages | $0.011 |
| 100 packages | $0.101 |
| 1,000 packages | $1.001 |
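The cost scales linearly: one "Run started" event plus one "Package extracted" event per package. A quick sanity check of the table above, assuming the listed prices:

```python
# Pay-per-event cost model from the pricing tables above:
# one flat run-start fee plus a per-package extraction fee.

RUN_STARTED = 0.001   # $ per run
PER_PACKAGE = 0.001   # $ per package extracted

def run_cost(num_packages: int) -> float:
    """Total cost in dollars for a single run over num_packages packages."""
    return round(RUN_STARTED + PER_PACKAGE * num_packages, 3)

for n in (10, 100, 1000):
    print(f"{n} packages -> ${run_cost(n)}")  # 0.011, 0.101, 1.001
```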
API usage
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("YOUR_USERNAME/pypi-scraper").call(run_input={
    "packageNames": ["requests", "flask", "django"]
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['name']} v{item['version']} — {item['downloadsLastMonth']:,} downloads/month")
```
Node.js
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('YOUR_USERNAME/pypi-scraper').call({
    packageNames: ['requests', 'flask', 'django'],
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(item => {
    console.log(`${item.name} v${item.version} — ${item.downloadsLastMonth.toLocaleString()} downloads/month`);
});
```
Integrations
Connect PyPI Scraper to your workflow with Apify integrations:
- Webhooks — trigger actions when a run finishes
- Google Sheets — export package data to spreadsheets automatically
- Slack — get notifications about new package versions
- Zapier / Make — connect to 5,000+ apps and services
- REST API — call the actor programmatically from any language
Tips and best practices
- Use exact package names as they appear on PyPI (e.g., scikit-learn, not sklearn)
- Download statistics come from pypistats.org and may be slightly delayed
- The actor handles packages that don't exist gracefully — they're skipped with a warning
- For very large batches (1,000+ packages), consider splitting into multiple runs
- Keywords may be empty for many packages — check classifiers for categorization instead
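To cut down on skipped "package not found" lookups, input names can be pre-normalized the way PyPI itself does (PEP 503: lowercase, with runs of `-`, `_`, and `.` collapsed to a single hyphen). A small helper along these lines, offered as a pre-processing sketch rather than part of the actor:

```python
import re

def normalize_name(name: str) -> str:
    """PEP 503 name normalization: lowercase, collapse runs of -, _, . to a hyphen."""
    return re.sub(r"[-_.]+", "-", name).lower()

print(normalize_name("Scikit_Learn"))    # scikit-learn
print(normalize_name("zope.interface"))  # zope-interface
```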
Changelog
- v0.1 — Initial release with package metadata and download stats extraction