PyPI Scraper

Extract Python package metadata from PyPI — names, versions, authors, licenses, dependencies, and release history.

Developer: Stas Persiianenko (Maintained by Community)

Extract Python package metadata from PyPI — the Python Package Index. Get versions, dependencies, download stats, descriptions, classifiers, and project URLs for any package.

What does PyPI Scraper do?

PyPI Scraper looks up packages on PyPI by name and extracts comprehensive metadata for each one. It fetches data directly from the PyPI JSON API and pypistats.org for download statistics. No browser needed — the actor uses fast HTTP requests for reliable, efficient extraction.

For each package you get: current version, summary, author info, license, Python version requirements, all dependencies, classifiers, keywords, release count, monthly download numbers, and direct URLs.
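The actor's internal request logic isn't published, but the mapping from the PyPI JSON API response onto these output fields can be sketched. The trimmed sample payload and the `to_record` helper below are illustrative assumptions, not the actor's actual code:

```python
import json

# Hypothetical, trimmed sample of the PyPI JSON API response shape
# (https://pypi.org/pypi/<name>/json), limited to the fields used here.
sample = json.loads("""
{
  "info": {
    "name": "requests",
    "version": "2.32.5",
    "summary": "Python HTTP for Humans.",
    "requires_python": ">=3.9",
    "requires_dist": ["idna<4,>=2.5", "certifi>=2017.4.17"]
  },
  "releases": {"2.32.4": [], "2.32.5": []}
}
""")

def to_record(payload: dict) -> dict:
    """Map a PyPI JSON API payload onto (part of) the actor's output shape."""
    info = payload["info"]
    return {
        "name": info["name"],
        "version": info["version"],
        "summary": info["summary"],
        "requiresPython": info["requires_python"],
        "dependencies": info["requires_dist"] or [],
        # Each key in "releases" is one published version.
        "releaseCount": len(payload["releases"]),
    }

print(to_record(sample)["releaseCount"])  # 2
```

Download counts come from a separate source (pypistats.org), which is why `downloadsLastMonth` is not derived from this payload.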

Why use PyPI Scraper?

  • Fast and reliable — uses official PyPI JSON API, no HTML scraping or browser rendering
  • Download statistics — includes monthly download counts from pypistats.org
  • Complete metadata — versions, dependencies, classifiers, license, Python requirements
  • Bulk lookup — process hundreds of packages in a single run
  • Structured output — clean JSON ready for analysis, dashboards, or integration

Use cases

  • Dependency auditing — check versions, licenses, and Python compatibility across your stack
  • Package popularity tracking — monitor download trends for competing libraries
  • Supply chain analysis — map dependency trees and identify widely-used packages
  • Market research — analyze the Python ecosystem for trends and opportunities
  • License compliance — verify licenses across all packages in your organization
  • Developer tooling — feed package metadata into internal tools, dashboards, or reports

How to use PyPI Scraper

  1. Go to the PyPI Scraper input page.
  2. Add package names to the Package names list.
  3. Click Start and wait for the run to finish.
  4. Download your data in JSON, CSV, or Excel format.

Input parameters

Parameter     Type   Required  Description
packageNames  array  Yes       List of PyPI package names to look up (e.g., requests, flask, django)

Example input

{
  "packageNames": ["requests", "flask", "django", "numpy", "pandas"]
}

Output example

Each package returns a structured object with full metadata:

{
  "name": "requests",
  "version": "2.32.5",
  "summary": "Python HTTP for Humans.",
  "author": "Kenneth Reitz",
  "authorEmail": "me@kennethreitz.org",
  "license": "Apache-2.0",
  "homePage": "https://requests.readthedocs.io",
  "projectUrl": "https://pypi.org/project/requests/",
  "requiresPython": ">=3.9",
  "dependencies": [
    "charset_normalizer<4,>=2",
    "idna<4,>=2.5",
    "urllib3<3,>=1.21.1",
    "certifi>=2017.4.17"
  ],
  "classifiers": [
    "Development Status :: 5 - Production/Stable",
    "Programming Language :: Python :: 3"
  ],
  "keywords": [],
  "releaseCount": 157,
  "downloadsLastMonth": 1141725900,
  "packageUrl": "https://pypi.org/project/requests/",
  "scrapedAt": "2026-03-03T05:21:06.098Z"
}

Output fields

Field               Type    Description
name                string  Official package name on PyPI
version             string  Latest release version
summary             string  Short package description
author              string  Package author name
authorEmail         string  Author contact email
license             string  License identifier
homePage            string  Project home page URL
projectUrl          string  PyPI project page URL
requiresPython      string  Required Python version specifier (e.g., >=3.9)
dependencies        array   List of package dependencies (requires_dist)
classifiers         array   PyPI trove classifiers
keywords            array   Package keywords
releaseCount        number  Total number of releases on PyPI
downloadsLastMonth  number  Download count in the last 30 days
packageUrl          string  Direct link to the PyPI project page
scrapedAt           string  ISO 8601 timestamp of extraction
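Fields like license and downloadsLastMonth feed directly into the auditing and popularity use cases above. A minimal post-processing sketch over downloaded dataset items (the inline sample data and the allowed-license policy are illustrative):

```python
# Sample items inline for illustration; in practice these objects
# come from the actor's dataset export.
items = [
    {"name": "requests", "license": "Apache-2.0", "downloadsLastMonth": 1141725900},
    {"name": "leftpad", "license": "GPL-3.0", "downloadsLastMonth": 1200},
]

# Example policy: licenses your organization permits (hypothetical).
ALLOWED_LICENSES = {"MIT", "Apache-2.0", "BSD-3-Clause"}

# Flag packages whose license falls outside the allowed set.
violations = [p["name"] for p in items if p["license"] not in ALLOWED_LICENSES]

# Rank packages by monthly downloads, most popular first.
ranked = sorted(items, key=lambda p: p["downloadsLastMonth"], reverse=True)

print(violations)         # ['leftpad']
print(ranked[0]["name"])  # requests
```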

Pricing

PyPI Scraper uses pay-per-event pricing:

Event              Price
Run started        $0.001
Package extracted  $0.001 per package

Cost examples

Packages        Cost
10 packages     $0.011
100 packages    $0.101
1,000 packages  $1.001
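The table above implies a simple formula: one run-start event plus one event per extracted package. A quick sketch of the arithmetic:

```python
RUN_STARTED_USD = 0.001  # flat fee charged once per run
PER_PACKAGE_USD = 0.001  # fee per extracted package

def estimate_cost(num_packages: int) -> float:
    """Estimated run cost in USD under the pay-per-event prices above."""
    # Work in whole millidollars (tenths of a cent) to avoid float drift.
    millidollars = round(RUN_STARTED_USD * 1000) + num_packages * round(PER_PACKAGE_USD * 1000)
    return millidollars / 1000

for n in (10, 100, 1000):
    print(f"{n} packages -> ${estimate_cost(n):.3f}")
```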

API usage

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("YOUR_USERNAME/pypi-scraper").call(
    run_input={"packageNames": ["requests", "flask", "django"]}
)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['name']} v{item['version']}: {item['downloadsLastMonth']:,} downloads/month")

Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('YOUR_USERNAME/pypi-scraper').call({
    packageNames: ['requests', 'flask', 'django'],
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.log(`${item.name} v${item.version}: ${item.downloadsLastMonth.toLocaleString()} downloads/month`);
});

Integrations

Connect PyPI Scraper to your workflow with Apify integrations:

  • Webhooks — trigger actions when a run finishes
  • Google Sheets — export package data to spreadsheets automatically
  • Slack — get notifications about new package versions
  • Zapier / Make — connect to 5,000+ apps and services
  • REST API — call the actor programmatically from any language

Tips and best practices

  • Use exact package names as they appear on PyPI (e.g., scikit-learn, not sklearn)
  • Download statistics come from pypistats.org and may be slightly delayed
  • The actor handles packages that don't exist gracefully — they're skipped with a warning
  • For very large batches (1,000+ packages), consider splitting into multiple runs
  • Keywords may be empty for many packages — check classifiers for categorization instead
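The batch-splitting tip above is easy to automate. A sketch, where the chunk size of 500 and the generated names are illustrative:

```python
def chunk(package_names: list[str], size: int = 500) -> list[list[str]]:
    """Split a long package list into batches of at most `size` names,
    one batch per actor run."""
    return [package_names[i:i + size] for i in range(0, len(package_names), size)]

# Illustrative: 1,200 hypothetical names become three runs of <= 500 each.
names = [f"pkg-{i}" for i in range(1200)]
batches = chunk(names)
print(len(batches))      # 3
print(len(batches[-1]))  # 200
```

Each batch can then be passed as the packageNames input of a separate run.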

Changelog

  • v0.1 — Initial release with package metadata and download stats extraction