Pricing

from $3.00 / 1,000 results

🐍 PyPI Scraper — Python Package Data

Extract Python package data from PyPI — download stats, dependencies, version history & maintainers. Build Python ecosystem analytics, dependency audits & monitoring dashboards. Pay per package.

Developer

Stephan Corbeil

Last modified: a day ago

🐍 PyPI Package Scraper — Python Package Metadata, Versions & Downloads

Bulk-extract Python package metadata from the PyPI registry: name, summary, latest version, version history, dependencies (with extras), classifiers, project URLs, license, maintainers, and download stats from the BigQuery public dataset. It is a pay-per-result alternative to the libraries.io API, Snyk Advisor, PyPiStats, and OSS Insight, designed for Python tooling founders, ML-platform teams sizing model adoption, security teams auditing their supply chain, and DevRel teams measuring library traction.

Why PyPI Package Scraper Beats libraries.io, Snyk Advisor, PyPiStats & OSS Insight

Feature	NexGenData PyPI Scraper	libraries.io API	Snyk Advisor	PyPiStats	OSS Insight
Cost	$2 per 1K packages, pay-per-event	Free (heavy throttle)	$25-99 / dev / month	Free (HTML only)	$$ / month
Version history with dates	Yes	Yes	Limited	No	Limited
Classifiers + extras	Yes	Partial	No	No	No
Download counts (monthly)	Yes — via BQ public dataset	Yes	No	Yes	Yes
Project URLs (homepage / repo / docs)	Yes	Yes	Yes	No	Yes
Bulk export	JSON / CSV / Excel	Plan-gated	CSV	HTML scrape only	Plan-gated
Auth	Apify token	API key	Snyk account	None	OSS Insight account
Monthly minimum	None	None	$25+	None	Plan-based

Teams typically pick this actor over hand-rolling a PyPI JSON API harvester for three reasons: it is a drop-in alternative to libraries.io without the rate-limit throttling, it is cheaper than Snyk Advisor for non-vulnerability workflows, and it returns download stats together with metadata in a single dataset row, which saves a separate PyPiStats lookup per package.

What You Get Per Package

Each dataset item is one record per package with the following fields (some are nested arrays or objects):

name, summary, description
latest_version, latest_release_date
versions[] — every release with {version, published_at, yanked}
requires_dist[] — declared dependencies including extras
requires_python — version constraint
classifiers[] — Trove classifiers (e.g. "Programming Language :: Python :: 3.12")
project_urls — {Homepage, Documentation, Source, Bug Tracker, ...}
home_page, license
author, author_email, maintainer, maintainer_email
keywords
downloads — {last_day, last_week, last_month}
wheel_count, sdist_available
vulnerabilities[] — known PyPI advisories if any
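
To make the field list concrete, here is a minimal sketch of how such a record could be assembled from a PyPI JSON API response. The `info`, `releases`, and `vulnerabilities` keys belong to that public API; the function name `to_record`, the choice to count wheels for the latest version only, and the rule for marking a release as yanked are assumptions made for illustration, not the actor's actual implementation. Download stats are left out because they come from a separate source.

```python
from typing import Any


def to_record(payload: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical mapping from a pypi.org/pypi/<pkg>/json payload to a record."""
    info = payload.get("info", {})
    releases = payload.get("releases", {})

    versions = []
    wheel_count = 0
    sdist_available = False
    for version, files in releases.items():
        # A release can have zero files, in which case no date is available.
        dates = [f["upload_time_iso_8601"] for f in files if f.get("upload_time_iso_8601")]
        versions.append({
            "version": version,
            "published_at": min(dates) if dates else None,
            # Assumption: a release counts as yanked only if all its files are yanked.
            "yanked": bool(files) and all(f.get("yanked", False) for f in files),
        })
        # Assumption: file counts describe the latest version only.
        if version == info.get("version"):
            wheel_count = sum(f.get("packagetype") == "bdist_wheel" for f in files)
            sdist_available = any(f.get("packagetype") == "sdist" for f in files)

    return {
        "name": info.get("name"),
        "summary": info.get("summary"),
        "latest_version": info.get("version"),
        "versions": versions,
        "requires_dist": info.get("requires_dist") or [],
        "requires_python": info.get("requires_python"),
        "classifiers": info.get("classifiers") or [],
        "project_urls": info.get("project_urls") or {},
        "license": info.get("license"),
        "wheel_count": wheel_count,
        "sdist_available": sdist_available,
        "vulnerabilities": payload.get("vulnerabilities") or [],
    }
```

Fields that PyPI returns as null, such as `requires_dist` for a package without dependencies, are normalized to empty containers so downstream code can iterate without extra checks.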

Use Cases

ML-platform founders — measure the adoption of mlflow vs wandb vs clearml by sorting on downloads.last_month
Internal-tools teams — audit a private requirements.txt against PyPI to flag yanked, abandoned, or vulnerable packages
Data-engineering managers — generate a license-compliance report (license field) across every package the team uses
VC analysts — find breakout Python data tools by ranking on month-over-month download growth
Open-source maintainers — track downstream requires_dist references to your library to size the ecosystem
Security teams — pull vulnerabilities[] for every package in a Software Bill of Materials (SBOM)
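
Two of the use cases above (ranking by `downloads.last_month` and flagging yanked or vulnerable packages) can be sketched against the record fields listed earlier. The helper names `rank_by_monthly_downloads` and `flag_risky` are hypothetical, and the code assumes each item carries the `downloads`, `versions`, and `vulnerabilities` fields in the shapes described in this README.

```python
from typing import Any, Iterable


def rank_by_monthly_downloads(items: Iterable[dict[str, Any]]) -> list[tuple[str, int]]:
    """Return (name, last_month downloads) pairs, most downloaded first."""
    pairs = [
        # Missing or null download stats are treated as 0 (an assumption).
        (item["name"], (item.get("downloads") or {}).get("last_month") or 0)
        for item in items
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def flag_risky(items: Iterable[dict[str, Any]]) -> dict[str, list[str]]:
    """Map each package name to the reasons it deserves a manual review."""
    flagged: dict[str, list[str]] = {}
    for item in items:
        reasons = []
        if item.get("vulnerabilities"):
            reasons.append("has known advisories")
        latest = item.get("latest_version")
        for release in item.get("versions") or []:
            if release.get("version") == latest and release.get("yanked"):
                reasons.append("latest version is yanked")
        if reasons:
            flagged[item["name"]] = reasons
    return flagged
```

Both helpers accept any iterable of dataset items, so they can be fed directly from the dataset iterator used in the Quick Start.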

Quick Start

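# Requires the Apify client library: pip install apify-client
from apify_client import ApifyClient
# Authenticate with your Apify API token
client = ApifyClient("YOUR_APIFY_TOKEN")
# Start the actor and wait for the run to finish
run = client.actor("nexgendata/pypi-scraper").call(run_input={
    "packages": ["requests", "pandas", "fastapi", "pydantic", "polars"]
})
# Read the results from the run's default dataset
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["name"], item["latest_version"], item["downloads"]["last_month"])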

Pricing

Pay-per-event:

Actor Start: small fixed charge per run (memory-scaled)
Per package: $2 per 1,000 packages returned

No subscription, no minimum.
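
As a rough budgeting aid, the per-package arithmetic can be sketched as below. The $2 per 1,000 packages rate comes from the lines above; the Actor Start charge is not stated here, so `start_fee` is a placeholder that defaults to 0, and the function name `estimate_cost_usd` is hypothetical.

```python
def estimate_cost_usd(package_count: int, price_per_1k: float = 2.0, start_fee: float = 0.0) -> float:
    """Estimate the cost of one run: start fee plus the per-package charge."""
    if package_count < 0:
        raise ValueError("package_count must be non-negative")
    # start_fee is an assumption: the actual Actor Start charge scales with memory.
    return round(start_fee + package_count * price_per_1k / 1000, 4)


print(estimate_cost_usd(5000))  # 5,000 packages at $2 per 1,000
```

Pass a different `price_per_1k` if the rate shown on the listing differs from the one quoted in this section.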

Use case	Actor
npm package metadata scraper	npm-scraper
PyPI package download statistics	pypi-package-stats
npm package download statistics	npm-package-stats
GitHub trending repositories	github-trending-repos
GitHub repository deep-stats	github-repo-stats
Stack Overflow Q&A scraper	stackoverflow-questions
Developer tools intelligence MCP	developer-tools-mcp-server
Hacker News scraper	hacker-news-scraper

FAQ

How are download counts calculated? PyPI publishes anonymized download logs to a BigQuery public dataset. We aggregate the last 1 / 7 / 30 days per package, which matches pypistats exactly.

Are pre-release versions included? Yes — the full versions[] array includes alphas, betas, and release candidates. Filter on the suffix in your code if you only want stable.
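
A minimal sketch of that filter, using only the standard library: it keeps versions made of numeric segments with an optional `.postN` suffix and drops everything else (alphas, betas, release candidates, dev builds, and also any unusual scheme it does not recognize). The helper name `stable_versions` is hypothetical. For full PEP 440 handling, the `packaging` library's `Version.is_prerelease` property is the more thorough option.

```python
import re

# Stable here means: numeric release segments, optionally followed by .postN.
_STABLE = re.compile(r"^\d+(\.\d+)*(\.post\d+)?$")


def stable_versions(versions: list[dict]) -> list[str]:
    """Return the version strings from versions[] that look like stable releases."""
    return [v["version"] for v in versions if _STABLE.match(v["version"])]


sample = [{"version": "1.0"}, {"version": "1.1a1"}, {"version": "1.1rc2"}, {"version": "1.1"}]
print(stable_versions(sample))
```

The whitelist approach errs on the side of excluding a version rather than misclassifying a pre-release as stable.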

Does the actor follow PyPI Warehouse JSON API? Yes, it uses the official pypi.org/pypi/<pkg>/json endpoint plus the simple-index for completeness.
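
For reference, a single-package lookup against that endpoint can be sketched with the standard library alone. The helper names `pypi_json_url` and `fetch_package` are hypothetical, and this illustrates the public endpoint rather than the actor's internals; it makes a live network request when `fetch_package` is called.

```python
import json
import urllib.parse
import urllib.request


def pypi_json_url(package: str) -> str:
    """Build the JSON API URL for a package name."""
    return f"https://pypi.org/pypi/{urllib.parse.quote(package, safe='')}/json"


def fetch_package(package: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the JSON document for one package (network call)."""
    request = urllib.request.Request(
        pypi_json_url(package),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_package("requests")["info"]["version"]` would return the latest published version at the time of the call; an unknown package name raises `urllib.error.HTTPError`.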

Output formats? JSON, CSV, Excel, and the Apify dataset API.

Is this legal? Yes — PyPI metadata is public.

About NexGenData

NexGenData publishes 260+ buyer-intent actors covering SEC filings, YC alumni, lead generation, competitive intelligence, stock fundamentals across 30+ exchanges, and more. All pay-per-result. Browse the full catalog at https://apify.com/nexgendata?fpr=2ayu9b

How NexGenData Pricing Works

Every NexGenData actor uses pay-per-event pricing — you only pay for results that actually land in your dataset. No monthly minimum, no seat fees, no surprise overage bills.

Actor Start: a single-event charge each time you spin the actor up (scaled to memory size)
Result / item: charged per item written to the default dataset
No charge for retries, internal proxy rotation, or failed sub-requests — those are absorbed by the platform

Apify Platform Bonus

New to Apify? Sign up with the NexGenData referral link — you get free platform credits on signup (enough for several thousand free results) and you help fund the maintenance of this actor fleet.

Integration Surface

Every actor in the NexGenData catalog can be triggered from:

Apify console — point-and-click run
Apify API — REST + webhooks
Apify Python / JS SDKs — programmatic batch
Zapier, Make.com, n8n — official integrations
MCP — many actors are exposed as MCP tools for Claude / ChatGPT / Cursor agents
Schedules — built-in cron for daily / weekly / monthly runs
Webhooks — POST results to any HTTPS endpoint on dataset write
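
For the REST route, a run can be started with a plain HTTP POST. The sketch below builds such a request with the standard library, assuming the Apify API v2 run endpoint, the actor ID `nexgendata/pypi-scraper` from the Quick Start, and the same `packages` input field; the helper name `build_run_request` is hypothetical. Check the Apify API reference before relying on the exact URL or authentication scheme.

```python
import json
import urllib.request


def build_run_request(
    token: str,
    packages: list[str],
    actor_id: str = "nexgendata/pypi-scraper",
) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an actor run."""
    # In API paths the slash in "username/actor-name" is written as a tilde.
    url = f"https://api.apify.com/v2/acts/{actor_id.replace('/', '~')}/runs"
    body = json.dumps({"packages": packages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
```

Sending it is a matter of `urllib.request.urlopen(build_run_request(token, ["requests"]))`; the response describes the started run, including the ID of its default dataset.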

Support

NexGenData maintains 260+ Apify actors and ships updates regularly. Bug reports via the Apify console issues tab get a response within 24 hours. Roadmap requests are welcome — high-demand features ship in the next version.

Home: thenextgennexus.com
Full catalog: apify.com/nexgendata
