🛡️ PyPI Vulnerability Scraper

Pricing

from $8.00 / 1,000 results

Extract Python package metadata from PyPI and enrich it with OSV database alerts. Monitor dependencies for new version releases and critical CVE identifiers.

Rating: 0.0 (0)

Developer: 太郎 山田 (Maintained by Community)

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 13 days ago

PyPI Package Intelligence API | Releases, Dependencies & OSV Signals

Securing a software supply chain takes more than occasional manual checks; it requires continuous visibility into the code you depend on. PyPI Vulnerability Scraper automates package due diligence by monitoring your Python dependencies for new releases and known security risks. It queries the official PyPI JSON endpoint and can optionally enrich the results with advisories from the Open Source Vulnerability (OSV) database, so it can serve as an early warning system for your Python dependencies.

AppSec engineers and engineering managers use this actor to catch compromised or vulnerable packages before they reach production. Schedule daily or weekly runs so that new releases and critical CVEs are caught early. Instead of building internal scraping tools or searching security pages by hand, you get the relevant details in one structured dataset. Outputs include CVE identifiers, affected version ranges, exact dependency declarations, and release histories. Whether you are tracking a single critical framework or auditing hundreds of Python packages across multiple microservices, the data shows which packages need an upgrade so you can patch quickly and document compliance.

Store Quickstart

  • Start with 2–5 exact package names in packages.
  • Keep includeDownloadStats and includeVulnerabilities off for the fastest first success path, then enable them for shortlisted packages.
  • Use dryRun: true when you only want to validate the payload shape or delivery settings.
  • After the first useful run, switch to the recurring watchlist template for repeat package checks, then use the webhook handoff template for release or OSV alerts.
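
As a rough illustration of the first-run payload described above, the sketch below builds a small input with both optional enrichments off and dryRun on, then serializes it to JSON. The field names come from the Input section of this README; the package names are arbitrary examples.

```python
import json

# Hypothetical first-run input: a short list of exact package names,
# optional enrichments off, and dryRun on so only the payload shape is validated.
first_run_input = {
    "packages": ["requests", "urllib3", "flask"],
    "includeDownloadStats": False,
    "includeVulnerabilities": False,
    "dryRun": True,
}

# Serialize to the JSON text you would paste into the actor input.
payload = json.dumps(first_run_input, indent=2)
print(payload)
```

Once this run looks right, setting dryRun to false and enabling the enrichments for the shortlisted packages follows the same shape.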

Status

V1 — Live implementation. Scaffolded as part of Wave 6 Batch H; live collection logic implemented.

Data sources

| Source | URL | Notes |
| --- | --- | --- |
| Package metadata | https://pypi.org/pypi/{package}/json | Full metadata, all releases, latest files |
| Download stats (optional) | https://pypistats.org/api/packages/{package}/recent | Third-party; off by default |
| Vulnerability summary (optional) | POST https://api.osv.dev/v1/query | OSV advisory lookup; off by default |
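
To make the three sources more concrete, here is a minimal Python sketch of how the same endpoints could be called directly. It is not the actor's own code: the helper names are hypothetical, and querying OSV by package name plus latest version is an assumption about one reasonable lookup strategy.

```python
import json
import urllib.request
from urllib.parse import quote

PYPI_URL = "https://pypi.org/pypi/{package}/json"
PYPISTATS_URL = "https://pypistats.org/api/packages/{package}/recent"
OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def pypi_metadata_url(package: str) -> str:
    """URL of the PyPI JSON metadata document for one package."""
    return PYPI_URL.format(package=quote(package, safe=""))


def pypistats_url(package: str) -> str:
    """URL of the pypistats.org recent-downloads document for one package."""
    return PYPISTATS_URL.format(package=quote(package, safe=""))


def build_osv_query(package: str, version: str) -> dict:
    """Request body for an OSV /v1/query lookup of one PyPI package version."""
    return {"package": {"name": package, "ecosystem": "PyPI"}, "version": version}


def fetch_json(url: str, body=None, timeout_s: float = 15.0):
    """GET the URL, or POST the body as JSON when one is given, and decode JSON."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    request = urllib.request.Request(url, data=data, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)


def latest_version_vuln_count(package: str) -> int:
    """Look up the latest PyPI version, then count OSV advisories for it."""
    metadata = fetch_json(pypi_metadata_url(package))
    version = metadata["info"]["version"]
    osv = fetch_json(OSV_QUERY_URL, build_osv_query(package, version))
    # OSV returns an empty object when it knows of no advisories.
    return len(osv.get("vulns", []))
```

Calling latest_version_vuln_count("requests") performs two network requests, so it is left uncalled here. The URL and body builders are kept separate so they can be checked without network access.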

Use Cases

| Who | Why |
| --- | --- |
| OSS program offices | Audit release cadence, maintainers, and license signals before approving dependencies |
| Security teams | Add optional OSV summaries to triage risky packages faster |
| Developer platform teams | Compare PyPI libraries before standardizing on one package |
| Analysts / investors | Track package maturity and ecosystem traction from public signals |

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| packages | string[] | | Required. PyPI package names (e.g. requests). Max 100. |
| includeReleaseHistory | boolean | true | Full release version history with upload dates |
| includeDownloadStats | boolean | false | Recent download counts from pypistats.org |
| includeVulnerabilities | boolean | false | OSV vulnerability advisory summary |
| concurrency | integer | 5 | Parallel package fetch limit (1–10) |
| timeoutMs | integer | 15000 | Per-request timeout in ms |
| delivery | string | dataset | dataset or webhook |
| webhookUrl | string | "" | Webhook URL when delivery=webhook |
| dryRun | boolean | false | Skip dataset push and webhook delivery |
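
The defaults and limits listed for these fields can be checked on the client side before a run is started. The sketch below is one hedged way to do that in Python. The function name normalize_input is hypothetical, and the rule that webhookUrl must be non-empty when delivery is webhook is an assumption rather than documented behavior.

```python
# Defaults as documented for the actor input; packages has no default.
DEFAULTS = {
    "includeReleaseHistory": True,
    "includeDownloadStats": False,
    "includeVulnerabilities": False,
    "concurrency": 5,
    "timeoutMs": 15000,
    "delivery": "dataset",
    "webhookUrl": "",
    "dryRun": False,
}


def normalize_input(raw: dict) -> dict:
    """Merge the documented defaults into raw input and check documented limits."""
    merged = {**DEFAULTS, **raw}
    packages = merged.get("packages")
    if not isinstance(packages, list) or not packages:
        raise ValueError("packages is required and must be a non-empty list")
    if len(packages) > 100:
        raise ValueError("packages accepts at most 100 names")
    if not 1 <= merged["concurrency"] <= 10:
        raise ValueError("concurrency must be between 1 and 10")
    if merged["delivery"] not in ("dataset", "webhook"):
        raise ValueError("delivery must be 'dataset' or 'webhook'")
    # Assumption: a webhook URL is mandatory when delivery is 'webhook'.
    if merged["delivery"] == "webhook" and not merged["webhookUrl"]:
        raise ValueError("webhookUrl is required when delivery is 'webhook'")
    return merged


print(normalize_input({"packages": ["requests"]}))
```

Failing fast on the client avoids spending a run on an input the actor would reject or interpret differently.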

Output

Each package record contains:

  • name, requestedName, status, version, summary, description
  • license, requiresPython, keywords, classifiers, requiresDist
  • author, authorEmail, maintainer, maintainerEmail
  • homePage, projectUrl, projectUrls, packageUrl
  • releaseCount, firstRelease, latestRelease
  • latestFiles — distribution files for the latest version (filename, url, size, sha256, packageType)
  • releaseHistory — all version upload dates (when includeReleaseHistory=true)
  • downloadStats — lastDay / lastWeek / lastMonth (when includeDownloadStats=true)
  • vulnerabilities — vulnCount + OSV advisory list (when includeVulnerabilities=true)
  • warnings — per-package issues (yanked versions, missing fields, enrichment failures)

Output Example

{
  "name": "requests",
  "status": "ok",
  "version": "2.32.3",
  "license": "Apache-2.0",
  "requiresPython": ">=3.8",
  "releaseCount": 180,
  "latestRelease": "2024-05-29T00:00:00.000Z",
  "downloadStats": { "lastDay": 1234567, "lastWeek": 8456789, "lastMonth": 34567890 },
  "vulnerabilities": { "vulnCount": 0, "vulns": [] },
  "warnings": []
}
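
Records of this shape are straightforward to post-process. As a minimal sketch, the hypothetical helper below splits a list of records into packages with at least one advisory and packages whose status is not ok. The second and third sample records are invented for illustration and do not describe real packages.

```python
def summarize(records: list) -> dict:
    """Split records into those with advisories and those not fully fetched."""
    vulnerable = []
    incomplete = []
    for record in records:
        # Fall back to requestedName in case name is missing on a failed lookup.
        label = record.get("name") or record.get("requestedName")
        if record.get("status") != "ok":
            incomplete.append(label)
        # vulnerabilities is only present when includeVulnerabilities=true.
        vulns = record.get("vulnerabilities") or {}
        if vulns.get("vulnCount", 0) > 0:
            vulnerable.append(label)
    return {"vulnerable": vulnerable, "incomplete": incomplete}


# Sample records; only the first mirrors the example in this README.
sample = [
    {"name": "requests", "status": "ok",
     "vulnerabilities": {"vulnCount": 0, "vulns": []}},
    {"name": "example-pkg", "status": "ok",
     "vulnerabilities": {"vulnCount": 2, "vulns": [{}, {}]}},
    {"requestedName": "missing-pkg", "status": "not_found"},
]

report = summarize(sample)
print(report)
```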

Status codes

| Status | Meaning |
| --- | --- |
| ok | All requested data fetched successfully |
| partial | Metadata fetched but one or more optional enrichments failed |
| not_found | Package not found on PyPI (HTTP 404) |
| rate_limited | PyPI returned HTTP 429 after retries |
| blocked | PyPI returned HTTP 403 |
| error | Unexpected network or parse error |
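
A downstream consumer will usually want to map each status to a follow-up action. The mapping below is only one possible policy, given as an assumption for illustration; the action names are hypothetical and not part of the actor.

```python
# Hypothetical follow-up policy keyed by the documented status values.
NEXT_ACTION = {
    "ok": "accept",
    "partial": "accept_and_review_warnings",
    "not_found": "check_package_name",
    "rate_limited": "retry_later",
    "blocked": "investigate_access",
    "error": "retry_later",
}


def next_action(status: str) -> str:
    """Return the follow-up action for a record status, rejecting unknown values."""
    try:
        return NEXT_ACTION[status]
    except KeyError:
        raise ValueError(f"unknown status: {status!r}") from None


for status in ("ok", "partial", "rate_limited"):
    print(status, "->", next_action(status))
```

Rejecting unknown values makes a future change in the status vocabulary visible instead of silently falling through.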

Known limitations

  • HTML keyword search (pypi.org/search/?q=...) is JavaScript-rendered and out of scope. V1 is direct-lookup only.
  • requires_dist strings are raw PEP 508 specifiers; environment markers are not parsed.
  • Yanked releases are flagged with a per-version warning, not silently skipped.
  • pypistats.org is a third-party service; treat download counts as approximate. A warning is emitted when they are unavailable.
  • OSV vulnerability results are advisory summaries only — not a substitute for a full security audit.
  • Package name normalization follows PEP 503 (lowercase, hyphens); a warning is emitted when the canonical name differs from the requested name.
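
Two of the points above can be handled on the consumer side. The sketch below shows name normalization following the rule given in PEP 503 and a naive split of a raw PEP 508 string into its requirement and environment marker. The function names are hypothetical, and the split at the first semicolon is a simplification: it does not evaluate markers and can misread URL-based requirements, where a full parser such as packaging.requirements.Requirement is the safer choice.

```python
import re
from typing import Optional, Tuple


def normalize_name(name: str) -> str:
    """PEP 503: collapse runs of '-', '_' and '.' into one '-' and lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


def split_marker(requirement: str) -> Tuple[str, Optional[str]]:
    """Split a raw PEP 508 string at the first ';' into (requirement, marker)."""
    head, sep, marker = requirement.partition(";")
    return head.strip(), (marker.strip() if sep else None)


print(normalize_name("Typing_Extensions"))
print(split_marker('PySocks!=1.5.7,>=1.5.6; extra == "socks"'))
```

Comparing normalize_name(requestedName) with the returned name reproduces the kind of mismatch that triggers the canonical-name warning.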

Local run

npm test
npm start

Local runs read their input from input.json. Set dryRun: true to skip dataset and webhook delivery.

Pricing & Cost Control

Apify Store pricing is usage-based, so cost mainly follows how many packages you analyze plus any optional enrichments. Check the Store pricing card for the current per-event rates.

  • Start with a shortlist of exact packages.
  • Keep includeDownloadStats and includeVulnerabilities off for the fastest first pass.
  • Use dryRun: true before longer shortlists or scheduled runs.
  • Prefer dataset delivery while you validate downstream mappings.

⭐ Was this helpful?

If this actor saved you time, please leave a ★ rating on Apify Store. It takes 10 seconds, helps other developers discover it, and keeps updates free.

Bug report or feature request? Open an issue on the Issues tab of this actor.