🛡️ PyPI Vulnerability Scraper
Pricing
from $8.00 / 1,000 results
Extract Python package metadata from PyPI and enrich it with OSV database alerts. Monitor dependencies for new version releases and critical CVE identifiers.
Developer: 太郎 山田
Last modified: 13 days ago
PyPI Package Intelligence API | Releases, Dependencies & OSV Signals
Securing your software supply chain takes more than occasional manual checks; it takes continuous visibility into the code you rely on. This PyPI Vulnerability Scraper automates package due diligence by monitoring your Python dependencies for new version releases and known security risks. It queries the official PyPI JSON endpoints and can optionally enrich that data with advisories from the Open Source Vulnerabilities (OSV) database, so scheduled runs act as an early warning system for your Python dependencies.
AppSec engineers and engineering managers use this monitor to catch vulnerable or compromised packages before they reach production. Schedule daily or weekly runs so that new releases and newly published advisories are picked up promptly. Instead of building internal scraping tools or searching security pages by hand, you get the details in one structured dataset: CVE identifiers, affected version ranges, raw dependency declarations, and release histories. Whether you are tracking a single critical framework or auditing hundreds of Python packages across multiple microservices (up to 100 per run), the data shows which packages need upgrading first so you can patch quickly and keep compliance evidence current.
Store Quickstart
- Start with 2–5 exact package names in `packages`.
- Keep `includeDownloadStats` and `includeVulnerabilities` off for the fastest first success path, then enable them for shortlisted packages.
- Use `dryRun: true` when you only want to validate the payload shape or delivery settings.
- After the first useful run, switch to the recurring watchlist template for repeat package checks, then use the webhook handoff template for release or OSV alerts.
Status
V1 — Live implementation. Scaffolded as part of Wave 6 Batch H; live collection logic implemented.
Data sources
| Source | URL | Notes |
|---|---|---|
| Package metadata | https://pypi.org/pypi/{package}/json | Full metadata, all releases, latest files |
| Download stats (optional) | https://pypistats.org/api/packages/{package}/recent | Third-party; off by default |
| Vulnerability summary (optional) | POST https://api.osv.dev/v1/query | OSV advisory lookup; off by default |
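The three sources above can be exercised by hand when you want to check what the actor sees. The following is a minimal sketch, not the actor's own code: `build_requests` is a hypothetical helper, and the OSV request body (package name plus the `PyPI` ecosystem, with an optional version) is assumed from OSV's public query format. It only builds the requests and performs no network calls.

```python
import json
from urllib.parse import quote

PYPI_JSON_URL = "https://pypi.org/pypi/{package}/json"
PYPISTATS_RECENT_URL = "https://pypistats.org/api/packages/{package}/recent"
OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_requests(package, version=None):
    """Return (method, url, body) tuples for one package (hypothetical helper)."""
    safe = quote(package, safe="")  # percent-encode anything unusual in the name
    # Assumed OSV query shape: package name + ecosystem, optionally a version.
    osv_body = {"package": {"name": package, "ecosystem": "PyPI"}}
    if version is not None:
        osv_body["version"] = version
    return [
        ("GET", PYPI_JSON_URL.format(package=safe), None),
        ("GET", PYPISTATS_RECENT_URL.format(package=safe), None),
        ("POST", OSV_QUERY_URL, json.dumps(osv_body)),
    ]
```

Passing the tuples to any HTTP client reproduces the lookups; leaving `version` out asks OSV about the package as a whole rather than one release.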
Use Cases
| Who | Why |
|---|---|
| OSS program offices | Audit release cadence, maintainers, and license signals before approving dependencies |
| Security teams | Add optional OSV summaries to triage risky packages faster |
| Developer platform teams | Compare PyPI libraries before standardizing on one package |
| Analysts / investors | Track package maturity and ecosystem traction from public signals |
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `packages` | string[] | (none) | Required. PyPI package names (e.g. `requests`). Max 100. |
| `includeReleaseHistory` | boolean | true | Full release version history with upload dates |
| `includeDownloadStats` | boolean | false | Recent download counts from pypistats.org |
| `includeVulnerabilities` | boolean | false | OSV vulnerability advisory summary |
| `concurrency` | integer | 5 | Parallel package fetch limit (1–10) |
| `timeoutMs` | integer | 15000 | Per-request timeout in ms |
| `delivery` | string | dataset | `dataset` or `webhook` |
| `webhookUrl` | string | "" | Webhook URL when `delivery=webhook` |
| `dryRun` | boolean | false | Skip dataset push and webhook delivery |
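To catch input mistakes before spending a run, the table can be mirrored in a small validator. This is a sketch under stated assumptions: `build_input` is a hypothetical helper, the defaults and limits are copied from the table above, and the rule that `webhookUrl` must be non-empty when `delivery` is `webhook` is an assumption rather than documented behaviour.

```python
def build_input(packages, **overrides):
    """Build an actor input dict from the documented defaults (hypothetical helper)."""
    defaults = {
        "includeReleaseHistory": True,
        "includeDownloadStats": False,
        "includeVulnerabilities": False,
        "concurrency": 5,
        "timeoutMs": 15000,
        "delivery": "dataset",
        "webhookUrl": "",
        "dryRun": False,
    }
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    if not packages or len(packages) > 100:
        raise ValueError("packages must contain 1 to 100 names")
    cfg = {"packages": list(packages), **defaults, **overrides}
    if not 1 <= cfg["concurrency"] <= 10:
        raise ValueError("concurrency must be between 1 and 10")
    if cfg["delivery"] not in ("dataset", "webhook"):
        raise ValueError("delivery must be 'dataset' or 'webhook'")
    # Assumption: a webhook delivery without a URL is treated as invalid.
    if cfg["delivery"] == "webhook" and not cfg["webhookUrl"]:
        raise ValueError("webhookUrl is required when delivery is 'webhook'")
    return cfg
```

For example, `build_input(["requests", "flask"], dryRun=True)` yields a payload suitable for a first validation run, and the result can be written to `input.json` with `json.dump`.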
Output
Each package record contains:
- `name`, `requestedName`, `status`, `version`, `summary`, `description`
- `license`, `requiresPython`, `keywords`, `classifiers`, `requiresDist`
- `author`, `authorEmail`, `maintainer`, `maintainerEmail`
- `homePage`, `projectUrl`, `projectUrls`, `packageUrl`
- `releaseCount`, `firstRelease`, `latestRelease`
- `latestFiles`: distribution files for the latest version (filename, url, size, sha256, packageType)
- `releaseHistory`: all version upload dates (when `includeReleaseHistory=true`)
- `downloadStats`: lastDay / lastWeek / lastMonth (when `includeDownloadStats=true`)
- `vulnerabilities`: vulnCount + OSV advisory list (when `includeVulnerabilities=true`)
- `warnings`: per-package issues (yanked versions, missing fields, enrichment failures)
Output Example
{
  "name": "requests",
  "status": "ok",
  "version": "2.32.3",
  "license": "Apache-2.0",
  "requiresPython": ">=3.8",
  "releaseCount": 180,
  "latestRelease": "2024-05-29T00:00:00.000Z",
  "downloadStats": { "lastDay": 1234567, "lastWeek": 8456789, "lastMonth": 34567890 },
  "vulnerabilities": { "vulnCount": 0, "vulns": [] },
  "warnings": []
}
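Downstream, records shaped like the example can be filtered to the packages that need a closer look. The sketch below is illustrative only: `summarize` is a hypothetical helper, and it assumes `vulnerabilities` and `warnings` may be missing when the corresponding options are off.

```python
def summarize(records):
    """Return (name, version, vuln_count, warning_count) for records needing review."""
    flagged = []
    for record in records:
        # Assumption: the field is absent or null when enrichment is disabled.
        vulns = record.get("vulnerabilities") or {}
        vuln_count = vulns.get("vulnCount", 0)
        warning_count = len(record.get("warnings") or [])
        if vuln_count > 0 or warning_count > 0:
            flagged.append(
                (record.get("name"), record.get("version"), vuln_count, warning_count)
            )
    return flagged
```

Applied to the example record above, the function returns an empty list, because that record reports zero advisories and no warnings.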
Status codes
| Status | Meaning |
|---|---|
| `ok` | All requested data fetched successfully |
| `partial` | Metadata fetched but one or more optional enrichments failed |
| `not_found` | Package not found on PyPI (HTTP 404) |
| `rate_limited` | PyPI returned HTTP 429 after retries |
| `blocked` | PyPI returned HTTP 403 |
| `error` | Unexpected network or parse error |
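A consumer will usually want to route records by status. The mapping below is one possible policy, not something the actor prescribes: the step names and `group_by_next_step` are hypothetical, and treating `rate_limited` and `error` as retryable is an assumption.

```python
# Assumed policy: which follow-up each documented status suggests.
NEXT_STEP = {
    "ok": "use",
    "partial": "use_with_warnings",
    "not_found": "check_name",
    "rate_limited": "retry_later",
    "blocked": "investigate",
    "error": "retry_later",
}


def group_by_next_step(records):
    """Group package names by suggested follow-up; unknown statuses go to 'investigate'."""
    groups = {}
    for record in records:
        step = NEXT_STEP.get(record.get("status"), "investigate")
        name = record.get("requestedName") or record.get("name")
        groups.setdefault(step, []).append(name)
    return groups
```

Names under `retry_later` can be fed back into `packages` on the next scheduled run, while `check_name` entries usually point to a typo in the watchlist.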
Known limitations
- HTML keyword search (`pypi.org/search/?q=...`) is JavaScript-rendered and out of scope. V1 is direct-lookup only.
- `requires_dist` strings are raw PEP 508 specifiers; environment markers are not parsed.
- Yanked releases are flagged with a per-version warning, not silently skipped.
- pypistats.org is a third-party service; treat download counts as approximate. The actor emits a warning when they are unavailable.
- OSV vulnerability results are advisory summaries only — not a substitute for a full security audit.
- Package name normalization follows PEP 503 (lowercase, hyphens); a warning is emitted when the canonical name differs from the requested name.
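The PEP 503 rule mentioned in the last item is short enough to reproduce when you want to pre-normalize a watchlist. The regular expression is the one given in PEP 503; `name_warning` and its message text are hypothetical and only illustrate when a mismatch warning would apply.

```python
import re


def normalize_name(name):
    """PEP 503: collapse runs of '-', '_' and '.' into one '-' and lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


def name_warning(requested):
    """Return an illustrative warning string when the canonical name differs."""
    canonical = normalize_name(requested)
    if canonical != requested:
        return f"requested name {requested!r} normalized to {canonical!r}"
    return None
```

For instance, `normalize_name("Flask_SQLAlchemy")` gives `flask-sqlalchemy`, so submitting the canonical form up front avoids the warning.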
Local run
npm test
npm start
Uses input.json for local testing. Set dryRun: true to skip dataset/webhook delivery.
Related Actors
Pair this actor with other flagship intelligence APIs in the same portfolio:
- NPM Package Intelligence API — compare JavaScript dependency signals with the same normalized metadata approach.
- Docker Hub Image Intelligence API — add container repository context when a package also ships in public images.
- Shopify Store Intelligence API — connect package research to live storefront and catalog signals for platform intelligence.
Pricing & Cost Control
Apify Store pricing is usage-based, so cost mainly follows how many packages you analyze plus any optional enrichments. Check the Store pricing card for the current per-event rates.
- Start with a shortlist of exact `packages`.
- Keep `includeDownloadStats` and `includeVulnerabilities` off for the fastest first pass.
- Use `dryRun: true` before longer shortlists or scheduled runs.
- Prefer dataset delivery while you validate downstream mappings.
Bug report or feature request? Open an issue on the Issues tab of this actor.