PyPI Package & Dependency Scraper
Pricing
from $3.00 / 1,000 results
Extract Python package dependency declarations, release cadence, maintainer hints, download stats, and OSV vulnerability summaries from the official PyPI JSON API.
Developer
naoki anzai
Maintained by Community
PyPI Package Dependency Intelligence
Analyze Python packages from the official PyPI JSON API and export flattened rows for package summaries, release cadence, dependency declarations, maintainer hints, optional download stats, and optional OSV vulnerability signals.
This actor is built for direct package watchlists. It does not scrape PyPI HTML search pages.
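A direct lookup of this kind can be sketched as below. This is a minimal illustration, not the actor's source: it assumes the public PyPI JSON endpoint `https://pypi.org/pypi/<name>/json`, PEP 503 name normalization, and Node 18+ for the global `fetch`. The function names are hypothetical.

```javascript
// Illustrative sketch only; function names are hypothetical.

// Normalize a project name per PEP 503:
// lowercase, and collapse runs of "-", "_", "." into a single "-".
function normalizePackageName(name) {
  return name.trim().toLowerCase().replace(/[-_.]+/g, "-");
}

// Build the PyPI JSON API URL for a package.
function pypiJsonUrl(name) {
  const normalized = normalizePackageName(name);
  return `https://pypi.org/pypi/${encodeURIComponent(normalized)}/json`;
}

// Fetch package metadata with a per-request timeout.
async function fetchPackage(name, timeoutMs = 15000) {
  const response = await fetch(pypiJsonUrl(name), {
    signal: AbortSignal.timeout(timeoutMs),
  });
  if (!response.ok) {
    throw new Error(`PyPI returned HTTP ${response.status} for ${name}`);
  }
  return response.json();
}
```

Because each package is addressed by name, no HTML search page is involved at any point.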
Use cases
Use this actor for Python dependency and supply-chain audits: it resolves PyPI package metadata and dependency trees. It needs no authentication, reads from official APIs first, and produces a stable output schema with documented source compliance.
Inputs
| Field | Default | Notes |
|---|---|---|
| `packages` | required | PyPI package names such as `requests`, `fastapi`, `django`, or `numpy`. |
| `includeReleaseHistory` | `true` | Emit `release_version` rows. |
| `includeDownloadStats` | `false` | Add recent counts from pypistats.org when available. |
| `includeVulnerabilities` | `false` | Add OSV advisory summaries. |
| `maxReleaseRowsPerPackage` | `100` | Cap release rows per package. |
| `maxDependencyRowsPerPackage` | `250` | Cap dependency rows per package. |
| `concurrency` | `5` | Parallel package fetches. |
| `timeoutMs` | `15000` | Per-request timeout in ms. |
| `delivery` | `dataset` | `dataset` or `webhook`. |
| `webhookUrl` | empty | Required when `delivery=webhook`. |
| `dryRun` | `false` | Skip dataset and webhook delivery. |
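
The defaults and the one cross-field rule in the table (`webhookUrl` is required when `delivery=webhook`) could be applied roughly as follows. `resolveInput` and `DEFAULTS` are hypothetical names for illustration; the actor's real validation may be stricter or differ in its error messages.

```javascript
// Hypothetical helper; default values mirror the Inputs table.
const DEFAULTS = {
  includeReleaseHistory: true,
  includeDownloadStats: false,
  includeVulnerabilities: false,
  maxReleaseRowsPerPackage: 100,
  maxDependencyRowsPerPackage: 250,
  concurrency: 5,
  timeoutMs: 15000,
  delivery: "dataset",
  webhookUrl: "",
  dryRun: false,
};

function resolveInput(input) {
  // Caller-supplied fields override the defaults.
  const resolved = { ...DEFAULTS, ...input };
  if (!Array.isArray(resolved.packages) || resolved.packages.length === 0) {
    throw new Error("packages is required and must be a non-empty array");
  }
  if (!["dataset", "webhook"].includes(resolved.delivery)) {
    throw new Error('delivery must be "dataset" or "webhook"');
  }
  if (resolved.delivery === "webhook" && !resolved.webhookUrl) {
    throw new Error("webhookUrl is required when delivery=webhook");
  }
  return resolved;
}
```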
Dataset Rows
The dataset is flattened so it can be filtered and joined without unpacking one large object.
package_summary
- `packageName`, `normalizedPackageName`, `requestedName`, `status`
- `version`, `summary`, `license`, `requiresPython`
- `author`, `authorEmail`, `maintainer`, `maintainerEmail`
- `homePage`, `sourceUrl`, `issueTrackerUrl`, `projectUrls`
- `releaseCount`, `firstReleaseAt`, `latestReleaseAt`, `releaseCadenceDays`
- `dependencyCount`, `emittedDependencyCount`
- `downloadLastDay`, `downloadLastWeek`, `downloadLastMonth` when enabled
- `vulnerabilityCount`, `vulnerabilities` when enabled
- `contactHints`, `warnings`, `fetchedAt`
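The exact formula behind `releaseCadenceDays` is not documented here. One plausible reading, shown as a sketch, is the mean gap in days between consecutive release uploads. Both `releaseCadence` and that definition are assumptions made for illustration.

```javascript
// Assumption: cadence = mean gap in days between consecutive release uploads.
// The actor's actual formula may differ.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function releaseCadence(uploadTimes) {
  const sorted = uploadTimes.map((t) => Date.parse(t)).sort((a, b) => a - b);
  if (sorted.length === 0) {
    return {
      releaseCount: 0,
      firstReleaseAt: null,
      latestReleaseAt: null,
      releaseCadenceDays: null,
    };
  }
  const first = sorted[0];
  const latest = sorted[sorted.length - 1];
  const gaps = sorted.length - 1;
  return {
    releaseCount: sorted.length,
    firstReleaseAt: new Date(first).toISOString(),
    latestReleaseAt: new Date(latest).toISOString(),
    // The mean gap equals the total span divided by the number of gaps.
    releaseCadenceDays: gaps > 0 ? (latest - first) / MS_PER_DAY / gaps : null,
  };
}
```

Under this definition a package with a single release has no cadence, so the sketch returns `null` rather than zero.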
release_version
- `packageName`, `version`, `uploadTime`, `fileCount`
- `packageTypes`, `yanked`, `yankReason`, `fetchedAt`
dependency
- `packageName`, `packageVersion`
- `dependencyName`, `normalizedDependencyName`
- `dependencyGroup` such as `runtime`, `conditional`, or `extra:socks`
- `versionSpec`, `extras`, `environmentMarker`
- `rawRequirement`, `parseStatus`, `fetchedAt`
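To show how a `Requires-Dist` string might map onto these fields, here is a deliberately simple parser sketch. It is not the actor's parser and not a full PEP 508 implementation: it ignores URL requirements (`name @ https://...`), and the `parseStatus` values `"parsed"` and `"unparsed"` as well as the grouping rule are assumptions.

```javascript
// Pragmatic sketch of splitting a PEP 508 requirement; not a complete parser.
function parseRequirement(raw) {
  // Everything after the first ";" is treated as the environment marker.
  const semi = raw.indexOf(";");
  const head = (semi === -1 ? raw : raw.slice(0, semi)).trim();
  const marker = semi === -1 ? "" : raw.slice(semi + 1).trim();

  // Name, optional [extras], then whatever remains is the version specifier.
  const match = /^([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:\[([^\]]*)\])?\s*(.*)$/.exec(head);
  if (!match) {
    return { rawRequirement: raw, parseStatus: "unparsed" };
  }
  const [, name, extrasText = "", rest] = match;

  // Older metadata wraps the specifier in parentheses, e.g. "idna (<4,>=2.5)".
  const versionSpec = rest.replace(/^\((.*)\)$/, "$1").trim();

  // Assumed grouping rule: an extra marker wins, then any other marker.
  const extraMatch = /extra\s*==\s*["']([^"']+)["']/.exec(marker);
  let dependencyGroup = "runtime";
  if (extraMatch) dependencyGroup = `extra:${extraMatch[1]}`;
  else if (marker) dependencyGroup = "conditional";

  return {
    dependencyName: name,
    normalizedDependencyName: name.toLowerCase().replace(/[-_.]+/g, "-"),
    dependencyGroup,
    versionSpec,
    extras: extrasText.split(",").map((s) => s.trim()).filter(Boolean),
    environmentMarker: marker,
    rawRequirement: raw,
    parseStatus: "parsed",
  };
}
```

Keeping `rawRequirement` on every row means a consumer can re-parse the original string with a stricter parser if this kind of best-effort split is not enough.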
Example Input
{
  "packages": ["requests", "fastapi"],
  "includeReleaseHistory": true,
  "includeDownloadStats": false,
  "includeVulnerabilities": false,
  "maxReleaseRowsPerPackage": 25,
  "maxDependencyRowsPerPackage": 250,
  "concurrency": 3,
  "timeoutMs": 15000,
  "delivery": "dataset",
  "webhookUrl": "",
  "dryRun": false
}
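The `concurrency` setting bounds how many packages are fetched in parallel. A generic worker-pool sketch of that idea follows; `mapWithConcurrency` is a hypothetical helper, and the actor's own scheduling may work differently.

```javascript
// Generic bounded-parallelism sketch; not the actor's scheduler.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  async function runWorker() {
    while (next < items.length) {
      // Claiming an index is safe: no await occurs between the check and the increment.
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  // Start at most `limit` workers, and never more than there are items.
  const poolSize = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: poolSize }, () => runWorker()));
  return results;
}
```

With the example input above, a call such as `mapWithConcurrency(packages, 3, fetchOne)` would keep at most three lookups in flight, where `fetchOne` stands for whatever per-package function is used. Results come back in input order regardless of which lookup finishes first.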
Sample output
Each run produces structured dataset rows (see the Dataset Rows section above for the field list). Run the actor once with the example input to see a live sample before scheduling.
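Because the rows are flat and every row carries `packageName`, they can be regrouped per package after export. The sketch below tells row kinds apart by fields unique to each kind (`dependencyName` for dependency rows, `uploadTime` for release rows), which is an assumption: the field lists above do not name a row-type discriminator. `groupByPackage` is a hypothetical helper.

```javascript
// Regroup flattened rows per package.
// Assumption: row kinds are inferred from fields unique to each kind.
function groupByPackage(rows) {
  const byPackage = new Map();
  for (const row of rows) {
    if (!byPackage.has(row.packageName)) {
      byPackage.set(row.packageName, { summary: null, releases: [], dependencies: [] });
    }
    const entry = byPackage.get(row.packageName);
    if ("dependencyName" in row) {
      entry.dependencies.push(row);
    } else if ("uploadTime" in row) {
      entry.releases.push(row);
    } else {
      entry.summary = row;
    }
  }
  return byPackage;
}
```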
Local Development
npm install
npm test
node src/index.js
output/result.json contains the full payload for local inspection. On Apify, dataset delivery writes the flattened rows.
Limitations
- V1 is direct package lookup only; PyPI HTML search is out of scope.
- `Requires-Dist` parsing is pragmatic. The original PEP 508 string is always preserved as `rawRequirement`.
- pypistats.org and OSV are optional external enrichments and can return warnings without failing the package.
- OSV output is an advisory summary and should not be treated as a complete vulnerability audit.
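The "warnings without failing the package" behaviour described in the list above can be illustrated with a small wrapper. This is a sketch under assumptions: `withOptionalEnrichment` is a hypothetical helper, and the warning text format is invented for the example.

```javascript
// Hypothetical wrapper: run an optional enrichment and turn failures into warnings.
async function withOptionalEnrichment(row, label, enrich) {
  try {
    // On success, merge the enrichment fields into the row.
    return { ...row, ...(await enrich(row)) };
  } catch (error) {
    // On failure, keep the row and record why the enrichment is missing.
    const warnings = [...(row.warnings ?? []), `${label}: ${error.message}`];
    return { ...row, warnings };
  }
}
```

A row that comes back with a non-empty `warnings` array is still usable; it simply lacks the fields that the failed enrichment would have added.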
Input Examples
Example: Single-target audit
{
  "targets": ["example-target-1"],
  "maxResultsPerTarget": 30
}
Example: Bulk portfolio
{
  "targets": ["target-1", "target-2", "target-3"],
  "maxResultsPerTarget": 50,
  "snapshotKey": "pypi-package-dependency-intelligence-state"
}
Example: Recurring delta watch
{
  "targets": ["target-1"],
  "snapshotKey": "pypi-package-dependency-intelligence-state",
  "emitChangedOnly": true
}