PyPI Package Metadata Scraper

Pricing

$1.00 / 1,000 packages extracted

Extract detailed metadata for Python packages from the PyPI public API. Get version, dependencies, classifiers, license, author info, and release history for any PyPI package.

Rating: 0.0 (0)

Developer: Pierrick McD0nald (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 6 days ago

PyPI Package Metadata Scraper — Extract Python Package Data from PyPI

The PyPI Package Metadata Scraper extracts comprehensive metadata for any Python package hosted on the Python Package Index (PyPI). Using the official PyPI JSON API, this Actor retrieves version information, dependencies, classifiers, license details, author information, and release history in a structured format ready for analysis.
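
As a rough illustration of what a lookup against the PyPI JSON API involves, the sketch below builds the endpoint URL (https://pypi.org/pypi/<name>/json) and maps a few keys of the response onto the output field names this Actor uses. The helper names `pypi_json_url`, `summarize`, and `fetch_package` are hypothetical, and the field mapping is an assumption for illustration rather than the Actor's actual source.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def pypi_json_url(package_name: str) -> str:
    """Return the PyPI JSON API URL for the latest release of a package."""
    return f"https://pypi.org/pypi/{quote(package_name.strip(), safe='')}/json"


def summarize(payload: dict) -> dict:
    """Map a PyPI JSON API response onto a few of the Actor's output fields.

    The mapping below is an assumption made for illustration.
    """
    info = payload.get("info", {})
    return {
        "name": info.get("name"),
        "version": info.get("version"),
        "summary": info.get("summary"),
        "license": info.get("license"),
        "requiresPython": info.get("requires_python"),
        # PyPI returns null when a package declares no dependencies.
        "requiresDist": info.get("requires_dist") or [],
        "releaseCount": len(payload.get("releases", {})),
    }


def fetch_package(package_name: str, timeout: float = 10.0) -> dict:
    """Fetch and summarize one package (performs a network request)."""
    with urlopen(pypi_json_url(package_name), timeout=timeout) as response:
        return summarize(json.load(response))
```

Calling `fetch_package("requests")` would return a small dictionary shaped like a subset of the Actor's output record; `summarize` can also be applied to a saved API response without any network access.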

Use Cases

  • Dependency Analysis — Audit the dependency trees and Python version requirements of packages in your stack
  • License Compliance — Batch-check licenses for all packages used in a project to ensure compliance
  • Package Research — Compare package maturity, release frequency, and maintainer activity across candidates
  • Security Auditing — Identify outdated packages and track release history for vulnerability assessment
  • Market Intelligence — Build datasets of Python ecosystem trends, popular frameworks, and emerging libraries

Input

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| packageNames | Array | Yes | List of PyPI package names to look up (e.g., requests, flask, django) |
| includeReleases | Boolean | No | Include full release history with upload dates and file URLs (default: false) |
| proxyConfiguration | Object | No | Proxy configuration for requests |
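
To make the input shape concrete, here is a minimal sketch that assembles an input object from the three fields listed above. The `build_input` helper is hypothetical, and the validation it performs (trimming names, rejecting an empty list) is an assumption about sensible client-side checks, not documented Actor behavior.

```python
import json


def build_input(package_names, include_releases=False, proxy_configuration=None):
    """Assemble an Actor input object using the documented field names."""
    # Drop blank entries and surrounding whitespace before sending.
    names = [n.strip() for n in package_names if n and n.strip()]
    if not names:
        raise ValueError("packageNames must contain at least one package name")
    run_input = {"packageNames": names, "includeReleases": bool(include_releases)}
    # proxyConfiguration is optional, so it is only included when provided.
    if proxy_configuration is not None:
        run_input["proxyConfiguration"] = proxy_configuration
    return run_input


print(json.dumps(build_input(["requests", "flask", "django"]), indent=2))
```

The printed JSON can be pasted into the Actor's input editor or passed as the run input from an API client.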

Output

The Actor outputs a dataset with the following fields for each package:

{
  "name": "requests",
  "version": "2.33.1",
  "summary": "Python HTTP for Humans.",
  "description": "# Requests...",
  "author": "Kenneth Reitz",
  "authorEmail": "me@kennethreitz.org",
  "maintainer": "Ian Stapleton Cordasco",
  "maintainerEmail": "graffatcolmingov@gmail.com",
  "license": "Apache-2.0",
  "homePage": "",
  "projectUrl": "https://pypi.org/project/requests/",
  "requiresPython": ">=3.10",
  "requiresDist": ["charset_normalizer<4,>=2", "idna<4,>=2.5", "urllib3<3,>=1.26"],
  "classifiers": ["Development Status :: 5 - Production/Stable", ...],
  "keywords": "",
  "platform": "",
  "uploadTime": "2026-03-30T16:09:15.531306Z",
  "releaseCount": 160,
  "latestReleaseUrl": "https://pypi.org/project/requests/2.33.1/",
  "sourceUrl": "https://github.com/psf/requests"
}
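
For dependency analysis, the `requiresDist` strings usually need to be split into a package name and a version specifier. The sketch below does this with a simplified regular expression; `split_requirement` is a hypothetical helper and does not cover the full dependency-specifier grammar. For anything beyond a quick look, the `packaging` library's `Requirement` class is the more robust choice.

```python
import re

# Simplified pattern: project name, optional [extras], then everything else.
_REQUIREMENT = re.compile(
    r"^\s*([A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?)\s*(\[[^\]]*\])?\s*(.*)$"
)


def split_requirement(requirement: str) -> dict:
    """Split one requiresDist entry into name, extras, specifier, and marker."""
    match = _REQUIREMENT.match(requirement)
    if match is None:
        raise ValueError(f"Unrecognized requirement: {requirement!r}")
    name, extras, rest = match.groups()
    # An environment marker, if present, follows a semicolon.
    specifier, _, marker = rest.partition(";")
    return {
        "name": name,
        "extras": [e.strip() for e in extras[1:-1].split(",")] if extras else [],
        "specifier": specifier.strip(),
        "marker": marker.strip(),
    }


for entry in ["charset_normalizer<4,>=2", "idna<4,>=2.5", "urllib3<3,>=1.26"]:
    parsed = split_requirement(entry)
    print(parsed["name"], parsed["specifier"])
```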

Pricing

Pay per event: $0.001 USD per package extracted
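
The per-package price makes run costs easy to estimate ahead of time. The sketch below multiplies the stated $0.001 rate by a package count; `estimate_cost` is a hypothetical helper, and it assumes every requested package is billed, since this page does not say whether failed lookups are charged.

```python
from decimal import Decimal

# Rate taken from the pricing line above: $0.001 USD per package extracted.
PRICE_PER_PACKAGE_USD = Decimal("0.001")


def estimate_cost(package_count: int) -> Decimal:
    """Estimate the cost in USD, assuming every package counts as extracted."""
    if package_count < 0:
        raise ValueError("package_count must be non-negative")
    return PRICE_PER_PACKAGE_USD * package_count


print(estimate_cost(1000))  # 1.000
```

`Decimal` is used instead of `float` so that small per-event amounts add up exactly.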

Limitations

  • The PyPI JSON API returns HTTP 404 for packages that do not exist; the Actor logs these as failed
  • Release history can be large for mature packages; use includeReleases sparingly for packages with 100+ releases
  • The PyPI JSON API does not provide download statistics (all download fields return -1)
  • Rate limiting is handled with automatic retries (3 attempts with backoff)
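
The retry behavior described in the last item can be sketched as follows. This is an illustration, not the Actor's code: `RateLimited` and `with_retries` are hypothetical names, and the exponential schedule (1 s, then 2 s) is an assumption, since this page states only that there are 3 attempts with backoff.

```python
import time


class RateLimited(Exception):
    """Raised by a fetcher when the server asks the client to slow down."""


def with_retries(fetch, package_name, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(package_name), retrying with exponential backoff.

    Only RateLimited triggers a retry; any other exception propagates at once.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch(package_name)
        except RateLimited:
            if attempt == attempts:
                raise  # out of attempts, let the caller record the failure
            # Wait base_delay, then 2 * base_delay, and so on.
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing `sleep` as a parameter keeps the function easy to test, because a test can record the delays instead of actually waiting.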

FAQ

Q: Can I search for packages by keyword?
A: No. This Actor requires exact package names. Use packageNames with known PyPI package identifiers.

Q: Does this Actor require authentication?
A: No. The PyPI JSON API is public and requires no API key.

Q: What happens if a package is not found?
A: The Actor logs a warning, counts it as failed, and continues processing the remaining packages.
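
The not-found behavior described in the last answer amounts to a loop that records a failure and moves on. A minimal sketch, assuming a hypothetical `fetch` callable that raises a hypothetical `PackageNotFound` exception on a 404:

```python
import logging

logger = logging.getLogger("pypi_scraper_sketch")


class PackageNotFound(Exception):
    """Raised by a fetcher when PyPI has no package with the given name."""


def process_all(package_names, fetch):
    """Fetch every package, collecting results and the names that failed."""
    results, failed = [], []
    for name in package_names:
        try:
            results.append(fetch(name))
        except PackageNotFound:
            # Log and continue so one bad name does not stop the whole run.
            logger.warning("Package not found on PyPI: %s", name)
            failed.append(name)
    return results, failed
```

The returned `failed` list gives the caller both the failure count and the names to recheck for typos.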

Changelog

  • v1.0.0 — Initial release