PyPI Package Scraper

Pricing

from $2.00 / 1,000 results

Extract Python package data from PyPI including versions, downloads, dependencies, and maintainers. Monitor Python ecosystem and security.

Rating

0.0

(0)

Developer

Stephan Corbeil

Maintained by Community

Actor stats

0

Bookmarked

2

Total users

0

Monthly active users

a day ago

Last modified

PyPI Package Scraper by nexgendata

Extract Python package metadata from PyPI including download statistics, version history, dependency trees, maintainer info, license types, and project URLs at scale. Built for devtool companies analyzing the Python ecosystem and anyone who needs structured developer tools data without the overhead of building a custom scraper.

What This Actor Does

The PyPI Package Scraper connects to PyPI and extracts package metadata: download statistics, version history, dependency trees, maintainer info, license types, and project URLs. It handles pagination, rate limiting, and data normalization automatically, so you get clean, structured JSON output ready for your database, dashboard, or analytics pipeline. No API keys to manage, no infrastructure to maintain.

Who Uses This

Typical users include devtool companies analyzing the Python ecosystem, security teams auditing supply chain dependencies, investors evaluating open source adoption, and developer advocates tracking package popularity. If you need developer tools data at scale without building and maintaining your own extraction pipeline, this actor handles the heavy lifting.

What You Get Back

Each run produces a structured dataset in JSON format. Every record includes all available fields from the source, normalized into a consistent schema. The data is immediately available for export in JSON, CSV, or Excel format, or you can push it directly to your data warehouse via Apify integrations with Google Sheets, Slack, Webhooks, and 50+ other platforms.

How It Compares

The PyPI JSON API exists but is rate-limited and doesn't provide download stats. The Libraries.io API caps at 60 requests/minute. The Google BigQuery PyPI dataset requires SQL expertise and incurs BigQuery costs. This actor delivers the same data at $2 per 1,000 packages with zero monthly commitment, no API key management, and results available in seconds. Pay only for what you use.
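For context on what the PyPI JSON API itself exposes, here is a minimal sketch that summarizes a response from https://pypi.org/pypi/<name>/json. It is not the actor's implementation. The sample payload is hand-written and illustrative, and summarize_package is a hypothetical helper name; only the top-level "info" and "releases" keys and the listed "info" fields are taken from the public API's response shape.

```python
import json

# Hand-written stand-in for a PyPI JSON API response; values are
# illustrative only, not real package data.
SAMPLE_RESPONSE = json.loads("""
{
  "info": {
    "name": "example-package",
    "version": "1.2.0",
    "license": "MIT",
    "requires_dist": ["requests>=2.0", "click"],
    "project_urls": {"Homepage": "https://example.com"}
  },
  "releases": {"1.0.0": [], "1.1.0": [], "1.2.0": []}
}
""")


def summarize_package(payload: dict) -> dict:
    """Pull a few commonly used fields out of a PyPI JSON API payload."""
    info = payload.get("info", {})
    return {
        "name": info.get("name"),
        "latest_version": info.get("version"),
        "license": info.get("license"),
        # requires_dist can be null for packages without dependencies
        "dependencies": info.get("requires_dist") or [],
        "release_count": len(payload.get("releases", {})),
        "project_urls": info.get("project_urls") or {},
    }


summary = summarize_package(SAMPLE_RESPONSE)
print(summary)
```

Note that nothing in this payload carries download counts, which matches the limitation described above.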

Sample Output

{
  "source": "pypi-scraper",
  "data": "Structured developer tools data fields",
  "timestamp": "2024-03-29T12:00:00Z",
  "url": "https://example.com/source"
}
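If you prefer to convert downloaded records yourself rather than use the built-in CSV export, a minimal sketch could look like the following. It assumes records shaped like the placeholder sample above; the real dataset's fields may differ, and records_to_csv is a hypothetical helper name.

```python
import csv
import io

# One record copied from the placeholder sample output; a real dataset
# would contain one dict per package with whatever fields the actor emits.
records = [
    {
        "source": "pypi-scraper",
        "data": "Structured developer tools data fields",
        "timestamp": "2024-03-29T12:00:00Z",
        "url": "https://example.com/source",
    },
]


def records_to_csv(rows: list) -> str:
    """Serialize a list of flat dicts to CSV text, one column per key."""
    if not rows:
        return ""
    # Collect column names in first-seen order across all rows.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty cells
    return buffer.getvalue()


csv_text = records_to_csv(records)
print(csv_text)
```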

Use Cases

Teams use the PyPI Package Scraper across a range of workflows. Analysts feed the output into business intelligence dashboards for real-time monitoring. Developers integrate it into automated data pipelines that run on daily or weekly schedules. Researchers use bulk exports for large-scale analysis projects. Marketing teams track competitive movements and industry trends. The structured output format means the data slots into virtually any downstream system with minimal transformation.

Pricing: $2 per 1,000 Packages

At $2/1K, processing 5,000 packages costs $10.00 total. A daily pipeline pulling 500 packages runs $1.00/day ($30/month). Compare that to building and maintaining your own scraping infrastructure, which typically costs $500-2,000/month in proxy fees, compute, and engineering time alone.
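The arithmetic above can be reproduced with a short sketch. It assumes strictly linear pricing at the quoted $2 per 1,000 packages and a 30-day month; the listing says "from $2.00", so actual billing may differ. run_cost is a hypothetical helper name.

```python
PRICE_PER_1000 = 2.00  # USD per 1,000 packages, as quoted in this listing


def run_cost(packages: int, price_per_1000: float = PRICE_PER_1000) -> float:
    """Estimated cost in USD for a run, assuming linear per-result pricing."""
    return round(packages / 1000 * price_per_1000, 2)


one_off = run_cost(5000)        # single run over 5,000 packages
daily = run_cost(500)           # daily pipeline pulling 500 packages
monthly = round(daily * 30, 2)  # assumes a 30-day month

print(one_off, daily, monthly)
```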

FAQ

How often can I run this? As often as you need. Schedule runs hourly, daily, or weekly through Apify's built-in scheduler, or trigger runs via API from your own systems.

What format is the output? JSON by default, with one-click export to CSV or Excel. You can also push results directly to Google Sheets, webhooks, or any HTTP endpoint via Apify integrations.

Do I need any API keys? No. The actor handles all authentication and access internally. Just configure your search parameters and run.

Can I integrate this with my existing tools? Yes. Apify supports integrations with Zapier, Make, Google Sheets, Slack, and direct webhook delivery. You can also use the Apify API to pull results programmatically into any system.
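As a rough sketch of the programmatic route, the snippet below builds (but does not send) requests against the Apify REST API using only the standard library. Several things here are assumptions not confirmed by this page: the actor ID "nexgendata/pypi-package-scraper", the "packages" input field, and the placeholder token. The "no API keys" answer above refers to the data source; calling the Apify API itself normally requires your Apify account token.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build a POST request that would start an actor run with the given input."""
    # In API paths, "username/actor-name" is written as "username~actor-name".
    path_id = actor_id.replace("/", "~")
    return urllib.request.Request(
        f"{API_BASE}/acts/{path_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def dataset_items_url(dataset_id: str, fmt: str = "json") -> str:
    """URL for fetching a run's dataset items in the requested format."""
    query = urllib.parse.urlencode({"format": fmt})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


# Hypothetical actor ID and input schema; check the actor's input tab
# for the real field names before using this.
request = build_run_request(
    "nexgendata/pypi-package-scraper",
    "YOUR_APIFY_TOKEN",
    {"packages": ["requests"]},
)
print(request.get_method(), request.full_url)
print(dataset_items_url("DATASET_ID", fmt="csv"))
```

To actually start a run you would pass the request to urllib.request.urlopen, read the dataset ID from the run object in the response, and then fetch the items URL once the run has finished.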