PyPI Packages Scraper
Pricing
from $12.00 / 1,000 result items
Pull Python package data from PyPI. Returns name, version, summary, description, classifiers, license, author, project URLs (homepage, source, issues, docs), Python version requirement, dependencies, release history, last upload, and total release count. Direct lookup by package name.
Developer: ParseForge (maintained by Community)

🐍 PyPI Python Package Scraper
🚀 Pull PyPI packages with version, license, classifiers, dependencies, vulnerabilities, release files (wheel + sdist), funding URL, and more: 33+ fields per record.
🕒 Last updated: 2026-05-08 · 📊 33+ fields per record · 500K+ PyPI packages · version, classifiers, dependencies, security advisories, release files (wheel + sdist sizes + sha256), license, project URLs
The PyPI Python Package Scraper pulls rich package metadata from the Python Package Index. Output includes name, version, summary, description (truncated), license + license expression + license files, author + email, maintainer email, homepage, repository, bug tracker, docs URL, changelog URL, funding URL, classifiers, runtime dependencies, optional extras (provides_extra), Python version requirement, yanked flag, total releases, release files (wheel + sdist with size + SHA-256 + Python version), and the security vulnerabilities PyPI reports for the package.
Direct lookup only - feed a list of package names, get rich records back. The Actor uses the JSON detail endpoint, which is the canonical source for PyPI metadata.
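As a rough illustration of the lookup described above, the sketch below builds the JSON detail URL for a package name and fetches it with the standard library. It assumes the endpoint is PyPI's public JSON API at `https://pypi.org/pypi/<name>/json`; the helper names `pypi_json_url` and `fetch_package` are hypothetical and are not taken from the Actor's own code.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: PyPI's public JSON API for a single project.
PYPI_JSON_URL = "https://pypi.org/pypi/{name}/json"


def pypi_json_url(name: str) -> str:
    """Build the JSON detail URL for one package name."""
    # Quote the name so stray characters cannot alter the URL path.
    return PYPI_JSON_URL.format(name=urllib.parse.quote(name.strip(), safe=""))


def fetch_package(name: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the JSON document for one package (needs network)."""
    with urllib.request.urlopen(pypi_json_url(name), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    info = fetch_package("numpy")["info"]
    print(info["name"], info["version"])
```

Keeping URL construction separate from the network call makes the first helper easy to check offline.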
| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| Python developers, security teams, SBOM builders, ML researchers, package-discovery tools, OSS analytics | Python supply chain analysis, vulnerability tracking, SBOM generation, dependency-graph extraction, ecosystem health monitoring |
📋 What the PyPI Python Package Scraper does
Five capabilities in a single run:
- 🆔 Direct lookup. One package per line, plain names.
- 🚨 Vulnerabilities included. Security advisories from the PyPI Security DB.
- 📦 Release files. Per-version wheel + sdist with sizes and SHA-256.
- ⚖️ License + license files. Standard license, license expression (PEP 639), license file paths.
- 🔗 Project URLs. Homepage, repo, bugs, docs, changelog, funding, all in one map.
💡 Why it matters: clean, structured records and fresh data on every run.
🎬 Full Demo
🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset.
⚙️ Input
| Input | Type | Default | Behavior |
|---|---|---|---|
| maxItems | integer | 10 | Records to return. Free plan caps at 10, paid plan up to 1,000,000. |
| names | string | "" | Newline-separated package names (one per line). |
Example: lookup popular ML packages.
{"maxItems": 10,"names": "numpy\npandas\nscikit-learn\ntensorflow\ntorch\ntransformers\nopenai\nlangchain"}
Example: audit production deps.
{"maxItems": 20,"names": "requests\nflask\nfastapi\ndjango\nuvicorn\ngunicorn"}
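Because `names` is a single newline-separated string rather than a JSON array, it can be convenient to generate the input from an existing requirements list. The following is a minimal sketch under that assumption; `build_input` is a hypothetical helper, and the name-matching regex is a simplification that ignores option lines such as `-r other.txt`.

```python
import json
import re

# Leading distribution name in a requirement line, e.g. "requests>=2.31".
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def build_input(requirement_lines, max_items=10):
    """Turn requirement-style lines into the Actor's input object."""
    names = []
    for line in requirement_lines:
        line = line.split("#", 1)[0]  # drop trailing comments
        match = _NAME_RE.match(line)
        if match and match.group(1) not in names:
            names.append(match.group(1))
    # The Actor expects one newline-separated string, not a JSON array.
    return {"maxItems": max_items, "names": "\n".join(names)}


lines = ["requests>=2.31", "flask  # web framework", "", "requests"]
print(json.dumps(build_input(lines, max_items=20)))
```

Duplicates and blank lines are dropped, so each package is looked up only once.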
📊 Output
Each record contains 33+ fields. Download as CSV, Excel, JSON, or XML.
🧾 Schema
| Field | Type | Example |
|---|---|---|
| 📛 name | string | "numpy" |
| 🏷️ version | string | "2.3.4" |
| 📝 summary | string | "Fundamental package for array computing in Python" |
| ⚖️ license | string | "BSD-3-Clause" |
| ⚖️ licenseExpression | string | "BSD-3-Clause" |
| 👤 authorName | string | "Travis E. Oliphant et al." |
| 📧 authorEmail | string | "" |
| 🌐 homepage | string | "https://numpy.org" |
| 🔗 repositoryUrl | string | "https://github.com/numpy/numpy" |
| 🔗 fundingUrl | string | "https://numpy.org/about/" |
| 🏷️ classifiers | array | ["Development Status :: 5 - Production/Stable",...] |
| 📦 requiresDist | array | ["pytest >= 4.6"] |
| 🐍 pythonRequires | string | ">=3.10" |
| 🚨 yanked | boolean | false |
| 📊 totalReleases | number | 392 |
| 📊 releaseFileCount | number | 28 |
| 📦 releaseFiles | array of objects | [{"filename":"numpy-2.3.4-cp313-...whl","size":18724328,"sha256":"...","packagetype":"bdist_wheel"}] |
| 🚨 vulnerabilities | array of objects | [{"id":"PYSEC-2014-...","summary":"...","fixedIn":["1.6.0"]}] |
| 🌐 pypiUrl | string | "https://pypi.org/project/numpy/" |
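Once a dataset is exported as JSON, records shaped like the schema above can be post-processed with a few lines of code. The sketch below flags packages that carry at least one advisory. It relies only on the `name`, `version`, and `vulnerabilities` fields from the table; the helper name and the sample records are hypothetical and are not real scraper output.

```python
def vulnerable_packages(records):
    """Return (name, version, [advisory ids]) for records with advisories."""
    flagged = []
    for rec in records:
        vulns = rec.get("vulnerabilities") or []  # field may be absent or empty
        if vulns:
            ids = [v.get("id", "?") for v in vulns]
            flagged.append((rec["name"], rec["version"], ids))
    return flagged


# Hypothetical records shaped like the schema table; not real scraper output.
sample = [
    {"name": "pkg-a", "version": "1.0.0", "vulnerabilities": []},
    {
        "name": "pkg-b",
        "version": "0.9.1",
        "vulnerabilities": [
            {"id": "EXAMPLE-0001", "summary": "...", "fixedIn": ["1.0.0"]}
        ],
    },
]
print(vulnerable_packages(sample))
```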
✨ Why choose this Actor
| | Capability |
|---|---|
| 🚨 | Vulnerabilities included. PyPI security advisories with id, summary, fixed versions, link to source. |
| 📦 | Real release files. Wheel + sdist URLs with size + SHA-256, ready for SBOM. |
| ⚖️ | Modern license fields. license_expression (PEP 639) and license_files alongside the legacy license field. |
| 🔗 | Funding URLs. Direct funding / sponsor link if the project provides one. |
| 🆓 | No API key. PyPI is open. |
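One way the size and SHA-256 values could be used in an SBOM or audit pipeline is to verify a downloaded wheel or sdist against the `sha256` value from the matching `releaseFiles` entry. The sketch below assumes that field holds a hex digest; `sha256_matches` is a hypothetical helper, and the self-check uses a throwaway file rather than a real release file.

```python
import hashlib
import os
import tempfile


def sha256_matches(path, expected_hex, chunk_size=1 << 20):
    """Return True if the file at `path` hashes to `expected_hex`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large wheels do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()


# Self-check with a throwaway file; a real run would pass a downloaded
# release file and the sha256 value from its releaseFiles entry.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_path = tmp.name
demo_ok = sha256_matches(demo_path, hashlib.sha256(b"hello").hexdigest())
demo_bad = sha256_matches(demo_path, "0" * 64)
os.remove(demo_path)
print(demo_ok, demo_bad)
```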
📈 How it compares to alternatives
| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| ⭐ This Actor | $5 free credit | 500K+ packages | Live per run | Lookup | ⚡ 2 min |
| PyPI direct API | Free | Same | Live | DIY | 🐢 Code |
| Snyk Python Advisor | $$ | Same | Live | Yes | 🐢 Account |
| pip search (deprecated) | - | - | - | - | - |
🚀 How to use
- 📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
- 🌐 Open the Actor. Find the PyPI Python Package Scraper on the Apify Store.
- 🎯 Set input. Enter package names and set maxItems.
- 🚀 Run it. Click Start.
- 📥 Download. Grab results in the Dataset tab as CSV, Excel, JSON, or XML.
⏱️ Total time from signup to dataset: 3-5 minutes. No coding required.
🔌 Automating PyPI Python Package Scraper
Control the scraper programmatically:
- 🟢 Node.js. Install the apify-client NPM package.
- 🐍 Python. Use the apify-client PyPI package.
- 📚 See the Apify API documentation for full details.
The Apify Schedules feature lets you trigger this Actor on any cron interval.
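A programmatic run from Python might look roughly like the sketch below, which uses the `apify-client` package mentioned above. The Actor ID and token are placeholders (copy the real ID from the Actor's page), and the helper names `build_run_input` and `run_scraper` are hypothetical.

```python
def build_run_input(names, max_items=10):
    """Build the Actor input: newline-separated names plus a record cap."""
    return {"maxItems": max_items, "names": "\n".join(names)}


def run_scraper(token, actor_id, names, max_items=10):
    """Start a run, wait for it to finish, and return the dataset items."""
    # Imported lazily so the helper above works without the client installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=build_run_input(names, max_items))
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (both values are placeholders, not real credentials or IDs):
# items = run_scraper("<APIFY_TOKEN>", "<username>/<actor-name>", ["numpy", "pandas"])
```

The same input object can be pasted into a scheduled task, so a manual run and a scheduled run stay consistent.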
🌟 Beyond business use cases
Data like this powers more than commercial workflows.
❓ Frequently Asked Questions
🧩 How does it work?
Provide a list of PyPI package names. The Actor calls the PyPI JSON detail endpoint for each, then maps the response to a clean record.
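The mapping step could be sketched as follows. This is an illustration only: the input keys (`info`, `urls`, `releases`, `vulnerabilities`, `digests`, `fixed_in`) follow PyPI's public JSON API, the output keys follow the schema table in this README, and the function name, the subset of fields, and the sample payload are assumptions rather than the Actor's actual code.

```python
def to_record(payload: dict) -> dict:
    """Map a PyPI JSON API payload to a flat record (subset of fields)."""
    info = payload.get("info") or {}
    files = payload.get("urls") or []  # files of the latest version
    return {
        "name": info.get("name"),
        "version": info.get("version"),
        "summary": info.get("summary"),
        "license": info.get("license"),
        "licenseExpression": info.get("license_expression"),
        "pythonRequires": info.get("requires_python"),
        "requiresDist": info.get("requires_dist") or [],  # PyPI may send null
        "yanked": bool(info.get("yanked")),
        "totalReleases": len(payload.get("releases") or {}),
        "releaseFileCount": len(files),
        "releaseFiles": [
            {
                "filename": f.get("filename"),
                "size": f.get("size"),
                "sha256": (f.get("digests") or {}).get("sha256"),
                "packagetype": f.get("packagetype"),
            }
            for f in files
        ],
        "vulnerabilities": [
            {
                "id": v.get("id"),
                "summary": v.get("summary"),
                "fixedIn": v.get("fixed_in") or [],
            }
            for v in payload.get("vulnerabilities") or []
        ],
    }


# Hypothetical payload for illustration; not a real PyPI response.
payload = {
    "info": {"name": "example-pkg", "version": "1.2.0", "requires_dist": None},
    "releases": {"1.0.0": [], "1.2.0": []},
    "urls": [
        {
            "filename": "example_pkg-1.2.0.tar.gz",
            "size": 1234,
            "digests": {"sha256": "abc123"},
            "packagetype": "sdist",
        }
    ],
    "vulnerabilities": [],
}
record = to_record(payload)
print(record["name"], record["totalReleases"], record["releaseFileCount"])
```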
📊 How many fields per record?
33 base fields, with more when the project provides a funding URL, vulnerabilities, license files, and similar optional metadata.
🚨 How current are vulnerabilities?
Pulled live from PyPI's security advisory database (osv.dev backed).
📦 Do you list every release file?
Yes, for the latest version: wheel + sdist with filename, URL, size, SHA-256, and Python version target.
⚖️ How is licensing represented?
Both the legacy license field and PEP 639 license_expression + license_files when present.
🐍 Are pre-releases included?
Yes. Yanked status is exposed too.
🆓 Do I need an API key?
No. PyPI is open.
🔁 Can I schedule runs?
Yes. Schedule weekly to monitor security advisories.
⚖️ Is this data free to use?
Yes. PyPI metadata is publicly available under the project's licensing.
💳 Do I need a paid Apify plan?
No. The free plan covers preview runs (10 records).
🔌 Integrate with any app
PyPI Python Package Scraper connects to any cloud service via Apify integrations:
- Make - Automate multi-step workflows
- Zapier - Connect with 5,000+ apps
- Slack - Get run notifications
- Airbyte - Pipe data into your warehouse
- GitHub - Trigger runs from commits
- Google Drive - Export datasets to Sheets
🔗 Recommended Actors
- 📦 npm Package Registry - Pull npm packages with version, downloads, dependencies, integrity
- 🦀 crates.io Rust Package - Pull Rust crates with edition, downloads, features, dependencies
- 🐳 Docker Hub Image Search - Pull Docker repositories with tags, stars, pull count, README
- 💎 RubyGems Ruby Package - Pull RubyGems with version, downloads, dependencies, ruby version
- 📊 Stack Exchange Questions - Search 170+ Stack Exchange Q&A sites
💡 Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.
🆘 Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.
⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by the Python Software Foundation, the PyPI maintainers, or any individual package author. All trademarks mentioned are the property of their respective owners. Only publicly available open data is collected.