PyPI Packages Scraper

Pricing

from $12.00 / 1,000 result items

Pull Python package data from PyPI. Returns name, version, summary, description, classifiers, license, author, project URLs (homepage, source, issues, docs), Python version requirement, dependencies, release history, last upload, and total release count. Direct lookup by package name.

Developer: ParseForge (maintained by Community)


🐍 PyPI Python Package Scraper

🚀 Pull PyPI packages with version, license, classifiers, dependencies, vulnerabilities, release files (wheel + sdist), funding URL, and more, across 33+ fields per record.

🕒 Last updated: 2026-05-08 · 📊 33+ fields per record · 500K+ PyPI packages · version, classifiers, dependencies, security advisories, release files (wheel + sdist sizes + sha256), license, project URLs

The PyPI Python Package Scraper pulls rich package metadata from the Python Package Index. Output includes name, version, summary, description (truncated), license, license expression and license files, author name and email, maintainer email, homepage, repository, bug tracker, docs URL, changelog URL, funding URL, classifiers, runtime dependencies, optional extras (provides_extra), Python version requirement, yanked flag, total releases, release files (wheel + sdist with size, SHA-256, and Python version), and security vulnerabilities published by the PyPI Security team.

Direct lookup only - feed a list of package names, get rich records back. The Actor uses the JSON detail endpoint, which is the canonical source for PyPI metadata.
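A minimal sketch of what such a lookup can look like, assuming the endpoint meant here is PyPI's public JSON API at `https://pypi.org/pypi/<name>/json` (the page does not spell out the URL). The helper names are illustrative and not part of the Actor; the name normalization follows PEP 503.

```python
import json
import re
import urllib.request

# Assumed endpoint: PyPI's public per-project JSON API.
PYPI_JSON_URL = "https://pypi.org/pypi/{name}/json"


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name.strip()).lower()


def json_url(name: str) -> str:
    """Build the JSON detail URL for one package."""
    return PYPI_JSON_URL.format(name=normalize(name))


def fetch_package(name: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the JSON document for one package (network call)."""
    with urllib.request.urlopen(json_url(name), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_package("numpy")` needs network access and raises `urllib.error.HTTPError` for an unknown package name.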

🎯 Target Audience: Python developers, security teams, SBOM builders, ML researchers, package-discovery tools, OSS analytics

💡 Primary Use Cases: Python supply chain analysis, vulnerability tracking, SBOM generation, dependency-graph extraction, ecosystem health monitoring

📋 What the PyPI Python Package Scraper does

Five capabilities in a single run:

  • 🆔 Direct lookup. One package per line, plain names.
  • 🚨 Vulnerabilities included. Security advisories from the PyPI Security DB.
  • 📦 Release files. Per-version wheel + sdist with sizes and SHA-256.
  • ⚖️ License + license files. Standard license, license expression (PEP 639), license file paths.
  • 🔗 Project URLs. Homepage, repo, bugs, docs, changelog, funding, all in one map.

💡 Why it matters: every run fetches fresh data straight from PyPI, so records reflect the index as it is at run time.


🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset.


⚙️ Input

| Input | Type | Default | Behavior |
|---|---|---|---|
| maxItems | integer | 10 | Records to return. Free plan caps at 10, paid plan up to 1,000,000. |
| names | string | "" | Newline-separated package names (one per line). |

Example: look up popular ML packages.

{
  "maxItems": 10,
  "names": "numpy\npandas\nscikit-learn\ntensorflow\ntorch\ntransformers\nopenai\nlangchain"
}

Example: audit production deps.

{
  "maxItems": 20,
  "names": "requests\nflask\nfastapi\ndjango\nuvicorn\ngunicorn"
}
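To make the `names` format concrete, here is a small sketch of how a newline-separated string could be split into package names and capped at `maxItems`. The function name, the blank-line handling, and the case-insensitive de-duplication are assumptions for illustration; the Actor's own parsing is not documented here.

```python
def parse_names(names: str, max_items: int = 10) -> list[str]:
    """Split the newline-separated input, drop blanks and repeats, cap the result."""
    seen: set[str] = set()
    result: list[str] = []
    for line in names.splitlines():
        name = line.strip()
        # Skip empty lines and names already seen (compared case-insensitively).
        if name and name.lower() not in seen:
            seen.add(name.lower())
            result.append(name)
    return result[:max_items]


run_input = {"maxItems": 20, "names": "requests\nflask\n\nRequests\n"}
print(parse_names(run_input["names"], run_input["maxItems"]))
# prints ['requests', 'flask']
```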

📊 Output

Each record contains 33+ fields. Download as CSV, Excel, JSON, or XML.

🧾 Schema

| Field | Type | Example |
|---|---|---|
| 📛 name | string | "numpy" |
| 🏷️ version | string | "2.3.4" |
| 📝 summary | string | "Fundamental package for array computing in Python" |
| ⚖️ license | string | "BSD-3-Clause" |
| ⚖️ licenseExpression | string | "BSD-3-Clause" |
| 👤 authorName | string | "Travis E. Oliphant et al." |
| 📧 authorEmail | string | "" |
| 🌐 homepage | string | "https://numpy.org" |
| 🔗 repositoryUrl | string | "https://github.com/numpy/numpy" |
| 🔗 fundingUrl | string | "https://numpy.org/about/" |
| 🏷️ classifiers | array | ["Development Status :: 5 - Production/Stable",...] |
| 📦 requiresDist | array | ["pytest >= 4.6"] |
| 🐍 pythonRequires | string | ">=3.10" |
| 🚨 yanked | boolean | false |
| 📊 totalReleases | number | 392 |
| 📊 releaseFileCount | number | 28 |
| 📦 releaseFiles | array of objects | [{"filename":"numpy-2.3.4-cp313-...whl","size":18724328,"sha256":"...","packagetype":"bdist_wheel"}] |
| 🚨 vulnerabilities | array of objects | [{"id":"PYSEC-2014-...","summary":"...","fixedIn":["1.6.0"]}] |
| 🌐 pypiUrl | string | "https://pypi.org/project/numpy/" |
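As an illustration of working with these fields after download, the sketch below flattens the `vulnerabilities` array into one row per advisory. It relies only on the `name`, `vulnerabilities`, `id`, and `fixedIn` fields from the schema; the function name and the sample records (including the advisory ID) are made up for the example.

```python
def advisory_rows(records: list[dict]) -> list[tuple]:
    """Return one (package, advisory id, fixed versions) row per advisory."""
    rows = []
    for rec in records:
        # Records without advisories may carry an empty list or no key at all.
        for vuln in rec.get("vulnerabilities") or []:
            rows.append((rec.get("name"), vuln.get("id"), vuln.get("fixedIn") or []))
    return rows


# Hypothetical records shaped like the schema; not real PyPI data.
sample = [
    {"name": "pkg-a", "vulnerabilities": [{"id": "EXAMPLE-0001", "fixedIn": ["1.6.0"]}]},
    {"name": "pkg-b", "vulnerabilities": []},
]
for row in advisory_rows(sample):
    print(row)
```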

✨ Why choose this Actor

  • 🚨 Vulnerabilities included. PyPI security advisories with id, summary, fixed versions, link to source.
  • 📦 Real release files. Wheel + sdist URLs with size + SHA-256, ready for SBOM.
  • ⚖️ Modern license fields. license_expression (PEP 639) and license_files alongside the legacy license field.
  • 🔗 Funding URLs. Direct funding / sponsor link if the project provides one.
  • 🆓 No API key. PyPI is open.
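The SHA-256 values in `releaseFiles` can be used to check a downloaded wheel or sdist before trusting it. A minimal sketch using only the standard library follows; the function names are illustrative, and the expected digest is assumed to be the lowercase hex string from a record's `sha256` field.

```python
import hashlib


def sha256_matches(data: bytes, expected_hex: str) -> bool:
    """Compare the SHA-256 of in-memory bytes with an expected hex digest."""
    return hashlib.sha256(data).hexdigest() == expected_hex.strip().lower()


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large wheels do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For a downloaded file, compare `file_sha256(path)` with the `sha256` value of the matching `releaseFiles` entry and reject the file on mismatch.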

📈 How it compares to alternatives

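| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| ⭐ This Actor | $5 free credit | 500K+ packages | Live per run | Lookup | ⚡ 2 min |
| PyPI direct API | Free | Same | Live | DIY | 🐢 Code |
| Snyk Python Advisor | $$ | Same | Live | Yes | 🐢 Account |
| pip search (deprecated) | - | - | - | - | - |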

🚀 How to use

  1. 📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
  2. 🌐 Open the Actor. Find the PyPI Python Package Scraper on the Apify Store.
  3. 🎯 Set input. Enter the package names (one per line) and set maxItems.
  4. 🚀 Run it. Click Start.
  5. 📥 Download. Grab results in the Dataset tab as CSV, Excel, JSON, or XML.

⏱️ Total time from signup to dataset: 3-5 minutes. No coding required.


💼 Business use cases

🔐 Supply Chain

  • Python SBOM generation
  • License compliance
  • Integrity-hash verification
  • Vulnerability dashboards

🤖 ML + Data Science

  • Track new ML library releases
  • Pin reproducible env snapshots
  • Compare libraries
  • Discover new packages

📊 Ecosystem Analytics

  • Top-PyPI rankings
  • License distribution
  • Author / org stats
  • Yank-rate tracking

🎓 Education + Research

  • Reproducible PyPI snapshots
  • Course material
  • Hobbyist exploration
  • Library comparison projects

🔌 Automating PyPI Python Package Scraper

Control the scraper programmatically:

  • 🟢 Node.js. Install the apify-client NPM package.
  • 🐍 Python. Use the apify-client PyPI package.
  • 📚 See the Apify API documentation for full details.

The Apify Schedules feature lets you trigger this Actor on any cron interval.
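A hedged sketch of the Python route, based on the usual `apify-client` quick-start pattern (`ApifyClient`, `actor(...).call(...)`, `dataset(...).iterate_items()`). The Actor ID below is a placeholder to replace with the value shown on the Actor's page, and reading the token from an `APIFY_TOKEN` environment variable is an assumption; check the current client documentation before relying on it.

```python
import os

# Placeholder: copy the real Actor ID from the Actor's page in Apify Console.
ACTOR_ID = "<username>/<actor-name>"


def build_run_input(names: list[str], max_items: int = 10) -> dict:
    """Build the Actor input; `names` is sent as one newline-separated string."""
    return {"maxItems": max_items, "names": "\n".join(names)}


def run_actor(names: list[str], max_items: int = 10) -> list[dict]:
    """Start a run, wait for it to finish, and return the dataset items."""
    # Imported lazily so the helper above works without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(ACTOR_ID).call(run_input=build_run_input(names, max_items))
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

`run_actor(["requests", "flask"], max_items=20)` would then return the records as plain dictionaries, ready to write to disk or pass to downstream checks.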


🌟 Beyond business use cases

Data like this powers more than commercial workflows.

🎓 Research and academia

  • Software-engineering studies
  • OSS health datasets
  • Network analysis on Python deps
  • Reproducible PyPI corpora

🎨 Personal and creative

  • Personal package dashboards
  • Curated PyPI lists
  • Side projects with metadata
  • Library-discovery sites

🤝 Non-profit and civic

  • Free SBOM tools
  • OSS security awareness
  • Educational maps
  • Civic tech inventories

🧪 Experimentation

  • Train recommenders
  • Prototype security scanners
  • Build vulnerability bots
  • Test license tooling

❓ Frequently Asked Questions

🧩 How does it work?

Provide a list of PyPI package names. The Actor calls the PyPI JSON detail endpoint for each, then maps the response to a clean record.
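The mapping step might look roughly like the sketch below. It is not the Actor's actual code: the output keys come from the schema on this page, while the input keys (`info`, `releases`, `urls`, `vulnerabilities`, `digests`, `fixed_in`, and so on) reflect the PyPI JSON API as commonly documented and should be treated as assumptions.

```python
def to_record(payload: dict) -> dict:
    """Map one PyPI JSON document to a flat record (subset of the fields)."""
    info = payload.get("info") or {}
    files = payload.get("urls") or []  # files of the latest version
    return {
        "name": info.get("name"),
        "version": info.get("version"),
        "summary": info.get("summary"),
        "license": info.get("license"),
        "licenseExpression": info.get("license_expression"),
        "classifiers": info.get("classifiers") or [],
        "requiresDist": info.get("requires_dist") or [],
        "pythonRequires": info.get("requires_python"),
        "yanked": bool(info.get("yanked")),
        "totalReleases": len(payload.get("releases") or {}),
        "releaseFileCount": len(files),
        "releaseFiles": [
            {
                "filename": f.get("filename"),
                "size": f.get("size"),
                "sha256": (f.get("digests") or {}).get("sha256"),
                "packagetype": f.get("packagetype"),
            }
            for f in files
        ],
        "vulnerabilities": [
            {"id": v.get("id"), "summary": v.get("summary"), "fixedIn": v.get("fixed_in") or []}
            for v in payload.get("vulnerabilities") or []
        ],
        "pypiUrl": f"https://pypi.org/project/{info.get('name')}/",
    }
```

Using `or []` and `or {}` throughout keeps the record shape stable when PyPI returns `null` for optional fields such as `requires_dist`.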

📊 How many fields per record?

33 base fields, with more when the project provides a funding URL, vulnerabilities, license files, and so on.

🚨 How current are vulnerabilities?

Pulled live from PyPI's security advisory database (backed by osv.dev).

📦 Do you list every release file?

Yes for the latest version: wheel + sdist with filename, URL, size, SHA-256, and Python version target.

⚖️ How is licensing represented?

Both the legacy license field and PEP 639 license_expression + license_files when present.
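Because a record may carry either field, downstream code usually needs one resolved value. The sketch below shows one possible fallback order: prefer `licenseExpression`, then the legacy `license`, then a `License ::` trove classifier. The function name and the fallback order are assumptions, not the Actor's behavior.

```python
from typing import Optional


def effective_license(record: dict) -> Optional[str]:
    """Pick a single license string from a record, or None if nothing is set."""
    expression = (record.get("licenseExpression") or "").strip()
    if expression:
        return expression
    legacy = (record.get("license") or "").strip()
    if legacy:
        return legacy
    # Last resort: use the final segment of a "License :: ..." classifier.
    for classifier in record.get("classifiers") or []:
        if classifier.startswith("License ::"):
            return classifier.split("::")[-1].strip()
    return None
```

Note that legacy `license` values are free text and can contain a full license body, so a compliance report may want to truncate or normalize them.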

🐍 Are pre-releases included?

Yes. Yanked status is exposed too.

🆓 Do I need an API key?

No. PyPI is open.

🔁 Can I schedule runs?

Yes. Schedule weekly to monitor security advisories.
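For weekly monitoring, a simple approach is to compare the advisory IDs in the latest dataset with those from the previous run. The sketch below assumes both runs are loaded as lists of records shaped like the schema; the function names and the sample IDs are hypothetical.

```python
def advisory_ids(records: list[dict]) -> set[tuple[str, str]]:
    """Collect (package name, advisory id) pairs from a list of records."""
    return {
        (rec["name"], vuln["id"])
        for rec in records
        for vuln in rec.get("vulnerabilities") or []
        if vuln.get("id")
    }


def new_advisories(previous: list[dict], current: list[dict]) -> list[tuple[str, str]]:
    """Return pairs present in the current run but not in the previous one."""
    return sorted(advisory_ids(current) - advisory_ids(previous))


# Hypothetical data for two consecutive runs.
last_week = [{"name": "pkg-a", "vulnerabilities": [{"id": "EXAMPLE-0001"}]}]
this_week = [{"name": "pkg-a", "vulnerabilities": [{"id": "EXAMPLE-0001"}, {"id": "EXAMPLE-0002"}]}]
print(new_advisories(last_week, this_week))
# prints [('pkg-a', 'EXAMPLE-0002')]
```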

⚖️ Is this data free to use?

Yes. PyPI metadata is publicly available under the project's licensing.

💳 Do I need a paid Apify plan?

No. The free plan covers preview runs (10 records).


🔌 Integrate with any app

PyPI Python Package Scraper connects to any cloud service via Apify integrations:

  • Make - Automate multi-step workflows
  • Zapier - Connect with 5,000+ apps
  • Slack - Get run notifications
  • Airbyte - Pipe data into your warehouse
  • GitHub - Trigger runs from commits
  • Google Drive - Export datasets to Sheets

💡 Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.


🆘 Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.


⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by the Python Software Foundation, the PyPI maintainers, or any individual package author. All trademarks mentioned are the property of their respective owners. Only publicly available open data is collected.