NPM Package Scraper

Pull rich metadata for any NPM package — current version, dependencies, weekly downloads, repo URL, license, keywords, README excerpt, deprecation flag. Free NPM registry + downloads API, no key required.

Developer: DevilScrapes (maintained by Community)

🎯 What this scrapes

NPM publishes registry.npmjs.org/<package> for every package, plus api.npmjs.org/downloads/<range>/<pkg> for download counts. This Actor takes a list of package names (or scoped names like @apify/sdk), fans them out, and writes one structured row per package with the latest version metadata and last-week download count.
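
As a rough illustration of the two endpoints named above, the sketch below builds both URLs for a plain or scoped package name. The helper names are hypothetical, and two details are assumptions rather than anything this Actor documents: the `point/last-week` form of the downloads path, and percent-encoding the slash of a scoped name for the registry lookup.

```python
from urllib.parse import quote

REGISTRY = "https://registry.npmjs.org"
# Assumption: the "point/last-week" form of the downloads API path.
DOWNLOADS = "https://api.npmjs.org/downloads/point/last-week"


def registry_url(name: str) -> str:
    """Registry document URL; '@' stays literal, '/' in a scoped name becomes %2F."""
    return f"{REGISTRY}/{quote(name, safe='@')}"


def downloads_url(name: str) -> str:
    """Last-week download count URL; the scoped name is passed through as-is."""
    return f"{DOWNLOADS}/{name}"


if __name__ == "__main__":
    for pkg in ["express", "@apify/sdk"]:
        print(registry_url(pkg))
        print(downloads_url(pkg))
```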

🔥 What we handle for you

  • 🛡️ Browser fingerprint rotation: curl-cffi impersonates real Chrome / Firefox / Safari TLS handshakes so the target sees a browser, not Python.
  • 🌐 Residential proxy rotation via Apify Proxy — fresh session and exit IP on every block.
  • 🔁 Retries with exponential backoff on 408 / 429 / 5xx — up to 5 attempts per page, Retry-After honoured.
  • 🧱 Rate-limit-aware pacing — when the target pushes back, we slow down instead of getting banned.
  • 🧊 Clean, typed dataset rows — Pydantic-validated, ISO-8601 timestamps, stable IDs, JSON / CSV / Excel export straight from the Apify Console.
  • 💰 Pay-Per-Event pricing — you only pay for results that hit your dataset. No data, no charge.
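
The retry bullet above can be sketched roughly as follows. This is not the Actor's code: the function names, the 1-second base delay, and the 60-second cap are assumptions, only the delta-seconds form of Retry-After is handled, and jitter is left out for brevity. The `fetch` callable is a stand-in for whatever HTTP client is in use.

```python
import time
from typing import Callable, Mapping, Optional, Tuple

# (status code, response headers, body): a stand-in for a real HTTP response.
Response = Tuple[int, Mapping[str, str], bytes]


def is_retryable(status: int) -> bool:
    """408, 429 and any 5xx are worth another attempt."""
    return status in (408, 429) or 500 <= status <= 599


def backoff_delay(attempt: int, retry_after: Optional[str],
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait after a failed attempt (attempt is 1-based)."""
    # Honour Retry-After when it is given as whole seconds.
    if retry_after is not None and retry_after.strip().isdigit():
        return min(float(retry_after), cap)
    # Otherwise double the delay on every attempt: base, 2*base, 4*base, ...
    return min(base * 2 ** (attempt - 1), cap)


def fetch_with_retries(fetch: Callable[[], Response], max_attempts: int = 5,
                       sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call fetch() until it returns a non-retryable status or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = fetch()
        if not is_retryable(status) or attempt == max_attempts:
            return status, headers, body
        sleep(backoff_delay(attempt, headers.get("Retry-After")))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.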

💡 Use cases

  • Bundle audit — score every direct dependency in your repo for downloads + license + deprecation.
  • Vendor consolidation — compare two competing libraries side-by-side on weekly downloads.
  • Hiring intel — find org members listed as maintainers on widely-used npm packages.
  • Release-radar dashboards — daily diffs on weekly_downloads to spot momentum shifts.

⚙️ How to use it

  1. Click Try for free at the top of the page.
  2. Fill in the input form — most fields have sensible defaults.
  3. Click Start. Output streams into the run's dataset.
  4. Export from Storage → Dataset as JSON, CSV, or Excel — or fetch via the API.
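
For the "fetch via the API" route in step 4, a minimal standard-library sketch might look like the one below. It assumes the Apify API v2 endpoint that runs an Actor synchronously and returns its dataset items; `ACTOR_ID` is a placeholder (this page does not state the Actor's ID), and the function names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, List

# Placeholder: replace with the Actor's real ID in "username~actor-name" form.
ACTOR_ID = "<username>~<actor-name>"


def build_run_request(actor_id: str, token: str,
                      run_input: Dict[str, Any]) -> urllib.request.Request:
    """Prepare a POST that runs the Actor and returns its dataset items as JSON."""
    url = f"https://api.apify.com/v2/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_and_fetch(actor_id: str, token: str,
                  run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Send the request and decode the returned rows (needs network and a valid token)."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)
```

A call such as `run_and_fetch(ACTOR_ID, token, {"packages": ["express"]})` would then return one dict per package, assuming the run finishes within the synchronous time limit.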

📥 Input

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| packages | array | yes | ["express", "react", "@apify/sdk"] | List of NPM package names. Scoped packages are supported (use the literal @scope/name form). |
| includeDownloads | boolean | no | true | Adds last-week download count via the api.npmjs.org downloads API. One extra request per package. |
| concurrency | integer | no | 8 | Parallel registry requests. |
| proxyConfiguration | object | no | {"useApifyProxy": false} | Proxy optional. The NPM registry is happy to be hit directly. |

Example input

{
  "packages": [
    "express"
  ],
  "includeDownloads": true,
  "concurrency": 4,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
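
To make the concurrency field concrete, here is one way a bounded fan-out over the package list could look. It is a sketch under assumptions, not the Actor's implementation: `fan_out` and `fetch_one` are hypothetical names, and a thread pool is just one possible mechanism.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, Iterable, List, Optional

Row = Dict[str, Any]


def fan_out(packages: Iterable[str],
            fetch_one: Callable[[str], Optional[Row]],
            concurrency: int = 8) -> List[Row]:
    """Run fetch_one for every package with at most `concurrency` calls in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # pool.map keeps results in input order; None marks a skipped package.
        return [row for row in pool.map(fetch_one, packages) if row is not None]


if __name__ == "__main__":
    # Stand-in fetcher so the example runs offline.
    def fake_fetch(name: str) -> Optional[Row]:
        return None if name == "gone" else {"name": name}

    print(fan_out(["express", "gone", "react"], fake_fetch, concurrency=4))
```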

📤 Output

Every row is one dataset item.

| Field | Type | Notes |
| --- | --- | --- |
| name | string | Package name (matches npm exactly, including @scope). |
| version | string | Current latest dist-tag version. |
| description | string or null | Package description. |
| homepage | string or null | Project homepage URL. |
| license | string or null | SPDX license identifier. |
| author | string or null | Author display string. |
| maintainers | array | Maintainer logins. |
| keywords | array | Tags from package.json. |
| repository_url | string or null | Source repo URL (often git+https://github.com/...). |
| bugs_url | string or null | Bug-tracker URL if defined. |
| dist_tarball | string or null | Tarball download URL for the latest version. |
| engines | object or null | engines field: node/npm version constraints. |
| dependencies | object or null | Runtime dependencies map. |
| dev_dependencies | object or null | Dev dependencies map. |
| peer_dependencies | object or null | Peer dependencies map. |
| deprecated | string or null | Deprecation message if the version is deprecated. |
| weekly_downloads | integer or null | Downloads in the last 7 days (when includeDownloads=true). |
| package_url | string | npmjs.com canonical URL. |
| published_at | string or null | Publish timestamp of the latest version. |
| scraped_at | string | When this row was recorded. |

Example output

{
  "name": "express",
  "version": "4.21.2",
  "description": "Fast, unopinionated, minimalist web framework",
  "license": "MIT",
  "weekly_downloads": 31000000,
  "package_url": "https://www.npmjs.com/package/express"
}
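
For readers curious how a registry document turns into a row like this, the sketch below maps a few of the fields. It relies on the public registry document layout (`dist-tags`, `versions`, `time`); the function name `to_row`, the subset of fields, and the sample package are illustrative assumptions, not the Actor's actual code.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def to_row(packument: Dict[str, Any],
           weekly_downloads: Optional[int] = None) -> Dict[str, Any]:
    """Flatten the latest version of a registry document into one dataset row."""
    name = packument["name"]
    latest = packument["dist-tags"]["latest"]
    manifest = packument["versions"][latest]
    repo = manifest.get("repository")
    return {
        "name": name,
        "version": latest,
        "description": manifest.get("description"),
        "license": manifest.get("license"),
        # "repository" may be an object with a "url" key or a bare string.
        "repository_url": repo.get("url") if isinstance(repo, dict) else repo,
        "dependencies": manifest.get("dependencies"),
        "deprecated": manifest.get("deprecated"),
        "weekly_downloads": weekly_downloads,
        "package_url": f"https://www.npmjs.com/package/{name}",
        "published_at": packument.get("time", {}).get(latest),
        "scraped_at": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    # Made-up package used purely for illustration.
    sample = {
        "name": "demo-pkg",
        "dist-tags": {"latest": "1.2.0"},
        "versions": {"1.2.0": {"description": "Demo", "license": "MIT"}},
        "time": {"1.2.0": "2024-01-01T00:00:00.000Z"},
    }
    print(to_row(sample, weekly_downloads=42))
```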

💰 Pricing

Pay-Per-Event — you pay only when these events fire:

EventUSDWhat it is
actor-start$0.005One-off warm-up charge per run
result$0.0015Per dataset item

Example: 1 000 results in a single run cost 1 000 × $0.0015 + $0.005 = $1.505, or about $1.50. No subscription, no minimum, no card to start. Apify gives every new account $5 of free credit.
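
The arithmetic behind the example can be checked with a few lines. The rates come from the table above; the function name and the `runs` parameter are illustrative.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.005")   # charged once per run
PER_RESULT = Decimal("0.0015")   # charged per dataset item


def estimate_cost(results: int, runs: int = 1) -> Decimal:
    """Estimated USD cost for a number of results spread over a number of runs."""
    return ACTOR_START * runs + PER_RESULT * results


if __name__ == "__main__":
    print(estimate_cost(1000))
```

For 1 000 results in one run this comes to $1.505: the $1.50 quoted above plus the one-off start charge.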

🚧 Limitations

We pull only the latest dist-tag. Pre-release tags, version-range search, and tarball content extraction are not in scope.

❓ FAQ

Are downloads exact?

NPM's downloads API is widely understood to be ~5% noisy due to bot filtering. Trust the trend, not the exact number.
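
One way to act on "trust the trend" is to ignore week-over-week changes that fall inside the noise band. The sketch below uses the 5% figure from the answer above as a default threshold; the function name and the labels it returns are illustrative.

```python
def trend(previous: int, current: int, noise: float = 0.05) -> str:
    """Classify a change in weekly_downloads, treating small moves as noise."""
    if previous <= 0:
        return "unknown"  # no baseline to compare against
    change = (current - previous) / previous
    if abs(change) <= noise:
        return "flat"
    return "up" if change > 0 else "down"


if __name__ == "__main__":
    print(trend(1000, 1030))
    print(trend(1000, 1200))
```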

Can I look up versions other than latest?

Not yet — pass name@version and we'll resolve to latest. Per-version lookup is a roadmap item.
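
How a name@version input might be reduced to the bare package name, keeping scoped names intact, can be sketched like this. `split_spec` is a hypothetical helper, not the Actor's parser.

```python
from typing import Optional, Tuple


def split_spec(spec: str) -> Tuple[str, Optional[str]]:
    """Split 'name@version' into (name, version); a leading '@' belongs to the scope."""
    at = spec.rfind("@")
    if at > 0:
        return spec[:at], spec[at + 1:]
    return spec, None


if __name__ == "__main__":
    for spec in ["express", "express@4.21.2", "@apify/sdk", "@apify/sdk@3.0.0"]:
        print(split_spec(spec))
```

Only the name half would be used for the lookup today, since the version half is ignored in favour of latest.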

What if a package was unpublished?

We log a 404 and skip the row.

Why is the repo URL prefixed with git+?

That's npm's canonical form. Strip the prefix if you need a bare URL.
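
Stripping the prefix on your side takes only a few lines. The helper name is hypothetical, and dropping a trailing .git is an optional extra that is off by default.

```python
from typing import Optional


def bare_repo_url(url: Optional[str], drop_git_suffix: bool = False) -> Optional[str]:
    """Remove the 'git+' prefix from repository_url; optionally drop a trailing '.git'."""
    if url is None:
        return None
    if url.startswith("git+"):
        url = url[len("git+"):]
    if drop_git_suffix and url.endswith(".git"):
        url = url[:-len(".git")]
    return url


if __name__ == "__main__":
    print(bare_repo_url("git+https://github.com/expressjs/express.git"))
```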

💬 Your feedback

Spotted a bug, hit a weird edge case, or need a new field? Open an issue on the Actor's Issues tab on Apify Console — we ship fixes weekly and we read every report.