Pricing

Pay per event

NPM Package Scraper — npm metadata api

Pull rich metadata for any NPM package via the npm registry API — current version, dependencies, weekly downloads, repo URL, license, keywords, README excerpt, deprecation flag — export to JSON or CSV. Free npm registry + downloads API, no key required.

Developer

DevilScrapes

Last modified: 2 months ago

🎯 What this scrapes

The npm registry exposes a JSON endpoint at registry.npmjs.org/<package> for every published package, plus a separate download-count API at api.npmjs.org/downloads/point/last-week/<pkg>. This Actor accepts a list of package names (including scoped packages like @apify/sdk), fans them out in parallel, merges both API responses, and writes one clean structured row per package. Think of it as a production-grade npm metadata API wrapper: no auth tokens, no manual pagination, no stitching two endpoints together yourself.
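The merge described above can be sketched in a few lines. This is an illustration under stated assumptions, not the Actor's actual code: `registry_url`, `downloads_url`, and `merge_row` are hypothetical helper names, only a handful of output fields are shown, and the downloads API is assumed to accept a scoped name unencoded in the path.

```python
from typing import Optional
from urllib.parse import quote

REGISTRY = "https://registry.npmjs.org"
DOWNLOADS = "https://api.npmjs.org/downloads/point/last-week"


def registry_url(name: str) -> str:
    # Encode the slash in scoped names (@scope/name -> @scope%2Fname); keep "@".
    return f"{REGISTRY}/{quote(name, safe='@')}"


def downloads_url(name: str) -> str:
    # Assumption: the downloads API takes the scoped name as-is in the path.
    return f"{DOWNLOADS}/{name}"


def merge_row(packument: dict, downloads: Optional[dict]) -> dict:
    """Combine a registry document and a downloads response into one flat row."""
    latest = packument["dist-tags"]["latest"]
    manifest = packument["versions"][latest]
    return {
        "name": packument["name"],
        "version": latest,
        "license": manifest.get("license"),
        "dependencies": manifest.get("dependencies"),
        "deprecated": manifest.get("deprecated"),
        "weekly_downloads": downloads.get("downloads") if downloads else None,
        "published_at": packument.get("time", {}).get(latest),
    }
```

Passing `None` for the downloads response mirrors a run with `includeDownloads` switched off: the row is still produced, with `weekly_downloads` left empty.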

🔥 Features

What the Actor does:

Parallel bulk lookup — configurable concurrency (default 8) across the registry + downloads APIs; process hundreds of packages in a single run.
Scoped package support — @scope/name form works natively, no URL-encoding gymnastics required.
Weekly download counts — optional merge with the NPM downloads API (one extra request per package, toggleable).
Deprecation detection — surfaces the full deprecation message when a version is marked deprecated, so your audit pipeline catches it automatically.
Full dependency maps — dependencies, devDependencies, and peerDependencies as structured objects, not raw strings.
Structured, validated output — Pydantic-validated rows with ISO-8601 timestamps and stable field names; export as JSON, CSV, or Excel from Apify Console in one click.
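To make the concurrency setting concrete, here is a minimal sketch of a bounded parallel lookup. It assumes an async `fetch_one(name)` coroutine that you supply; `lookup_all` is a hypothetical name and the Actor's internals may differ.

```python
import asyncio


async def lookup_all(names, fetch_one, concurrency=8):
    """Run fetch_one(name) for every name with at most `concurrency` in flight."""
    gate = asyncio.Semaphore(concurrency)

    async def guarded(name):
        async with gate:
            return await fetch_one(name)

    # gather() returns results in the same order as the input names.
    return await asyncio.gather(*(guarded(n) for n in names))
```

A semaphore keeps the limit in one place, so raising or lowering `concurrency` does not change any other code.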

What we handle for you:

🛡️ Browser fingerprint rotation — curl-cffi replays real Chrome / Firefox / Safari TLS handshakes so every request looks like a genuine browser, not a Python script.
🌐 Residential proxy rotation via Apify Proxy — fresh session ID and exit IP on every block so you never burn a single IP.
🔁 Retries with exponential backoff on 408 / 429 / 5xx — up to 5 attempts per package, Retry-After respected.
🧱 Rate-limit-aware pacing — we slow down when the target pushes back instead of getting the run banned.
🧊 Clean, typed dataset rows — Pydantic-validated, ISO-8601 timestamps, stable field names, JSON / CSV / Excel export straight from Apify Console.
💰 Pay-Per-Event pricing: a small start fee per run plus a per-row charge for results that land in your dataset. No subscription.
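The retry behaviour listed above (up to 5 attempts, exponential backoff, Retry-After respected) could look roughly like the sketch below. The names `parse_retry_after`, `backoff_delay`, and `fetch_with_retries` are hypothetical, `send` is any callable you provide that returns `(status, headers, body)`, and the base delay, cap, and jitter are illustrative choices, not the Actor's real values.

```python
import random
import time

RETRYABLE = {408, 429} | set(range(500, 600))


def parse_retry_after(value):
    """Return Retry-After as seconds, or None if missing or given as an HTTP date."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def backoff_delay(attempt, retry_after=None, base=1.0, cap=30.0):
    """Seconds to wait after failed attempt number `attempt` (1-based)."""
    if retry_after is not None:
        return min(retry_after, cap)  # the server's hint wins
    # Exponential growth with full jitter, capped.
    return random.uniform(0, min(cap, base * 2 ** (attempt - 1)))


def fetch_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call send() until the status is not retryable or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status not in RETRYABLE or attempt == max_attempts:
            return status, body
        sleep(backoff_delay(attempt, parse_retry_after(headers.get("Retry-After"))))
```

Injecting `sleep` makes the loop easy to test without real waiting.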

💡 Use cases

Dependency audit — score every package in your package.json for weekly downloads, license compliance, and deprecation status before merging.
Vendor benchmarking — compare competing libraries side-by-side on download trends, maintenance activity, and known issues.
Supply-chain monitoring — feed the output into Socket, Snyk, or a custom risk dashboard to catch newly-deprecated dependencies before they ship.
SDK download leaderboards — track weekly download momentum for your own packages and competitors over time.
AI / RAG knowledge graphs: seed an LLM index with structured npm metadata API responses for package-aware code assistants.
Hiring intel — find org members listed as maintainers on widely-used packages to inform developer outreach.
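For the dependency-audit case, the Actor input can be generated straight from a package.json file. A minimal sketch, assuming you want runtime and dev dependencies; `input_from_package_json` is a hypothetical helper, not something the Actor ships.

```python
import json


def input_from_package_json(text, include_dev=True):
    """Build an Actor input dict from the contents of a package.json file."""
    manifest = json.loads(text)
    names = set(manifest.get("dependencies", {}))
    if include_dev:
        names |= set(manifest.get("devDependencies", {}))
    # Sorted for a stable, diff-friendly input.
    return {"packages": sorted(names), "includeDownloads": True}
```

The version ranges in package.json are dropped on purpose, since the Actor resolves the `latest` dist-tag only.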

⚙️ How to use it

Click Try for free at the top of the Store page.
Paste your package list into the packages field — one name per line or as a JSON array, scoped packages included.
Toggle includeDownloads on if you want last-week download counts (adds one extra request per package).
Click Start. Output streams into the run's dataset in real time.
Export from Storage → Dataset as JSON, CSV, or Excel — or pull via the Apify API with your token.
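Pulling the results over the Apify API, as in the last step, might look like the following standard-library sketch. `dataset_items_url` and `fetch_rows` are hypothetical names; the dataset ID comes from your own run, and the token is sent as a bearer header here rather than in the URL.

```python
import json
import urllib.request
from urllib.parse import urlencode


def dataset_items_url(dataset_id, fmt="json"):
    """URL of the dataset items endpoint for the requested export format."""
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode({'format': fmt})}"


def fetch_rows(dataset_id, token):
    """Download all rows of a dataset as a list of dicts (performs a network call)."""
    request = urllib.request.Request(
        dataset_items_url(dataset_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Switching `fmt` to `csv` gives a URL for the CSV export instead of JSON.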

📥 Input

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| `packages` | `array` | yes | `["express", "react", "@apify/sdk"]` | Package names to look up. Scoped packages (`@scope/name`) are supported. |
| `includeDownloads` | `boolean` | no | `true` | Merges last-week download count from the npm downloads API. One extra request per package. |
| `concurrency` | `integer` | no | `8` | Parallel requests to the registry. Raise carefully; aggressive concurrency can trigger rate-limiting. |
| `proxyConfiguration` | `object` | no | `{"useApifyProxy": true}` | Proxy settings. We recommend leaving this on Apify Proxy, which keeps session state consistent across retries. |

Example input

{
  "packages": [
    "express",
    "react",
    "@apify/sdk"
  ],
  "includeDownloads": true,
  "concurrency": 8,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}

📤 Output

One dataset row per package.

| Field | Type | Notes |
| --- | --- | --- |
| `name` | `string` | Package name, including `@scope` if present. |
| `version` | `string` | Current `latest` dist-tag version. |
| `description` | `string \| null` | Package description string. |
| `homepage` | `string \| null` | Project homepage URL. |
| `license` | `string \| null` | SPDX license identifier (e.g. `MIT`, `Apache-2.0`). |
| `author` | `string \| null` | Author display string. |
| `maintainers` | `array` | List of maintainer login names. |
| `keywords` | `array` | Tags from `package.json`. |
| `repository_url` | `string \| null` | Source repo URL (typically `git+https://github.com/...`). |
| `bugs_url` | `string \| null` | Bug-tracker URL if defined. |
| `dist_tarball` | `string \| null` | Tarball download URL for the `latest` version. |
| `engines` | `object \| null` | Node/npm version constraints from the `engines` field. |
| `dependencies` | `object \| null` | Runtime dependency map (`name → version range`). |
| `dev_dependencies` | `object \| null` | Dev dependency map. |
| `peer_dependencies` | `object \| null` | Peer dependency map. |
| `deprecated` | `string \| null` | Full deprecation message when the version is deprecated; `null` otherwise. |
| `weekly_downloads` | `integer \| null` | Downloads in the last 7 days (populated when `includeDownloads` is `true`). |
| `package_url` | `string` | Canonical `npmjs.com` package page URL. |
| `published_at` | `string \| null` | ISO-8601 publish timestamp for the `latest` version. |
| `scraped_at` | `string` | ISO-8601 timestamp when this row was recorded. |

Example output

{
  "name": "express",
  "version": "4.21.2",
  "description": "Fast, unopinionated, minimalist web framework",
  "license": "MIT",
  "maintainers": ["wesleytodd", "dougwilson"],
  "keywords": ["express", "framework", "sinatra", "web", "rest", "restful", "router", "app"],
  "repository_url": "git+https://github.com/expressjs/express.git",
  "dependencies": {
    "accepts": "~1.3.8",
    "array-flatten": "1.1.1"
  },
  "deprecated": null,
  "weekly_downloads": 31000000,
  "package_url": "https://www.npmjs.com/package/express",
  "published_at": "2024-03-25T14:09:03.000Z",
  "scraped_at": "2026-06-01T10:00:00.000Z"
}

💰 Pricing

Pay-Per-Event — you pay only when these events fire:

| Event | Cost (USD) | What it covers |
| --- | --- | --- |
| `actor-start` | $0.005 | One-off warm-up charge per run |
| `result` | $0.0015 | Per dataset row written |

Example: 1,000 packages in a single run cost 1,000 × $0.0015 + $0.005 = $1.505, or about $1.51. No subscription, no minimum, no card to start; every new Apify account gets $5 of free credit.
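As a quick check of the example above, the two event rates from the table can be combined like this. A minimal sketch; `estimate_cost` is a hypothetical helper and not part of the Actor.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.005")   # charged once per run
PER_RESULT = Decimal("0.0015")   # charged per dataset row


def estimate_cost(rows, runs=1):
    """Estimated charge in USD for `rows` dataset rows spread over `runs` runs."""
    return ACTOR_START * runs + PER_RESULT * rows
```

`Decimal` avoids the rounding surprises of binary floats when summing many small charges. Splitting the same 1,000 packages over several runs adds one start fee per run.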

Compared to calling the npm metadata API yourself: no rate-limit management, no session handling, no merging two separate API endpoints, no retry logic. We charge about $1.50 per thousand rows; you save the engineering hours.

🚧 Limitations

latest dist-tag only — per-version lookup (e.g. express@4.18.0) is not yet supported.
Pre-release tags excluded — alpha, beta, next dist-tags are not resolved.
Tarball content not extracted — we return the tarball URL; we do not download or unpack it.
Downloads API cap — the NPM downloads API hard-caps at 128 packages per bulk call; we fan out automatically, but very large batches will take proportionally longer.
Private packages — scoped private packages (@org/internal-pkg) return a 404 from the public registry and are skipped with a log warning.
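If you call the bulk downloads endpoint yourself, batching to the 128-package cap looks roughly like this. A sketch only: `bulk_download_urls` is a hypothetical helper, the comma-separated URL form is an assumption based on npm's public downloads API, and scoped packages may need individual requests, so check npm's documentation before relying on it.

```python
from itertools import islice

BULK_LIMIT = 128  # cap stated in the limitations above


def bulk_download_urls(names, size=BULK_LIMIT):
    """Yield one downloads-API URL per batch of at most `size` package names."""
    it = iter(names)
    while batch := list(islice(it, size)):
        yield "https://api.npmjs.org/downloads/point/last-week/" + ",".join(batch)
```

For 300 unscoped names this produces three URLs covering 128, 128, and 44 packages.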

❓ FAQ

What exactly is the npm metadata API?

The npm registry exposes https://registry.npmjs.org/<package> as a JSON document with full package metadata: versions, maintainers, dependencies, dist tarballs, and more. A separate endpoint at https://api.npmjs.org/downloads/point/last-week/<pkg> returns download counts. This Actor calls both, merges the responses, and delivers clean structured rows. You get a production-grade npm metadata API pipeline without maintaining the plumbing yourself.

Is this an npm registry scraper or an API wrapper?

Both. The registry exposes clean JSON, but stitching two API hosts, handling scoped package names, managing retries, and staying inside rate limits is real engineering work. We do all of that and return validated rows. Think of it as an npm registry scraper that handles the messy bits for you.

Are download counts exact?

Not exactly. npm's downloads API is widely understood to carry a noise band of roughly 5% because of bot-traffic filtering on npm's side. Trust the trend and relative magnitude; don't treat individual numbers as exact.

Can I look up a specific version instead of latest?

Not yet. Per-version lookup is on the roadmap. For now, if you pass name@version, the version part is ignored and the Actor resolves to latest. Vote or comment on the Actor's Issues tab to prioritise this.
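If your input list contains name@version entries, you can separate the two parts yourself before or after a run. A minimal sketch; `split_spec` is a hypothetical helper, and the only subtlety is that a leading "@" marks a scope rather than a version.

```python
def split_spec(spec):
    """Split 'name@version' into (name, version); version is None when absent."""
    # Only an "@" after position 0 separates the version; position 0 is the scope marker.
    at = spec.rfind("@")
    if at > 0:
        return spec[:at], spec[at + 1:]
    return spec, None
```

Keeping the requested version alongside each row lets you see where the returned `latest` differs from what you pinned.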

What happens when a package has been unpublished?

We log a 404, skip the row, and continue with the remaining packages. The final dataset will be shorter than your input list; the run log will list every skipped package.
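Because skipped packages leave no row, comparing your input list with the dataset shows what is missing without reading the log. A minimal sketch; `skipped_packages` is a hypothetical helper that relies only on the `name` field from the output table.

```python
def skipped_packages(requested, rows):
    """Return requested names that have no row in the dataset, in input order."""
    found = {row["name"] for row in rows}
    return [name for name in requested if name not in found]
```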

Why is the repository_url prefixed with git+?

That is npm's canonical format. Strip the git+ prefix if your downstream tool expects a bare HTTPS URL.
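Stripping the prefix is a one-liner, shown here as a small function. `browsable_repo_url` is a hypothetical name, and dropping the trailing `.git` is an optional extra for tools that expect a plain web URL.

```python
def browsable_repo_url(url):
    """Turn 'git+https://host/org/repo.git' into 'https://host/org/repo'."""
    if url is None:
        return None  # repository_url is nullable in the output
    if url.startswith("git+"):
        url = url[len("git+"):]
    if url.endswith(".git"):
        url = url[: -len(".git")]
    return url
```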

Can I use this alongside the PyPI Package Scraper?

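Yes. DevilScrapes' PyPI Package Scraper does the same job for Python packages. Run both and join the results on package purpose, using your own mapping of equivalent packages since the two registries share no common key, to produce an npm-vs-PyPI ecosystem comparison dataset.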

💬 Your feedback

Spotted a bug, hit a weird edge case, or need an additional field? Open an issue on the Actor's Issues tab in Apify Console — we ship fixes weekly and we read every report.
