NPM Registry Scraper

Pricing

from $3.00 / 1,000 results

Scrape NPM package metadata, version history, maintainers, dependents, and download stats from the public NPM registry. Search packages or pull a specific list of package names.

Rating

5.0

(10)

Developer

Crawler Bros

Maintained by Community

Actor stats

Bookmarked: 10
Total users: 2
Monthly active users: 1
Last modified: 2 days ago

Scrape the public NPM package registry — search packages or look up by exact name. Extract package metadata (name, latest version, description, license, keywords, author, maintainers, repository), score signals (quality / popularity / maintenance), and weekly download counts. HTTP-only via registry.npmjs.org and api.npmjs.org. No auth, no proxy, no rate-limit drama.

What this actor does

  • Two modes: search (free-text package search) and lookup (exact package names)
  • Rich metadata: name, latest version, dist-tags, description, keywords, license, homepage, repository, bugs, maintainers, author, created/modified timestamps
  • NPM search score signals: final score, quality, popularity, maintenance, plus the per-query searchScore
  • Download stats: weekly / monthly / daily download counts via api.npmjs.org/downloads
  • Filters: minimum weekly downloads, keyword tags, SPDX license
  • Optional version history: pass includeVersions: true to emit the full sorted list of published versions
  • Empty fields are omitted — no nulls or blank strings reach the dataset
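
As a rough illustration of the two hosts mentioned above, the sketch below builds request URLs for the public registry search endpoint and the download-count endpoint. The helper names are hypothetical, and whether the actor calls exactly these paths internally is an assumption; the period values mirror the downloadsPeriod input.

```python
from urllib.parse import urlencode

REGISTRY_HOST = "https://registry.npmjs.org"
DOWNLOADS_HOST = "https://api.npmjs.org"

# Same three values the downloadsPeriod input accepts.
VALID_PERIODS = ("last-day", "last-week", "last-month")


def search_url(query: str, size: int = 20) -> str:
    """Build a free-text search URL for the public registry."""
    return f"{REGISTRY_HOST}/-/v1/search?{urlencode({'text': query, 'size': size})}"


def downloads_url(package: str, period: str = "last-week") -> str:
    """Build a point-in-time download-count URL for one unscoped package."""
    if period not in VALID_PERIODS:
        raise ValueError(f"period must be one of {VALID_PERIODS}")
    return f"{DOWNLOADS_HOST}/downloads/point/{period}/{package}"


if __name__ == "__main__":
    print(search_url("react"))
    print(downloads_url("express", "last-month"))
```

Both helpers only build strings, so they can be checked without touching the network.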

Output per package

  • name, latestVersion, distTags (when more than just latest exists)
  • description, keywords[]
  • license (SPDX where available; falls back to license.type / license.name dict shape)
  • homepageUrl, repositoryUrl, bugsUrl, npmUrl
  • author{name, email, url} (handles both string and dict shapes)
  • maintainers[] — array of {name, email, url}
  • createdAt, modifiedAt — ISO 8601 timestamps
  • downloadsLastWeek (or downloads_day / downloads_month per downloadsPeriod)
  • scoreFinal, score_quality, score_popularity, score_maintenance, searchScore (search mode only)
  • versions[] — when includeVersions=true
  • recordType: "package", scrapedAt
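
The shape handling described in this list (string or dict authors, the license fallback, and omitted empty fields) could look roughly like the sketch below. The function names are hypothetical and this is not the actor's own code; it only assumes the conventional package.json person string "Name <email> (url)".

```python
import re
from typing import Any, Optional

# Conventional package.json person string: "Name <email> (url)".
_PERSON_RE = re.compile(r"^\s*([^<(]*?)\s*(?:<([^>]*)>)?\s*(?:\(([^)]*)\))?\s*$")


def drop_empty(record: dict[str, Any]) -> dict[str, Any]:
    """Remove keys whose value is None, an empty string, or an empty container."""
    return {k: v for k, v in record.items() if v not in (None, "", [], {})}


def normalize_person(raw: Any) -> dict[str, str]:
    """Accept a 'Name <email> (url)' string or a dict; return a compact dict."""
    if isinstance(raw, str):
        match = _PERSON_RE.match(raw)
        name, email, url = match.groups() if match else (raw, None, None)
        return drop_empty({"name": name, "email": email, "url": url})
    if isinstance(raw, dict):
        return drop_empty({k: raw.get(k) for k in ("name", "email", "url")})
    return {}


def normalize_license(raw: Any) -> Optional[str]:
    """Prefer a plain SPDX string; fall back to the dict shape's type or name."""
    if isinstance(raw, str):
        return raw or None
    if isinstance(raw, dict):
        return raw.get("type") or raw.get("name")
    return None
```

Zero and False survive drop_empty, so a numeric field such as a download count of 0 is not mistaken for a missing value.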

Input

| Field | Type | Default | Description |
|---|---|---|---|
| mode | string | search | search or lookup |
| searchQuery | string | react | Free-text NPM search (mode=search) |
| packageNames | array | | Exact package names (mode=lookup); supports scoped like @types/node |
| downloadsPeriod | string | last-week | last-day / last-week / last-month |
| minDownloadsLastWeek | int | | Drop packages below this weekly download count |
| keywordAnyOf | array | [] | Only emit packages tagged with at least one of these keywords |
| licenseAnyOf | array | [] | Only emit packages whose SPDX license matches |
| includeVersions | bool | false | Emit full version history |
| includeMaintainers | bool | true | Include maintainer list |
| includeDownloads | bool | true | Fetch download counts |
| maxItems | int | 50 | Hard cap (1–5000) |
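
To make the filter semantics concrete, here is a minimal sketch of how minDownloadsLastWeek, keywordAnyOf, and licenseAnyOf might be applied to one output record. The function is hypothetical, and the case-insensitive matching and the treatment of a missing download count as 0 are assumptions, not documented behavior.

```python
from typing import Any, Iterable, Optional


def passes_filters(
    package: dict[str, Any],
    min_downloads_last_week: Optional[int] = None,
    keyword_any_of: Iterable[str] = (),
    license_any_of: Iterable[str] = (),
) -> bool:
    """Return True when a package record survives all configured filters."""
    # Assumption: a record without a download count is treated as 0 downloads.
    downloads = package.get("downloadsLastWeek", 0)
    if min_downloads_last_week is not None and downloads < min_downloads_last_week:
        return False

    # "Any of" semantics: one shared keyword is enough.
    wanted_keywords = {k.lower() for k in keyword_any_of}
    if wanted_keywords:
        tagged = {k.lower() for k in package.get("keywords", [])}
        if not wanted_keywords & tagged:
            return False

    # Assumption: license identifiers are compared case-insensitively.
    wanted_licenses = {name.lower() for name in license_any_of}
    if wanted_licenses and package.get("license", "").lower() not in wanted_licenses:
        return False

    return True
```

Empty filter lists are treated as "no filter", which matches the [] defaults in the table.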

Example: search for top React packages

{
  "mode": "search",
  "searchQuery": "react",
  "minDownloadsLastWeek": 10000,
  "licenseAnyOf": ["MIT"],
  "maxItems": 50
}

Example: lookup specific packages

{
  "mode": "lookup",
  "packageNames": ["express", "lodash", "@types/node"],
  "includeVersions": true
}

Example: TypeScript ecosystem audit

{
  "mode": "search",
  "searchQuery": "typescript types",
  "keywordAnyOf": ["typescript"],
  "maxItems": 200
}
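
Any of the example inputs above can also be submitted programmatically. The sketch below uses the apify-client Python package; ACTOR_ID is a placeholder (the real identifier is shown on the actor's store page), and build_lookup_input and run_scraper are hypothetical helper names.

```python
from typing import Any, Iterator

# Placeholder: replace with the identifier shown on the actor's store page.
ACTOR_ID = "<username>/<actor-name>"


def build_lookup_input(package_names: list[str], include_versions: bool = False) -> dict[str, Any]:
    """Assemble a lookup-mode input using the field names from the Input table."""
    return {
        "mode": "lookup",
        "packageNames": list(package_names),
        "includeVersions": include_versions,
    }


def run_scraper(token: str, run_input: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Start a run, wait for it to finish, and yield each dataset item."""
    # Imported lazily so the helper above works without the dependency installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        return
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()


if __name__ == "__main__":
    print(build_lookup_input(["express", "lodash", "@types/node"], include_versions=True))
```

Calling run_scraper requires a valid Apify API token and a real actor identifier; the input builder can be used on its own.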

Use cases

  • Open-source intelligence — track package adoption, license distribution, ecosystem trends
  • Security teams — audit dependencies, track maintainer churn, detect license changes
  • DevRel & growth — find similar / competing packages, monitor share of voice
  • VC due diligence — adoption telemetry on open-source startups (downloads, maintainers, repo activity)
  • Build tools — automate npm-trends-style dashboards
  • License compliance — bulk-fetch SPDX identifiers across an entire dependency tree

FAQ

Does it require an NPM account? No. The NPM registry is fully public.

Is there a rate limit? NPM is generous on the public registry (no documented hard limit for normal scraping). The actor inserts small delays between requests to stay polite.

What's the difference between latestVersion and distTags? latestVersion is the dist-tags.latest value (the "stable" release). distTags includes any other tags like next, beta, legacy.
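
A small sketch of that relationship, assuming the registry's dist-tags mapping as input. The function name is hypothetical, and whether the emitted distTags repeats the latest entry is an assumption.

```python
def split_dist_tags(dist_tags: dict[str, str]) -> dict[str, object]:
    """Derive latestVersion, and emit distTags only when other tags exist."""
    out: dict[str, object] = {}
    latest = dist_tags.get("latest")
    if latest:
        out["latestVersion"] = latest
    # Assumption: the full mapping (including "latest") is kept when emitted.
    if any(tag != "latest" for tag in dist_tags):
        out["distTags"] = dict(dist_tags)
    return out
```

A package tagged only with latest yields just latestVersion; one that also carries next or beta yields both fields.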

Why are some fields empty? The actor omits empty fields rather than emit nulls. For example, bugsUrl is only present if the package's package.json lists one.

Can I get download history (time series)? Not in v1 — we emit a single point-in-time count for the chosen period. For time-series download data, query api.npmjs.org/downloads/range/... separately.

How do scoped packages work? Use the full scoped name (e.g. @types/node) in packageNames. The actor URL-encodes correctly.
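
For reference, this is one way to encode a scoped name when building a registry metadata URL yourself: the "/" between scope and package is percent-encoded while "@" stays literal. The helper name is hypothetical and the sketch is not the actor's own code.

```python
from urllib.parse import quote

REGISTRY_HOST = "https://registry.npmjs.org"


def package_metadata_url(name: str) -> str:
    """Build the registry metadata URL for a scoped or unscoped package name."""
    # Keep "@" as-is, but escape the "/" separating scope and package name.
    return f"{REGISTRY_HOST}/{quote(name, safe='@')}"


if __name__ == "__main__":
    print(package_metadata_url("@types/node"))
    print(package_metadata_url("express"))
```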

Are tarball / unpacked size included? Not in v1. The registry payload exposes them per-version under versions[<v>].dist. We can add a latestTarball field if needed.
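
Until such a field exists, the values can be read from a registry metadata document fetched separately. The sketch below assumes the usual payload layout (dist-tags, versions[<v>].dist with tarball and unpackedSize); the function and output key names are hypothetical.

```python
from typing import Any


def latest_dist_info(registry_doc: dict[str, Any]) -> dict[str, Any]:
    """Pull tarball URL and unpacked size for the version tagged as latest."""
    latest = registry_doc.get("dist-tags", {}).get("latest")
    dist = registry_doc.get("versions", {}).get(latest, {}).get("dist", {})
    info = {
        "version": latest,
        "tarballUrl": dist.get("tarball"),
        # Not every published version reports an unpacked size.
        "unpackedSize": dist.get("unpackedSize"),
    }
    # Mirror the actor's convention of omitting missing fields.
    return {key: value for key, value in info.items() if value is not None}
```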

What's the difference between score.final and searchScore? score.final (emitted as scoreFinal) is NPM's overall package score (0–1), computed from the quality, popularity, and maintenance signals. searchScore is the relevance score for your specific search query, so it is only present in search mode.