npm Registry Scraper

Pricing

from $2.50 / 1,000 results

[💰 $2.5 / 1K] Search the public npm registry or look up exact packages, and extract package metadata, version history, maintainers, repository and homepage links, dependencies, and download statistics (day/week/month).

Developer

SolidCode

Maintained by Community

Pull structured package data from the public npm registry at scale: names, versions, maintainers, repository links, license, dependencies, and day / week / month download counts for every package you search or name directly. Search by free-text query (with npm's own qualifiers) or fetch exact packages, including scoped names like @org/name, and get back a clean, consistent dataset every time. Built for developers auditing dependencies, security and supply-chain teams vetting packages, and market-research and data teams who need a reliable npm dataset without stitching together the registry's scattered endpoints themselves.

Why This Scraper?

  • 31-field flat schema, every key always present: one tidy row per package with explicit null when a value is missing, so your CSV, Sheets, and database columns never shift between runs.
  • Two modes, no technical toggle: type a query to search, or paste exact package names to look them up directly. Scoped packages like @types/node and @angular/core are fully supported; the mode is inferred for you.
  • npm search qualifiers built in: use author:sindresorhus, keywords:cli, or is:unstable right inside the search box to pull a maintainer's whole portfolio or a single keyword's ecosystem.
  • Day, week, and month download counts together: all three windows on every row, not a single period you have to choose up front.
  • Dependencies, devDependencies AND peerDependencies: the latest manifest's full runtime, dev, and peer dependency lists, something most npm scrapers skip entirely.
  • Full version history with first-publish dates: the complete time-ordered list of every published version plus the package's original createdAt timestamp, on demand.
  • SPDX license and weekly-download filtering: keep only packages under licenses you allow (MIT, Apache-2.0, ISC) and above a weekly-downloads floor, so abandoned packages drop out before they reach your dataset.
  • Multi-keyword search, deduped: pass several keyword tags and get a single merged, duplicate-free result set ranked by relevance or re-sorted by most downloaded.
  • Up to 10,000 packages per run: fetch a fixed count, or set maxResults to 0 to walk an entire search to completion under a built-in safety cap.

Use Cases

Dependency Auditing

  • Snapshot the exact versions, licenses, and maintainers behind every package in a project's tree
  • Compare runtime, dev, and peer dependencies across candidate libraries before adopting one
  • Track first-publish and last-modified dates to spot stale or recently revived packages
  • Build an internal allowlist of packages that meet your license and download thresholds

Supply-Chain Security

  • Vet new dependencies by maintainer, publisher, and version cadence before they reach production
  • Filter out low-traffic packages below a weekly-downloads floor that are easier targets for takeover
  • Pull the full version history to review release frequency and sudden ownership changes
  • Cross-reference repository, homepage, and issue-tracker URLs to confirm a package's provenance

Market & Ecosystem Research

  • Map an entire ecosystem with keywords: qualifiers, for example every CLI, testing, or TypeScript package
  • Rank a topic's packages by weekly downloads to find the real leaders versus the long tail
  • Monitor adoption trends by re-running searches and comparing download counts over time
  • Profile a framework's surrounding tooling by license, maintainer, and dependency footprint

Developer Relations & Lead Generation

  • Pull a single maintainer's entire published portfolio with the author: qualifier
  • Build contact lists of package maintainers (name and email) for outreach and partnerships
  • Identify the most-downloaded packages in your category to find integration and sponsorship targets
  • Track which authors dominate a keyword to focus community and DevRel efforts

Dataset Building

  • Generate clean, consistent package datasets for dashboards, internal tools, and ML pipelines
  • Export every field as JSON, CSV, or Excel with stable columns that never drift between runs
  • Enrich an existing package list by looking up exact names in bulk, scoped names included
  • Feed npm metadata into analytics warehouses with predictable one-row-per-package billing

Getting Started

Simple search

{
  "searchQuery": "react"
}

Search with license and download filters

{
  "searchQuery": "http client",
  "licenses": ["MIT"],
  "minDownloadsLastWeek": 100000,
  "sortBy": "downloads",
  "maxResults": 100
}

Exact lookup with full detail

{
  "packageNames": ["lodash", "express", "@types/node", "@angular/core"],
  "includeVersions": true,
  "includeDependencies": true
}

Input Reference

What to Fetch

| Parameter | Type | Default | Description |
|---|---|---|---|
| searchQuery | string | "react" | Free-text search across names, descriptions, and keywords. Supports npm qualifiers like keywords:cli, author:sindresorhus, and is:unstable. Leave empty if you only want exact package lookups. |
| packageNames | array | [] | Fetch specific packages directly, including scoped names like @types/node. When provided, the search query and search filters are ignored. |
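
The precedence rule in the table above (exact package names win over the search query) can be sketched as a small helper that assembles a run input. This is an illustration only, not the actor's own code: `build_run_input` and `SEARCH_ONLY` are hypothetical names, and raising an error when both inputs are empty is an assumption, since the page does not say what the actor does in that case.

```python
from typing import Any, Dict, List, Optional

# Parameters this reference describes as applying to search mode only.
SEARCH_ONLY = {"keywords", "licenses", "minDownloadsLastWeek", "sortBy", "maxResults"}


def build_run_input(search_query: str = "",
                    package_names: Optional[List[str]] = None,
                    **options: Any) -> Dict[str, Any]:
    """Assemble a run input; exact package names take priority over search."""
    names = [n.strip() for n in (package_names or []) if n.strip()]
    if names:
        # Exact-lookup mode: the query and search-only options are documented
        # as ignored, so they are dropped to keep the input unambiguous.
        extras = {k: v for k, v in options.items() if k not in SEARCH_ONLY}
        return {"packageNames": names, **extras}
    if not search_query.strip():
        # Assumption: treat "nothing to fetch" as a caller error.
        raise ValueError("provide a search query or at least one package name")
    return {"searchQuery": search_query.strip(), **options}


lookup = build_run_input("react", ["lodash", "@types/node"], includeVersions=True)
search = build_run_input("http client", licenses=["MIT"], maxResults=100)
```

Dropping the ignored keys client-side is a design choice that makes a saved input reflect what the run will actually do.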

Filters (search mode only)

| Parameter | Type | Default | Description |
|---|---|---|---|
| keywords | array | [] | Keep only packages tagged with at least one of these npm keywords. Multiple keywords are merged and deduplicated. |
| licenses | array | [] | Keep only packages published under at least one of these SPDX license identifiers (MIT, Apache-2.0, ISC). |
| minDownloadsLastWeek | integer | null | Keep only packages with at least this many downloads in the last week; a quick way to drop abandoned or rarely used packages. |
| sortBy | string | "Best match" | Best match keeps npm's own relevance ranking; Most downloaded re-orders the packages you fetch with the highest weekly downloads first. |
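
To make the filter semantics above concrete, here is a rough local equivalent applied to rows that have already been exported. It is a sketch of the described behavior, not the actor's implementation: `filter_rows` is a hypothetical helper, and dropping rows whose `downloadsLastWeek` is null when a floor is set is an assumption.

```python
from typing import Any, Dict, Iterable, List, Optional


def filter_rows(rows: Iterable[Dict[str, Any]],
                licenses: Optional[List[str]] = None,
                min_downloads_last_week: Optional[int] = None,
                most_downloaded: bool = False) -> List[Dict[str, Any]]:
    """Keep rows matching any allowed license and the weekly-download floor."""
    allowed = set(licenses or [])
    kept = []
    for row in rows:
        # License filter: the row must match at least one allowed identifier.
        if allowed and row.get("license") not in allowed:
            continue
        weekly = row.get("downloadsLastWeek")
        # Download floor: rows with unknown (null) counts are dropped (assumption).
        if min_downloads_last_week is not None and (
                weekly is None or weekly < min_downloads_last_week):
            continue
        kept.append(row)
    if most_downloaded:
        # "Most downloaded" ordering: highest weekly downloads first.
        kept.sort(key=lambda r: r.get("downloadsLastWeek") or 0, reverse=True)
    return kept
```

Without `most_downloaded`, the input order is preserved, which corresponds to keeping whatever ranking the rows arrived in.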

What to Include

| Parameter | Type | Default | Description |
|---|---|---|---|
| includeVersions | boolean | false | Add the complete, time-ordered list of every published version with release dates, plus the package's first-publish date. |
| includeMaintainers | boolean | true | Add the maintainer list (name and email) to each package. |
| includeDownloads | boolean | true | Add download counts for the last day, last week, and last month. |
| includeDependencies | boolean | false | Add the latest version's runtime, dev, and peer dependency lists. |

Results

| Parameter | Type | Default | Description |
|---|---|---|---|
| maxResults | integer | 50 | The most packages to return for a search. Set to 0 to fetch every match, with a safety cap of 10,000. Ignored when fetching exact package names. |

Output

Every package is returned as one flat row. Below is a representative result (download statistics, versions, and dependencies shown with all optional toggles on).

{
  "name": "express",
  "latestVersion": "4.21.2",
  "distTags": { "latest": "4.21.2", "next": "5.0.0" },
  "description": "Fast, unopinionated, minimalist web framework",
  "keywords": ["express", "framework", "web", "rest", "router"],
  "license": "MIT",
  "author": { "name": "TJ Holowaychuk", "email": "tj@vision-media.ca", "url": null },
  "publisher": "wesleytodd",
  "maintainers": [{ "name": "wesleytodd", "email": "wes@example.com" }],
  "homepageUrl": "http://expressjs.com/",
  "repositoryUrl": "https://github.com/expressjs/express",
  "bugsUrl": "https://github.com/expressjs/express/issues",
  "npmUrl": "https://www.npmjs.com/package/express",
  "createdAt": "2010-12-29T19:38:25.450Z",
  "modifiedAt": "2024-12-09T17:21:04.012Z",
  "versionCount": 281,
  "versions": [{ "version": "4.21.2", "publishedAt": "2024-12-09T17:21:03.000Z" }],
  "dependencies": { "accepts": "~1.3.8", "body-parser": "1.20.3" },
  "devDependencies": { "mocha": "10.7.3", "supertest": "6.3.0" },
  "peerDependencies": {},
  "downloadsLastDay": 4218903,
  "downloadsLastWeek": 28734512,
  "downloadsLastMonth": 121908447,
  "scoreFinal": 0.91,
  "scoreQuality": 0.95,
  "scorePopularity": 0.99,
  "scoreMaintenance": 0.78,
  "searchScore": 100023.45,
  "dependentsCount": 84213,
  "recordType": "package",
  "scrapedAt": "2026-06-13T10:42:00.000Z"
}
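
Because every key is always present, a row like the one above maps onto a fixed set of CSV columns. The sketch below shows one way to do that with the Python standard library; `rows_to_csv` is a hypothetical helper, and encoding nested objects and arrays as JSON strings (and null as an empty cell) is an assumption about what is convenient, not a description of the platform's own CSV export.

```python
import csv
import io
import json
from typing import Any, Dict, Iterable, List


def rows_to_csv(rows: Iterable[Dict[str, Any]], fieldnames: List[str]) -> str:
    """Write rows to CSV text with a fixed column order."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        flat = {}
        for key in fieldnames:
            value = row.get(key)
            if value is None:
                flat[key] = ""  # null becomes an empty cell
            elif isinstance(value, (dict, list)):
                # Nested values are serialized so each row stays one CSV line.
                flat[key] = json.dumps(value, sort_keys=True)
            else:
                flat[key] = value
        writer.writerow(flat)
    return buf.getvalue()


sample = [{"name": "express", "license": "MIT", "keywords": ["web"], "bugsUrl": None}]
csv_text = rows_to_csv(sample, ["name", "license", "keywords", "bugsUrl"])
```

Passing the column list explicitly, rather than deriving it from the first row, keeps the header stable even if a caller only wants a subset of the fields.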

Core Fields

| Field | Type | Description |
|---|---|---|
| name | string | Package name, including scope (e.g. @types/node). |
| latestVersion | string | Version published under the latest dist-tag. |
| distTags | object | All distribution tags mapped to their version (latest, next, beta). |
| description | string | Short package description. |
| keywords | array | Keyword tags declared by the package. |
| license | string | SPDX license identifier. |
| recordType | string | Always "package". |
| scrapedAt | string | ISO 8601 timestamp of when the row was collected. |

Links

| Field | Type | Description |
|---|---|---|
| homepageUrl | string | Package homepage. |
| repositoryUrl | string | Source repository URL, normalized to a browsable link. |
| bugsUrl | string | Issue tracker URL. |
| npmUrl | string | Canonical npm page for the package. |

People

| Field | Type | Description |
|---|---|---|
| author | object | Declared author { name, email, url }. |
| publisher | string | Username that published the indexed version. |
| maintainers | array | List of maintainers { name, email } (when maintainers are included). |

Timestamps & Versions

| Field | Type | Description |
|---|---|---|
| createdAt | string | First-publish timestamp of the package (when versions are included). |
| modifiedAt | string | Last-modified timestamp (when versions are included). |
| versionCount | integer | Total number of published versions. |
| versions | array | Time-ordered { version, publishedAt } list (when versions are included). |
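
The `versions` list lends itself to simple release-cadence checks, such as those mentioned under Supply-Chain Security. The following is a minimal sketch, assuming `publishedAt` values are ISO 8601 strings ending in "Z" as in the sample output; `parse_ts` and `summarize_cadence` are hypothetical helper names.

```python
from datetime import datetime
from typing import Any, Dict, List


def parse_ts(value: str) -> datetime:
    """Parse a timestamp such as 2024-12-09T17:21:03.000Z."""
    # Replace the trailing "Z" so older Python versions can parse it too.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_cadence(versions: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return the release count plus mean and longest gap between releases, in days."""
    times = sorted(parse_ts(v["publishedAt"]) for v in versions)
    gaps = [(b - a).total_seconds() / 86400 for a, b in zip(times, times[1:])]
    if not gaps:
        # Zero or one release: no gap can be computed.
        return {"releases": len(times), "meanGapDays": None, "maxGapDays": None}
    return {
        "releases": len(times),
        "meanGapDays": sum(gaps) / len(gaps),
        "maxGapDays": max(gaps),
    }
```

A long `maxGapDays` followed by a burst of releases is one signal of a revived package worth a closer look; the thresholds are left to the reader.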

Downloads & Scores

| Field | Type | Description |
|---|---|---|
| downloadsLastDay | integer | Downloads in the last day (when downloads are included). |
| downloadsLastWeek | integer | Downloads in the last week. |
| downloadsLastMonth | integer | Downloads in the last month. |
| scoreFinal | number | npm blended search score (search mode). |
| scoreQuality | number | Quality sub-score. |
| scorePopularity | number | Popularity sub-score. |
| scoreMaintenance | number | Maintenance sub-score. |
| searchScore | number | npm search relevance score (search mode). |
| dependentsCount | integer | Number of packages that depend on this one, when reported. |

Dependencies

| Field | Type | Description |
|---|---|---|
| dependencies | object | Runtime dependencies of the latest version (when dependencies are included). |
| devDependencies | object | Development dependencies of the latest version. |
| peerDependencies | object | Peer dependencies of the latest version. |
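
For dependency audits it is often easier to work with one edge per dependency than with three nested objects. The sketch below flattens a row's dependency fields into such a list; `flatten_dependencies` and the output keys (`package`, `dependsOn`, `range`, `kind`) are hypothetical names chosen for illustration.

```python
from typing import Any, Dict, List

DEPENDENCY_FIELDS = ("dependencies", "devDependencies", "peerDependencies")


def flatten_dependencies(row: Dict[str, Any]) -> List[Dict[str, str]]:
    """Turn the three dependency objects of one row into a flat edge list."""
    edges = []
    for kind in DEPENDENCY_FIELDS:
        # A field may be null when dependencies were not requested.
        for dep, version_range in sorted((row.get(kind) or {}).items()):
            edges.append({
                "package": row["name"],
                "dependsOn": dep,
                "range": version_range,
                "kind": kind,
            })
    return edges
```

Each edge can then be joined against an allowlist or loaded into a table keyed by `dependsOn`.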

Tips for Best Results

  • Use the author: qualifier in the search box to pull a maintainer's entire published portfolio in one run, for example author:sindresorhus.
  • List scoped packages exactly as @scope/name in Exact package names (for example @babel/core, @types/node, @angular/core); the search box is ignored for those runs.
  • Turn on Include full version history only when you need release timelines or the first-publish date; it pulls each package's full document and adds the most weight to a run.
  • Set a Minimum weekly downloads floor (try 10,000 or 100,000) to cut abandoned packages out of broad searches before they reach your dataset.
  • A very high download threshold or a rare license forces the scraper to scan many packages to find enough matches, so a run can take a few minutes; pair such filters with a specific search query rather than a broad one to keep runs fast.
  • Choose Most downloaded sorting to surface the real leaders in a category; it re-orders the packages you fetch by weekly downloads.
  • Combine multiple Keyword tags to map a whole ecosystem at once; results are merged and de-duplicated automatically.
  • Set Maximum packages to 0 to walk an entire search to completion, bounded by the 10,000-package safety cap.
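
Qualifier-based queries like the ones in these tips can be composed programmatically before being passed as `searchQuery`. The helper below is a hypothetical sketch; it uses only the qualifiers named on this page (`author:`, `keywords:`, `is:unstable`), and joining several keywords with commas is an assumption based on npm's search syntax that should be checked against npm's documentation.

```python
from typing import List, Optional


def build_search_query(text: str = "",
                       author: Optional[str] = None,
                       keywords: Optional[List[str]] = None,
                       unstable: bool = False) -> str:
    """Compose a search string from free text plus optional qualifiers."""
    parts = [text.strip()] if text.strip() else []
    if author:
        parts.append(f"author:{author}")
    if keywords:
        # Assumption: comma-separated keywords match any of the listed tags.
        parts.append("keywords:" + ",".join(keywords))
    if unstable:
        parts.append("is:unstable")
    return " ".join(parts)


portfolio_query = build_search_query(author="sindresorhus")
cli_query = build_search_query("cli", keywords=["cli", "terminal"])
```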

Pricing

From $2.50 per 1,000 results at the Gold tier, which undercuts comparable npm registry actors while returning a richer, fully consistent schema. Bronze, Silver, and Gold subscribers pay progressively less; the table below shows total cost at each discount tier.

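| Results | No discount | Bronze | Silver | Gold |
|---|---|---|---|---|
| 100 | $0.30 | $0.28 | $0.27 | $0.25 |
| 1,000 | $3.00 | $2.80 | $2.65 | $2.50 |
| 10,000 | $30.00 | $28.00 | $26.50 | $25.00 |
| 100,000 | $300.00 | $280.00 | $265.00 | $250.00 |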

One result is one package row in your dataset. There are no compute or time-based charges: you pay per result, plus a small fixed per-run start fee.
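
The per-result arithmetic behind the pricing table can be reproduced with a short estimator. This is a sketch using only the per-1,000 rates listed above; `estimate_cost` and the tier keys are hypothetical names, and the per-run start fee is left out because its amount is not stated here.

```python
from decimal import Decimal, ROUND_HALF_UP

# Price per 1,000 results at each tier, taken from the pricing table.
RATE_PER_1000 = {
    "none": Decimal("3.00"),
    "bronze": Decimal("2.80"),
    "silver": Decimal("2.65"),
    "gold": Decimal("2.50"),
}


def estimate_cost(results: int, tier: str = "none") -> Decimal:
    """Estimate the per-result charge in USD, excluding the per-run start fee."""
    cost = Decimal(results) * RATE_PER_1000[tier] / Decimal(1000)
    # Round to whole cents, half up, which matches the table's 100-result row.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


silver_100 = estimate_cost(100, "silver")
gold_10k = estimate_cost(10_000, "gold")
```

`Decimal` is used instead of floats so that amounts such as 0.265 round to cents predictably.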

Integrations

Export data in JSON, CSV, Excel, XML, or RSS. Connect to 1,500+ apps via:

  • Zapier / Make / n8n: Workflow automation
  • Google Sheets: Direct spreadsheet export
  • Slack / Email: Notifications on new results
  • Webhooks: Trigger custom APIs on run completion
  • Apify API: Full programmatic access
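
As a rough illustration of programmatic access, the sketch below prepares an HTTP request against Apify's synchronous "run and return dataset items" endpoint using only the standard library. The actor ID and token shown are placeholders, not real values, and `build_run_request` is a hypothetical helper; check the Apify API reference for the exact endpoint behavior and limits before relying on it.

```python
import json
import urllib.request
from typing import Any, Dict

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str,
                      run_input: Dict[str, Any]) -> urllib.request.Request:
    """Prepare (but do not send) a POST that runs an actor and returns its rows."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholders: substitute the actor's real ID and your own API token.
request = build_run_request("<username>~<actor-name>", "<APIFY_TOKEN>",
                            {"packageNames": ["lodash"], "includeDependencies": True})
# To execute the run, something like:
#   with urllib.request.urlopen(request) as response:
#       rows = json.load(response)
```

Building the request separately from sending it keeps the example free of network calls and makes the payload easy to inspect first.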

This actor collects publicly available information from the npm registry. Use it responsibly and in compliance with npm's terms of service and all applicable laws. Maintainer names and emails are published openly by package authors as part of the registry; handle any personal data in line with applicable privacy regulations such as GDPR and CCPA, and only for legitimate purposes. You are responsible for how you use the data you collect.