npm Registry Scraper
Pricing
from $2.50 / 1,000 results
Search the public npm registry or look up exact packages, and extract package metadata, version history, maintainers, repository and homepage links, dependencies, and download statistics (day/week/month).
Developer
SolidCode
Maintained by Community
Pull structured package data from the public npm registry at scale: names, versions, maintainers, repository links, license, dependencies, and day / week / month download counts for every package you search or name directly. Search by free-text query (with npm's own qualifiers) or fetch exact packages, including scoped `@org/name` packages, and get back a clean, consistent dataset every time. Built for developers auditing dependencies, security and supply-chain teams vetting packages, and market-research and data teams who need a reliable npm dataset without stitching together the registry's scattered endpoints themselves.
Why This Scraper?
- 31-field flat schema, every key always present: one tidy row per package with explicit `null` when a value is missing, so your CSV, Sheets, and database columns never shift between runs.
- Two modes, no technical toggle: type a query to search, or paste exact package names to look them up directly. Scoped packages like `@types/node` and `@angular/core` are fully supported; the mode is inferred for you.
- npm search qualifiers built in: use `author:sindresorhus`, `keywords:cli`, or `is:unstable` right inside the search box to pull a maintainer's whole portfolio or a single keyword's ecosystem.
- Day, week, and month download counts together: all three windows on every row, not a single period you have to choose up front.
- Dependencies, devDependencies AND peerDependencies: the latest manifest's full runtime, dev, and peer dependency lists, something most npm scrapers skip entirely.
- Full version history with first-publish dates: the complete time-ordered list of every published version plus the package's original `createdAt` timestamp, on demand.
- SPDX license and weekly-download filtering: keep only packages under licenses you allow (`MIT`, `Apache-2.0`, `ISC`) and above a weekly-downloads floor, so abandoned packages drop out before they reach your dataset.
- Multi-keyword search, deduped: pass several keyword tags and get a single merged, duplicate-free result set ranked by relevance or re-sorted by most downloaded.
- Up to 10,000 packages per run: fetch a fixed count or set `0` to walk an entire search to completion under a built-in safety cap.
Use Cases
Dependency Auditing
- Snapshot the exact versions, licenses, and maintainers behind every package in a project's tree
- Compare runtime, dev, and peer dependencies across candidate libraries before adopting one
- Track first-publish and last-modified dates to spot stale or recently revived packages
- Build an internal allowlist of packages that meet your license and download thresholds
Supply-Chain Security
- Vet new dependencies by maintainer, publisher, and version cadence before they reach production
- Filter out low-traffic packages below a weekly-downloads floor that are easier targets for takeover
- Pull the full version history to review release frequency and sudden ownership changes
- Cross-reference repository, homepage, and issue-tracker URLs to confirm a package's provenance
Market & Ecosystem Research
- Map an entire ecosystem with `keywords:` qualifiers: every CLI, testing, or TypeScript package
- Rank a topic's packages by weekly downloads to find the real leaders versus the long tail
- Monitor adoption trends by re-running searches and comparing download counts over time
- Profile a framework's surrounding tooling by license, maintainer, and dependency footprint
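Comparing download counts between two runs of the same search can be sketched in a few lines of Python. This is an illustrative helper, not part of the actor: `download_deltas` is a hypothetical name, the sample rows are made up, and the only assumption about the data is the documented `name` and `downloadsLastWeek` fields.

```python
from typing import Any, Dict, List


def download_deltas(before: List[Dict[str, Any]],
                    after: List[Dict[str, Any]]) -> Dict[str, int]:
    """Weekly-download change per package present in both snapshots."""
    previous = {row["name"]: row.get("downloadsLastWeek") for row in before}
    deltas: Dict[str, int] = {}
    for row in after:
        old = previous.get(row["name"])
        new = row.get("downloadsLastWeek")
        # Skip packages missing from the earlier run or with null counts.
        if old is None or new is None:
            continue
        deltas[row["name"]] = new - old
    return deltas


# Made-up sample rows standing in for two exported datasets.
last_month = [{"name": "pkg-a", "downloadsLastWeek": 1000},
              {"name": "pkg-b", "downloadsLastWeek": None}]
this_month = [{"name": "pkg-a", "downloadsLastWeek": 1500},
              {"name": "pkg-b", "downloadsLastWeek": 300},
              {"name": "pkg-c", "downloadsLastWeek": 50}]

print(download_deltas(last_month, this_month))
```

Packages that appear in only one snapshot, or that have a `null` count in either, are left out rather than treated as zero, so a missing value is never mistaken for a drop in adoption.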
Developer Relations & Lead Generation
- Pull a single maintainer's entire published portfolio with the `author:` qualifier
- Build contact lists of package maintainers (name and email) for outreach and partnerships
- Identify the most-downloaded packages in your category to find integration and sponsorship targets
- Track which authors dominate a keyword to focus community and DevRel efforts
Dataset Building
- Generate clean, consistent package datasets for dashboards, internal tools, and ML pipelines
- Export every field as JSON, CSV, or Excel with stable columns that never drift between runs
- Enrich an existing package list by looking up exact names in bulk, scoped names included
- Feed npm metadata into analytics warehouses with predictable one-row-per-package billing
Getting Started
Simple search
```json
{
  "searchQuery": "react"
}
```
Filtered search
```json
{
  "searchQuery": "http client",
  "licenses": ["MIT"],
  "minDownloadsLastWeek": 100000,
  "sortBy": "downloads",
  "maxResults": 100
}
```
Exact lookup with full detail
```json
{
  "packageNames": ["lodash", "express", "@types/node", "@angular/core"],
  "includeVersions": true,
  "includeDependencies": true
}
```
Input Reference
What to Fetch
| Parameter | Type | Default | Description |
|---|---|---|---|
| `searchQuery` | string | `"react"` | Free-text search across names, descriptions, and keywords. Supports npm qualifiers like `keywords:cli`, `author:sindresorhus`, and `is:unstable`. Leave empty if you only want exact package lookups. |
| `packageNames` | array | `[]` | Fetch specific packages directly, including scoped names like `@types/node`. When provided, the search query and search filters are ignored. |
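The rule in the table above (exact names win over the search query) can be illustrated with a small sketch. It mirrors the documented behavior only; `infer_mode` is a hypothetical helper name, and the actor's real implementation is not shown in this README.

```python
from typing import Any, Dict


def infer_mode(run_input: Dict[str, Any]) -> str:
    """Return 'lookup' when exact package names are given, else 'search'."""
    names = [n.strip() for n in (run_input.get("packageNames") or [])
             if n and n.strip()]
    # Per the table: when packageNames is provided, searchQuery is ignored.
    return "lookup" if names else "search"


print(infer_mode({"searchQuery": "react"}))
print(infer_mode({"searchQuery": "react", "packageNames": ["@types/node"]}))
```

The first call prints `search` and the second prints `lookup`, even though a search query is also present in the second input.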
Filters (search mode only)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `keywords` | array | `[]` | Keep only packages tagged with at least one of these npm keywords. Multiple keywords are merged and deduplicated. |
| `licenses` | array | `[]` | Keep only packages published under at least one of these SPDX license identifiers (`MIT`, `Apache-2.0`, `ISC`). |
| `minDownloadsLastWeek` | integer | `null` | Keep only packages with at least this many downloads in the last week, a quick way to drop abandoned or rarely used packages. |
| `sortBy` | string | "Best match" | "Best match" keeps npm's own relevance ranking; "Most downloaded" re-orders the packages you fetch with the highest weekly downloads first. |
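If you already have an exported dataset and want to tighten it further, the same filters can be approximated client-side. The sketch below is an assumption-laden illustration, not the actor's code: `apply_filters` is a hypothetical name, licenses are compared by exact string match (SPDX expressions such as `(MIT OR Apache-2.0)` are not parsed), and rows with a `null` weekly count are treated as failing any download floor.

```python
from typing import Any, Dict, Iterable, List, Optional


def apply_filters(rows: Iterable[Dict[str, Any]],
                  licenses: Optional[List[str]] = None,
                  min_downloads_last_week: Optional[int] = None,
                  sort_by_downloads: bool = False) -> List[Dict[str, Any]]:
    """Keep rows matching the license allowlist and weekly-download floor."""
    allowed = set(licenses or [])
    kept = []
    for row in rows:
        if allowed and row.get("license") not in allowed:
            continue
        weekly = row.get("downloadsLastWeek")
        if min_downloads_last_week is not None and (
                weekly is None or weekly < min_downloads_last_week):
            continue
        kept.append(row)
    if sort_by_downloads:
        # Highest weekly downloads first; null counts sort last.
        kept.sort(key=lambda r: r.get("downloadsLastWeek") or 0, reverse=True)
    return kept


# Made-up sample rows using the documented field names.
rows = [
    {"name": "a", "license": "MIT", "downloadsLastWeek": 500000},
    {"name": "b", "license": "GPL-3.0", "downloadsLastWeek": 900000},
    {"name": "c", "license": "MIT", "downloadsLastWeek": None},
    {"name": "d", "license": "MIT", "downloadsLastWeek": 2000000},
]
result = apply_filters(rows, licenses=["MIT"],
                       min_downloads_last_week=100000, sort_by_downloads=True)
print([r["name"] for r in result])
```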
What to Include
| Parameter | Type | Default | Description |
|---|---|---|---|
| `includeVersions` | boolean | `false` | Add the complete, time-ordered list of every published version with release dates, plus the package's first-publish date. |
| `includeMaintainers` | boolean | `true` | Add the maintainer list (name and email) to each package. |
| `includeDownloads` | boolean | `true` | Add download counts for the last day, last week, and last month. |
| `includeDependencies` | boolean | `false` | Add the latest version's runtime, dev, and peer dependency lists. |
Results
| Parameter | Type | Default | Description |
|---|---|---|---|
| `maxResults` | integer | `50` | The most packages to return for a search. Set to `0` to fetch every match, with a safety cap of 10,000. Ignored when fetching exact package names. |
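The `maxResults` semantics can be summarized as a tiny function. This is a sketch of the documented rule, with one assumption flagged: the README does not say whether values above 10,000 are clamped or rejected, so clamping here is a guess. `effective_limit` and `SAFETY_CAP` are hypothetical names.

```python
SAFETY_CAP = 10_000  # documented per-run safety cap


def effective_limit(max_results: int = 50) -> int:
    """Translate a maxResults input into the number of rows a search may return."""
    if max_results < 0:
        raise ValueError("maxResults must be 0 or a positive integer")
    if max_results == 0:
        # 0 means "fetch every match", still bounded by the safety cap.
        return SAFETY_CAP
    # Assumption: larger requests are clamped to the cap rather than rejected.
    return min(max_results, SAFETY_CAP)


print(effective_limit(), effective_limit(0), effective_limit(100))
```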
Output
Every package is returned as one flat row. Below is a representative result (download statistics, versions, and dependencies shown with all optional toggles on).
```json
{
  "name": "express",
  "latestVersion": "4.21.2",
  "distTags": { "latest": "4.21.2", "next": "5.0.0" },
  "description": "Fast, unopinionated, minimalist web framework",
  "keywords": ["express", "framework", "web", "rest", "router"],
  "license": "MIT",
  "author": { "name": "TJ Holowaychuk", "email": "tj@vision-media.ca", "url": null },
  "publisher": "wesleytodd",
  "maintainers": [{ "name": "wesleytodd", "email": "wes@example.com" }],
  "homepageUrl": "http://expressjs.com/",
  "repositoryUrl": "https://github.com/expressjs/express",
  "bugsUrl": "https://github.com/expressjs/express/issues",
  "npmUrl": "https://www.npmjs.com/package/express",
  "createdAt": "2010-12-29T19:38:25.450Z",
  "modifiedAt": "2024-12-09T17:21:04.012Z",
  "versionCount": 281,
  "versions": [{ "version": "4.21.2", "publishedAt": "2024-12-09T17:21:03.000Z" }],
  "dependencies": { "accepts": "~1.3.8", "body-parser": "1.20.3" },
  "devDependencies": { "mocha": "10.7.3", "supertest": "6.3.0" },
  "peerDependencies": {},
  "downloadsLastDay": 4218903,
  "downloadsLastWeek": 28734512,
  "downloadsLastMonth": 121908447,
  "scoreFinal": 0.91,
  "scoreQuality": 0.95,
  "scorePopularity": 0.99,
  "scoreMaintenance": 0.78,
  "searchScore": 100023.45,
  "dependentsCount": 84213,
  "recordType": "package",
  "scrapedAt": "2026-06-13T10:42:00.000Z"
}
```
Core Fields
| Field | Type | Description |
|---|---|---|
| `name` | string | Package name, including scope (e.g. `@types/node`). |
| `latestVersion` | string | Version published under the `latest` dist-tag. |
| `distTags` | object | All distribution tags mapped to their version (`latest`, `next`, `beta`). |
| `description` | string | Short package description. |
| `keywords` | array | Keyword tags declared by the package. |
| `license` | string | SPDX license identifier. |
| `recordType` | string | Always `"package"`. |
| `scrapedAt` | string | ISO 8601 timestamp of when the row was collected. |
Links
| Field | Type | Description |
|---|---|---|
| `homepageUrl` | string | Package homepage. |
| `repositoryUrl` | string | Source repository URL, normalized to a browsable link. |
| `bugsUrl` | string | Issue tracker URL. |
| `npmUrl` | string | Canonical npm page for the package. |
People
| Field | Type | Description |
|---|---|---|
| `author` | object | Declared author `{ name, email, url }`. |
| `publisher` | string | Username that published the indexed version. |
| `maintainers` | array | List of maintainers `{ name, email }` (when maintainers are included). |
Timestamps & Versions
| Field | Type | Description |
|---|---|---|
| `createdAt` | string | First-publish timestamp of the package (when versions are included). |
| `modifiedAt` | string | Last-modified timestamp (when versions are included). |
| `versionCount` | integer | Total number of published versions. |
| `versions` | array | Time-ordered `{ version, publishedAt }` list (when versions are included). |
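These timestamp fields are enough to estimate how stale a package is and how often it ships. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, timestamps are assumed to be ISO 8601 strings ending in `Z` as in the sample output, and the three-entry `versions` list is invented for illustration.

```python
from datetime import datetime
from typing import Any, Dict, List


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def days_since_modified(row: Dict[str, Any], now: datetime) -> int:
    """Whole days between the row's modifiedAt and a reference time."""
    return (now - parse_ts(row["modifiedAt"])).days


def days_between_releases(versions: List[Dict[str, str]]) -> List[int]:
    """Gaps in whole days between consecutive published versions."""
    stamps = sorted(parse_ts(v["publishedAt"]) for v in versions)
    return [(later - earlier).days for earlier, later in zip(stamps, stamps[1:])]


# modifiedAt and scrapedAt taken from the sample output row.
row = {"modifiedAt": "2024-12-09T17:21:04.012Z"}
scraped_at = parse_ts("2026-06-13T10:42:00.000Z")
print(days_since_modified(row, scraped_at))

# Invented version history, for illustration only.
versions = [
    {"version": "1.0.0", "publishedAt": "2024-01-01T00:00:00.000Z"},
    {"version": "1.0.1", "publishedAt": "2024-01-11T00:00:00.000Z"},
    {"version": "1.1.0", "publishedAt": "2024-03-01T00:00:00.000Z"},
]
print(days_between_releases(versions))
```

Using `scrapedAt` as the reference time, rather than the current clock, keeps the staleness figure reproducible when the same dataset is analyzed later.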
Downloads & Scores
| Field | Type | Description |
|---|---|---|
| `downloadsLastDay` | integer | Downloads in the last day (when downloads are included). |
| `downloadsLastWeek` | integer | Downloads in the last week. |
| `downloadsLastMonth` | integer | Downloads in the last month. |
| `scoreFinal` | number | npm blended search score (search mode). |
| `scoreQuality` | number | Quality sub-score. |
| `scorePopularity` | number | Popularity sub-score. |
| `scoreMaintenance` | number | Maintenance sub-score. |
| `searchScore` | number | npm search relevance score (search mode). |
| `dependentsCount` | integer | Number of packages that depend on this one, when reported. |
Dependencies
| Field | Type | Description |
|---|---|---|
| `dependencies` | object | Runtime dependencies of the latest version (when dependencies are included). |
| `devDependencies` | object | Development dependencies of the latest version. |
| `peerDependencies` | object | Peer dependencies of the latest version. |
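When comparing candidate libraries, the three dependency objects can be reduced to a footprint and an overlap. This is a hedged sketch: the function names are hypothetical, the second row is invented, and `null` dependency fields (for example when the dependencies toggle is off) are treated as empty.

```python
from typing import Any, Dict, List

DEPENDENCY_FIELDS = ("dependencies", "devDependencies", "peerDependencies")


def dependency_counts(row: Dict[str, Any]) -> Dict[str, int]:
    """Number of entries in each dependency object; null counts as zero."""
    return {field: len(row.get(field) or {}) for field in DEPENDENCY_FIELDS}


def shared_runtime_dependencies(a: Dict[str, Any], b: Dict[str, Any]) -> List[str]:
    """Runtime dependency names that both packages declare."""
    return sorted(set(a.get("dependencies") or {}) & set(b.get("dependencies") or {}))


# First row follows the sample output; the second is made up.
express = {
    "dependencies": {"accepts": "~1.3.8", "body-parser": "1.20.3"},
    "devDependencies": {"mocha": "10.7.3", "supertest": "6.3.0"},
    "peerDependencies": {},
}
other = {"dependencies": {"accepts": "^1.3.0"}, "devDependencies": None,
         "peerDependencies": None}

print(dependency_counts(express))
print(shared_runtime_dependencies(express, other))
```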
Tips for Best Results
- Use the `author:` qualifier in the search box to pull a maintainer's entire published portfolio in one run, for example `author:sindresorhus`.
- List scoped packages exactly as `@scope/name` in Exact package names (`@babel/core`, `@types/node`, `@angular/core`); the search box is ignored for those runs.
- Turn on Include full version history only when you need release timelines or the first-publish date; it pulls each package's full document and adds the most weight to a run.
- Set a Minimum weekly downloads floor (try 10,000 or 100,000) to cut abandoned packages out of broad searches before they reach your dataset.
- Very high download thresholds or a rare license force the scraper to scan many packages to find enough matches, so a run can take a few minutes; pair them with a specific search query rather than a broad one to keep runs fast.
- Choose Most downloaded sorting to surface the real leaders in a category; it re-orders the packages you fetch by weekly downloads.
- Combine multiple Keyword tags to map a whole ecosystem at once; results are merged and de-duplicated automatically.
- Set Maximum packages to `0` to walk an entire search to completion, bounded by the 10,000-package safety cap.
Pricing
From $2.50 per 1,000 results, undercutting comparable npm registry actors while returning a richer, fully consistent schema. Bronze, Silver, and Gold subscribers pay progressively less; the table below shows total cost at each discount tier.
| Results | No discount | Bronze | Silver | Gold |
|---|---|---|---|---|
| 100 | $0.30 | $0.28 | $0.27 | $0.25 |
| 1,000 | $3.00 | $2.80 | $2.65 | $2.50 |
| 10,000 | $30.00 | $28.00 | $26.50 | $25.00 |
| 100,000 | $300.00 | $280.00 | $265.00 | $250.00 |
One result is one package row in your dataset. No compute or time-based charges: you pay per result, plus a small fixed per-run start fee.
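The pricing table reduces to a per-1,000 rate for each tier, which makes budgeting easy to script. The sketch below uses only the rates shown in the table; `estimate_cost` is a hypothetical helper, and the per-run start fee is deliberately left out because this README does not state its amount.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rates per 1,000 results, read from the pricing table above.
RATE_PER_1000 = {
    "none": Decimal("3.00"),
    "bronze": Decimal("2.80"),
    "silver": Decimal("2.65"),
    "gold": Decimal("2.50"),
}


def estimate_cost(results: int, tier: str = "none") -> Decimal:
    """Per-result cost in USD, excluding the unspecified per-run start fee."""
    rate = RATE_PER_1000[tier]
    cost = rate * results / 1000
    # Round half up to whole cents, matching the table's $0.27 for 100 Silver results.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(estimate_cost(10_000, "silver"))
print(estimate_cost(100, "silver"))
```

`Decimal` is used instead of floats so that amounts such as 0.265 round the same way every time.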
Integrations
Export data in JSON, CSV, Excel, XML, or RSS. Connect to 1,500+ apps via:
- Zapier / Make / n8n: Workflow automation
- Google Sheets: Direct spreadsheet export
- Slack / Email: Notifications on new results
- Webhooks: Trigger custom APIs on run completion
- Apify API: Full programmatic access
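For programmatic access, a run can be started over HTTP. The sketch below only builds the request and does not send it. It assumes the Apify API v2 `run-sync-get-dataset-items` endpoint and bearer-token authentication; check the Apify API reference before relying on either. The actor ID `example-user/npm-registry-scraper` and the token are placeholders, since this README does not state the actor's actual ID.

```python
import json
import urllib.request
from typing import Any, Dict

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str,
                      run_input: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) a POST that runs an actor and returns its items."""
    # The API addresses actors as "username~actor-name", so swap the slash.
    path_id = actor_id.replace("/", "~")
    url = f"{API_BASE}/acts/{path_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder actor ID and token; replace both with real values.
request = build_run_request(
    "example-user/npm-registry-scraper",
    "YOUR_APIFY_TOKEN",
    {"packageNames": ["lodash", "@types/node"], "includeDependencies": True},
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would execute the run and bill for the returned rows, so it is left out of the sketch.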
Legal & Ethical Use
This actor collects publicly available information from the npm registry. Use it responsibly and in compliance with npm's terms of service and all applicable laws. Maintainer names and emails are published openly by package authors as part of the registry; handle any personal data in line with applicable privacy regulations such as GDPR and CCPA, and only for legitimate purposes. You are responsible for how you use the data you collect.