NPM Package Scraper — Downloads, Maintainers, Deps & SBOM

Pricing

from $2.00 / 1,000 results

Export every NPM package by keyword, maintainer, scope or name. Get version, license, repo URL, maintainers, daily/weekly/monthly downloads, dependents, deprecation, full deps, version history. Official NPM registry + stats API. For devtool intel, SBOM and OSS outreach.

Developer

Logiover

Maintained by Community

NPM Package Scraper — Search, Downloads, Maintainers, Dependents, Versions & SBOM Data

Discover and export every NPM package matching a keyword, maintainer, scope or direct name. Each package record includes name, version, description, keywords, license, repository URL, homepage, npm URL, full maintainers list, daily, weekly, monthly and yearly download counts, dependents count, deprecation status, runtime, dev and peer dependencies, file engines, exports map, and full version history with publish dates.

Built on the official open NPM Registry API + npmjs downloads stats API — no token, no proxy, no scraping.

Suited to developer-tool competitive intelligence, dependency security and SBOM workflows, open-source maintainer outreach, sponsorship targeting, package marketplace seeding, and JavaScript / Node.js ecosystem analytics in general.


🚀 What does this NPM scraper do?

Two complementary input modes — combine them or use one alone:

| Mode | When to use |
| --- | --- |
| Search Terms | Discover packages matching a free-text query, scope, maintainer or keyword tag; paginated up to thousands of results per query |
| Package Names | You already know which packages: pass exact names (including scoped @org/pkg) |

For each package, the actor combines three sources in one record:

  1. Search snippet — search score, dependents count, basic downloads
  2. Registry full doc (registry.npmjs.org/{pkg}) — full metadata, all versions, dist info, dependencies, maintainers
  3. Downloads stats (api.npmjs.org/downloads/*) — last-day / week / month / year point counts + optional daily history series
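
The merge of the three sources above can be pictured roughly as follows. This is a minimal sketch, not the actor's actual code: the function name `merge_sources` is hypothetical, the input shapes are simplified versions of what the public search, registry and downloads endpoints return, and only a handful of output fields are filled in.

```python
from typing import Any, Optional


def merge_sources(search_hit: dict[str, Any],
                  registry_doc: Optional[dict[str, Any]],
                  downloads: Optional[dict[str, int]]) -> dict[str, Any]:
    """Merge a search hit, a registry document and point stats into one record.

    Hypothetical helper: the registry doc and the downloads are optional,
    mirroring the fetchPackageDetails / fetchDownloadsStats switches.
    """
    pkg = search_hit.get("package", {})
    record: dict[str, Any] = {
        "name": pkg.get("name"),
        "version": pkg.get("version"),
        "searchScore": search_hit.get("searchScore"),
    }
    if registry_doc:
        # The registry doc is the more authoritative source for the latest version.
        latest = registry_doc.get("dist-tags", {}).get("latest")
        manifest = registry_doc.get("versions", {}).get(latest, {})
        record["version"] = latest or record["version"]
        record["license"] = manifest.get("license")
        record["maintainers"] = [m.get("name") for m in registry_doc.get("maintainers", [])]
        record["dependencies"] = manifest.get("dependencies", {})
    if downloads:
        # Assumed shape: one count per period, keyed by the period name.
        record["weeklyDownloads"] = downloads.get("last-week")
        record["monthlyDownloads"] = downloads.get("last-month")
    return record
```

With both optional sources missing, the record falls back to what the search snippet alone provides.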

💡 Use cases

  • Devtool competitive intel — find every JS package mentioning react + chart and rank by weekly downloads
  • OSS sponsorship outreach — pull every popular package's maintainers + GitHub repository URL → CRM
  • Dependency security / SBOM — feed package metadata + dependency tree into a vulnerability monitor
  • Package marketplace seeding — discover every Apache-2.0 / MIT package in a niche above 10k downloads
  • Maintainer talent intel — find every maintainer of @apify/* packages for recruiting / partnership
  • Trend monitoring — track weekly download deltas with fetchDownloadsHistory to surface fast-growing packages
  • Migration analysis — pull every package using a deprecated dep and prioritize outreach
  • AI training data — every package's README + description is high-quality NL data about JS APIs
  • OSS health dashboards — combine deprecation status + last-publish-date + maintainer count → "is this package alive?"
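
For the trend-monitoring use case, a week-over-week delta can be derived from the `downloadsHistory` series once it is in the dataset. The sketch below assumes the `[{day, downloads}]` shape shown in the output fields; the function name `weekly_growth` and the choice of comparing the last 7 days with the 7 days before are illustrative assumptions, not part of the actor.

```python
from typing import Optional


def weekly_growth(history: list[dict]) -> Optional[float]:
    """Relative change between the last 7 days and the 7 days before them.

    Returns None when fewer than 14 days are available or when the
    earlier week has zero downloads (growth is undefined in that case).
    """
    if len(history) < 14:
        return None
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    ordered = sorted(history, key=lambda entry: entry["day"])
    previous = sum(entry["downloads"] for entry in ordered[-14:-7])
    latest = sum(entry["downloads"] for entry in ordered[-7:])
    if previous == 0:
        return None
    return (latest - previous) / previous
```

Sorting records by this value in descending order gives a simple "fast-growing packages" list.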

⚙️ Input configuration

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| searchTerms | string[] | [] | NPM registry search queries (paginated 250/page). |
| packageNames | string[] | [] | Direct package names (incl. @scope/name). |
| maintainer | string | "" | Search-only filter: maintainer:<username>. |
| scope | string | "" | Search-only filter: scope:<scope> (without @). |
| keywords | string[] | [] | Search-only filter: keywords:<tag> for each entry. |
| maxResultsPerSearch | integer | 100 | Hard cap per search term. |
| fetchPackageDetails | boolean | true | Pull full registry doc per package; adds maintainers, dist-tags, repository, license, version history, dependencies, deprecation, file engines, exports. |
| fetchDownloadsStats | boolean | true | Add last-day / week / month / year download counts via the npmjs stats API. |
| fetchDownloadsHistory | boolean | false | Pull per-day downloads time series; adds downloadsHistory: [{day,downloads},…]. |
| downloadsHistoryRange | string | "last-month" | Range for history (last-week / last-month / last-year or YYYY-MM-DD:YYYY-MM-DD). |
| fetchReadme | boolean | false | Include the package's raw README markdown in each record (requires fetchPackageDetails). |
| minMonthlyDownloads | integer | 0 | Client-side floor: drop packages below this monthly download count. |
| includeDeprecated | boolean | true | Set to false to skip packages marked deprecated by maintainers. |
| minVersionPublishDate | string | null | Drop packages whose latest version was published before this date (YYYY-MM-DD). |
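
To illustrate how the search-only filters relate to a search term, the sketch below appends the `maintainer:`, `scope:` and `keywords:` qualifiers to the free text. How the actor really combines them is not documented here, so the space-joined format and the helper name `build_search_text` are assumptions.

```python
from typing import Optional


def build_search_text(term: str, maintainer: str = "", scope: str = "",
                      keywords: Optional[list[str]] = None) -> str:
    """Combine a free-text term with the search-only filter qualifiers.

    Assumption: qualifiers are appended to the term, separated by spaces.
    """
    parts = [term.strip()] if term.strip() else []
    if maintainer:
        parts.append(f"maintainer:{maintainer}")
    if scope:
        # The scope input is documented as "without @"; tolerate it anyway.
        parts.append(f"scope:{scope.lstrip('@')}")
    for tag in keywords or []:
        parts.append(f"keywords:{tag}")
    return " ".join(parts)
```

For example, the term "react component library" with keywords ["react", "ui"] would become `react component library keywords:react keywords:ui` under this assumption.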

📦 Output fields

| Field | Description | Example |
| --- | --- | --- |
| name | Package name | "apify" |
| scope | @scope (if scoped) | "@apify" |
| version | Latest published version | "3.7.2" |
| description | Short description | "The scalable web crawling..." |
| keywords | Author-supplied keywords | ["headless","chrome","puppeteer"] |
| license | SPDX license ID | "Apache-2.0" |
| homepage | Project homepage | "https://docs.apify.com/sdk/js" |
| repositoryUrl | Source repository URL | "https://github.com/apify/apify-sdk-js" |
| repositoryType | git, svn, etc. | "git" |
| bugsUrl | Bug tracker URL | "https://github.com/.../issues" |
| npmUrl | NPM page URL | "https://www.npmjs.com/package/apify" |
| publishedAt | Latest version publish date | "2026-05-18T..." |
| createdAt | Original creation date | "2017-11-..." |
| lastModified | Last modification | "2026-05-18T..." |
| maintainers | List of maintainer usernames | ["mtrunkat","jancurn"] |
| maintainersCount | Count | 11 |
| publisherUsername | Last publisher | "GitHub Actions" |
| author | Author display | "Apify Technologies s.r.o." |
| distTags | All dist-tags | {"latest":"3.7.2","beta":"3.8.0-beta.1"} |
| versionCount | Total published versions | 1082 |
| versionHistory | All versions + publish dates | [{version,publishedAt}, …] |
| engines | Node/npm engine constraints | {"node":">=18"} |
| exports | Package exports map | {...} |
| main | Main entry point | "build/index.js" |
| type | "module" / "commonjs" | "module" |
| dependencies | Runtime deps | {...} |
| devDependencies | Dev deps | {...} |
| peerDependencies | Peer deps | {...} |
| optionalDependencies | Optional deps | {...} |
| dependencyCount | Runtime dep count | 12 |
| deprecated | Deprecation message | null |
| isDeprecated | Bool | false |
| tarballUrl | Direct tarball | "https://registry.npmjs.org/.../apify-3.7.2.tgz" |
| unpackedSize | Unpacked size in bytes | 1234567 |
| fileCount | File count in tarball | 42 |
| shasum | Tarball SHA-1 | "abc..." |
| integrity | SRI integrity hash | "sha512-..." |
| dailyDownloads | Last-day downloads | 1247 |
| weeklyDownloads | Last-week downloads | 44630 |
| monthlyDownloads | Last-month downloads | 153258 |
| yearlyDownloads | Last-year downloads | 1900000 |
| downloadsHistory | Per-day series (optional) | [{day:"2026-05-01",downloads:5200}, …] |
| dependentsCount | Packages depending on this | 77 |
| searchScore | NPM search relevance score | 1422.05 |
| readme | Full README markdown (optional) | "# Apify SDK\n…" |
| scrapedAt | UTC scrape timestamp | "2026-05-18T07:30:00Z" |
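
As one way to consume these fields, the sketch below combines `isDeprecated`, `publishedAt` and `maintainersCount` into a rough "is this package alive?" check. The function name `looks_alive` and the thresholds (one year, one maintainer) are illustrative assumptions; tune them to your own definition of healthy.

```python
from datetime import datetime, timezone


def looks_alive(record: dict, now: datetime,
                max_age_days: int = 365, min_maintainers: int = 1) -> bool:
    """Heuristic health check over one output record (thresholds are assumptions)."""
    if record.get("isDeprecated"):
        return False
    published = record.get("publishedAt")
    if not published:
        return False
    # Timestamps end in "Z"; older Python versions need an explicit offset.
    published_at = datetime.fromisoformat(published.replace("Z", "+00:00"))
    if published_at.tzinfo is None:
        published_at = published_at.replace(tzinfo=timezone.utc)
    age_days = (now - published_at).days
    return (age_days <= max_age_days
            and (record.get("maintainersCount") or 0) >= min_maintainers)
```

Pass a timezone-aware `now`, for example `datetime.now(timezone.utc)`, so the age calculation compares like with like.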

🧪 Example inputs

1. Top scraping packages with full stats

{
  "searchTerms": ["scraper", "crawler", "puppeteer"],
  "maxResultsPerSearch": 50,
  "fetchPackageDetails": true,
  "fetchDownloadsStats": true,
  "minMonthlyDownloads": 1000
}

2. All packages in the @apify scope

{
  "searchTerms": ["scope:apify"],
  "scope": "apify",
  "maxResultsPerSearch": 500
}

3. Direct package list with downloads history

{
  "packageNames": ["react", "vue", "svelte", "solid-js", "preact", "lit"],
  "fetchPackageDetails": true,
  "fetchDownloadsStats": true,
  "fetchDownloadsHistory": true,
  "downloadsHistoryRange": "last-year"
}

4. Fresh packages only (published in 2026)

{
  "searchTerms": ["typescript starter"],
  "maxResultsPerSearch": 200,
  "fetchPackageDetails": true,
  "minVersionPublishDate": "2026-01-01"
}

5. Healthy + non-deprecated React UI libraries

{
  "searchTerms": ["react component library"],
  "keywords": ["react", "ui"],
  "maxResultsPerSearch": 200,
  "includeDeprecated": false,
  "minMonthlyDownloads": 10000
}
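
The client-side filters used in this example (`minMonthlyDownloads`, `includeDeprecated`) and in the previous one (`minVersionPublishDate`) can be thought of as a predicate applied to each finished record. The sketch below is an assumption about that behavior, with a hypothetical function name `passes_filters`; it operates on the documented output fields only.

```python
from datetime import date
from typing import Optional


def passes_filters(record: dict,
                   min_monthly_downloads: int = 0,
                   include_deprecated: bool = True,
                   min_version_publish_date: Optional[str] = None) -> bool:
    """Return True when a record survives the three client-side filters."""
    if (record.get("monthlyDownloads") or 0) < min_monthly_downloads:
        return False
    if not include_deprecated and record.get("isDeprecated"):
        return False
    if min_version_publish_date:
        # Compare on the date part (YYYY-MM-DD) of the ISO timestamp.
        published = (record.get("publishedAt") or "")[:10]
        if not published:
            return False
        if date.fromisoformat(published) < date.fromisoformat(min_version_publish_date):
            return False
    return True
```

Because these checks need download counts and publish dates, they can only run after the per-package enrichment, which is why the input table describes them as client-side.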

6. SBOM-style: full dependency map for a list of packages

{
  "packageNames": ["next", "@types/react", "tailwindcss", "drizzle-orm"],
  "fetchPackageDetails": true,
  "fetchDownloadsStats": false,
  "fetchReadme": true
}
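
For SBOM-style processing, the four dependency maps in each record can be flattened into an edge list that a downstream tool can ingest. This is a minimal sketch; the function name `dependency_edges` and the tuple layout are hypothetical. Note that the values are the declared semver ranges from the manifest, not resolved versions, so this describes direct dependencies only.

```python
def dependency_edges(record: dict) -> list[tuple[str, str, str, str]]:
    """Flatten a record's dependency maps into (package, dep, range, kind) tuples."""
    kinds = {
        "dependencies": "runtime",
        "devDependencies": "dev",
        "peerDependencies": "peer",
        "optionalDependencies": "optional",
    }
    edges = []
    for field, kind in kinds.items():
        # A missing or null map is treated as "no dependencies of this kind".
        deps = record.get(field) or {}
        for dep, version_range in sorted(deps.items()):
            edges.append((record["name"], dep, version_range, kind))
    return edges
```

Running the actor again on the names found in these edges would extend the map one level deeper per run.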

7. OSS sponsorship outreach: maintainers of top scraping libs

{
  "searchTerms": ["scraper", "crawler"],
  "maxResultsPerSearch": 1000,
  "fetchPackageDetails": true,
  "fetchDownloadsStats": true,
  "minMonthlyDownloads": 5000,
  "includeDeprecated": false
}
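
The records from a run like this can be reshaped into one row per maintainer before loading them into a CRM. The sketch below uses only the standard library; the helper names `maintainer_rows` and `to_csv` and the column selection are assumptions for illustration.

```python
import csv
import io

COLUMNS = ["maintainer", "package", "repositoryUrl", "monthlyDownloads"]


def maintainer_rows(records: list[dict]) -> list[dict]:
    """Expand each package record into one row per maintainer username."""
    rows = []
    for rec in records:
        for username in rec.get("maintainers") or []:
            rows.append({
                "maintainer": username,
                "package": rec["name"],
                "repositoryUrl": rec.get("repositoryUrl") or "",
                "monthlyDownloads": rec.get("monthlyDownloads") or 0,
            })
    return rows


def to_csv(rows: list[dict]) -> str:
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

A maintainer who appears on several packages gets several rows, which makes it easy to group by `maintainer` afterwards.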

🧠 How it works

  1. Search: GET https://registry.npmjs.org/-/v1/search?text=<query>&size=250&from=N paginates 250 packages per page.
  2. Per-package full doc: GET https://registry.npmjs.org/<name> returns every published version + repository + maintainers + license + time map.
  3. Downloads point stats: GET https://api.npmjs.org/downloads/point/<range>/<name> for each of last-day, last-week, last-month, last-year.
  4. Downloads time series: GET https://api.npmjs.org/downloads/range/<range>/<name> returns one entry per day.
  5. Dedup: names are tracked across search terms and direct lists to avoid double-processing.
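
The requests in the steps above can be sketched as plain URL builders. This is not the actor's source: the helper names are hypothetical, and two details are assumptions, namely that the last search page requests only the remaining results and that scoped names are sent to the registry with the slash encoded as `%2F`.

```python
from urllib.parse import quote, urlencode

REGISTRY = "https://registry.npmjs.org"
STATS = "https://api.npmjs.org/downloads"
PAGE_SIZE = 250  # page size named in the search step


def search_urls(query: str, max_results: int) -> list[str]:
    """One search URL per page, stepping the `from` offset by PAGE_SIZE."""
    urls = []
    for offset in range(0, max_results, PAGE_SIZE):
        size = min(PAGE_SIZE, max_results - offset)
        params = urlencode({"text": query, "size": size, "from": offset})
        urls.append(f"{REGISTRY}/-/v1/search?{params}")
    return urls


def registry_doc_url(name: str) -> str:
    """Full-document URL; '@scope/name' becomes '@scope%2Fname'."""
    return f"{REGISTRY}/{quote(name, safe='@')}"


def point_url(name: str, period: str) -> str:
    """Single download count for a period such as 'last-week'."""
    return f"{STATS}/point/{period}/{name}"


def range_url(name: str, period: str) -> str:
    """Per-day download series for a period or a YYYY-MM-DD:YYYY-MM-DD range."""
    return f"{STATS}/range/{period}/{name}"


def dedupe(names: list[str]) -> list[str]:
    """Keep the first occurrence of each package name, preserving order."""
    seen = set()
    unique = []
    for name in names:
        if name not in seen:
            seen.add(name)
            unique.append(name)
    return unique
```

With `maxResultsPerSearch` set to 300, this yields two search requests: one with `size=250&from=0` and one with `size=50&from=250`.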

No authentication. NPM's registry + stats APIs are public by design.


🛑 Limits & notes

  • NPM stats API rate limit: ~5 req/sec per IP under normal conditions. The actor uses sequential per-package enrichment with backoff retries.
  • dependentsCount in the search snippet is approximate; for exact dependents enumeration use the https://www.npmjs.com/browse/depended/{pkg} page (not yet supported — open a feature request).
  • readme can be large (some packages ship 50+ KB markdown). Disable for large runs unless needed.
  • Scoped private packages are not returned. Public scopes (@types/*, @apify/*, etc.) are fully accessible.
  • Search relevance score is NPM's internal ranking — useful for ordering but not directly comparable across queries.
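
The "backoff retries" mentioned in the first note can be illustrated with a small exponential-backoff wrapper. The retry count, the base delay and the function name `with_backoff` are assumptions; the actor's real policy is not documented here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_backoff(call: Callable[[], T], max_attempts: int = 5,
                 base_delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`, retrying on any exception with exponentially growing delays.

    Delays are base_delay, 2*base_delay, 4*base_delay, ... and the last
    failure is re-raised once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the wrapper can be tested without waiting; in production it defaults to `time.sleep`.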

💰 Pricing

Monetized via pay-per-event on Apify — pay per package record saved. NPM registry + stats APIs are free.


❓ FAQ

Does this work for private NPM registries (GitHub Packages, JFrog)? Not by default. The actor targets the public registry.npmjs.org. Open an issue to add an auth_token + custom registry URL feature.

Can I get exact lists of packages that depend on package X? Not yet — NPM no longer exposes a stable "dependents" API. The search snippet returns an approximate count. For full reverse-dependency enumeration consider GitHub code search.

Can I export to CSV / Excel? Yes — every Apify dataset can be exported in CSV, Excel, XML, JSONL, or RSS from the run page or via the API.
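
For the API route, the Apify dataset items endpoint accepts a `format` query parameter. The sketch below only builds the URL; the helper name `dataset_export_url` is hypothetical, the dataset ID is a placeholder you take from your own run, and a private dataset additionally needs your API token.

```python
from typing import Optional
from urllib.parse import urlencode


def dataset_export_url(dataset_id: str, fmt: str = "csv",
                       fields: Optional[list[str]] = None) -> str:
    """Build a download URL for a run's dataset in the requested format."""
    params = {"format": fmt}
    if fields:
        # Restrict the export to selected output fields (comma-separated).
        params["fields"] = ",".join(fields)
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"
```

Passing `fmt="xlsx"` gives the Excel variant, and `fields=["name", "weeklyDownloads"]` keeps the export narrow for spreadsheet use.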

How is this different from npm-registry-fetch or pacote? Those are Node libraries for calling the same endpoints from your own code. This actor adds cross-source merging, pagination guardrails, normalized output, optional download history, deprecation/staleness filtering, and Apify-native scheduling, webhooks and dataset views.

Does it cover PyPI / RubyGems / Cargo / Maven? No — JavaScript NPM only. Each ecosystem can be added as a sibling actor.

Does it download tarballs? No — only metadata. tarballUrl is provided for downstream fetching if needed.


🔗 Related actors

  • logiover/github-repository-scraper: combine NPM repositoryUrl with GitHub for stars + contributors
  • logiover/website-contact-scraper: enrich homepage URLs with maintainer contact emails for outreach
  • logiover/sitemap-to-url-crawler: crawl each package's docs site for additional content
  • logiover/devto-articles-scraper: track how often each package is discussed on Dev.to

🆘 Support

Need PyPI, RubyGems, Cargo or Maven equivalent? Open an issue on the actor's Apify page.


Changelog

  • 2026-05-20 — Maintenance pass: reviewed the input schema and default values for a smooth one-click start, and rebuilt the Actor on the latest base image.

Last reviewed: 2026-05-20.