npm Registry Scraper - Search & Download Stats

Pricing

from $19.00 / 1,000 results

Search and scrape npm package data including versions, descriptions, authors, licenses, keywords, and weekly/total download counts from the public npm registry API.

Developer

ParseForge

Maintained by Community

📦 npm Registry Scraper

🚀 Export npm package data in seconds. Search 3,000,000+ packages by keyword, sort by popularity, quality, or maintenance, and get full metadata including versions, licenses, authors, keywords, and weekly/total download counts - no API key required.

🕒 Last updated: 2026-05-21 · 📊 14 fields per record · 📦 3M+ packages · 🌍 Public registry · 📈 Download stats included

The npm Registry Scraper exports package metadata directly from the public npm registry API and returns 14 fields per record, including package name, version, description, author, license, homepage, repository, keywords, weekly downloads, total downloads, last published date, and the direct npm URL. The underlying data comes straight from the official npm registry maintained by GitHub and is updated in real time.

The registry covers every published npm package, from flagship frameworks like React, Vue, and Express to niche utilities. This Actor delivers the complete metadata profile plus download statistics with a single run. No auth, no scraping workarounds - the registry API is fully public.

| 🎯 Target Audience | 💡 Primary Use Cases |
| --- | --- |
| Developers, data analysts, open-source researchers, security teams, marketing analysts, DevRel professionals | Package discovery, dependency audits, popularity benchmarking, license compliance, competitive research, ecosystem mapping |

📋 What the npm Registry Scraper does

Four data-collection workflows in a single run:

  • 🔍 Keyword search. Find packages matching any search term - framework names, function types, author names, or topic keywords.
  • 📈 Popularity ranking. Sort results by download count to surface the most widely used packages first.
  • 🏆 Quality ranking. Rank by code quality signals including test coverage, linting, and documentation completeness.
  • 🔧 Maintenance ranking. Surface actively maintained packages based on update frequency and issue responsiveness.

Each record includes the package name, latest version, description, publisher username, license type, homepage and repository links, keyword tags, weekly downloads, all-time total downloads, the last published timestamp, and the direct npmjs.com URL.
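
The keyword search and ranking modes above map naturally onto the public registry search endpoint. The following is a minimal sketch, assuming the `/-/v1/search` endpoint on registry.npmjs.org and its `text`, `size`, `from`, `quality`, `popularity`, and `maintenance` query parameters. The weight presets and the `build_search_url` helper are illustrative assumptions, not the Actor's actual implementation.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://registry.npmjs.org/-/v1/search"

# Hypothetical presets: each mode pushes one scoring weight to 1.0.
# "optimal" sends no weights, so the registry applies its own default blend.
SORT_WEIGHTS = {
    "optimal": {},
    "popularity": {"popularity": 1.0, "quality": 0.0, "maintenance": 0.0},
    "quality": {"popularity": 0.0, "quality": 1.0, "maintenance": 0.0},
    "maintenance": {"popularity": 0.0, "quality": 0.0, "maintenance": 1.0},
}


def build_search_url(search: str, sort_by: str = "popularity",
                     size: int = 20, offset: int = 0) -> str:
    """Build a registry search URL for one page of results."""
    if sort_by not in SORT_WEIGHTS:
        raise ValueError(f"unknown sort mode: {sort_by!r}")
    params = {"text": search, "size": size, "from": offset}
    params.update(SORT_WEIGHTS[sort_by])
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


print(build_search_url("react", "popularity", size=5))
```

The helper only builds the URL, so it can be inspected or logged before any request is sent.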

💡 Why it matters: the npm registry hosts over 3,000,000 packages and grows by hundreds every day. Manual browsing only surfaces trending content. This Actor lets you query any keyword, retrieve the full metadata profile for matching packages, and export the results to CSV, Excel, JSON, or XML in seconds - ideal for building package catalogs, auditing dependencies, or tracking ecosystem growth over time.


🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded package dataset.


⚙️ Input

| Input | Type | Default | Behavior |
| --- | --- | --- | --- |
| search | string | "react" | Search term to find npm packages (e.g. "typescript", "express", "date-fns"). |
| sortBy | string | "popularity" | Ranking method: optimal, quality, popularity, or maintenance. |
| maxItems | integer | 10 | Number of packages to return. Free plan caps at 10, paid plan at 1,000,000. |

Example: top 50 React-related packages by popularity.

{
  "maxItems": 50,
  "search": "react",
  "sortBy": "popularity"
}

Example: best-maintained TypeScript utility libraries.

{
  "maxItems": 100,
  "search": "typescript utility",
  "sortBy": "maintenance"
}
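
When the input is generated by a script rather than typed into the form, it can help to validate it first. The sketch below builds the same JSON shape as the examples above and checks it against the documented sort modes and the 1,000,000-item cap. The `build_actor_input` helper is a hypothetical name used for illustration.

```python
import json

# Sort modes and upper limit as listed in the input table.
ALLOWED_SORT_MODES = ("optimal", "quality", "popularity", "maintenance")
MAX_ITEMS_LIMIT = 1_000_000


def build_actor_input(search: str, sort_by: str = "popularity",
                      max_items: int = 10) -> dict:
    """Return a validated input object for the Actor."""
    if not search.strip():
        raise ValueError("search must not be empty")
    if sort_by not in ALLOWED_SORT_MODES:
        raise ValueError(f"sortBy must be one of {ALLOWED_SORT_MODES}")
    if not 1 <= max_items <= MAX_ITEMS_LIMIT:
        raise ValueError(f"maxItems must be between 1 and {MAX_ITEMS_LIMIT}")
    return {"search": search.strip(), "sortBy": sort_by, "maxItems": max_items}


print(json.dumps(build_actor_input("react", "popularity", 50), indent=2))
```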

⚠️ Good to Know: the npm search API returns results based on the official registry index. Very new packages (published in the last few minutes) may not appear immediately. Download stats come from the npm downloads API and reflect real install counts including CI and automated installs - not just human-initiated downloads.
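
For readers who want to spot-check a download figure, here is a minimal sketch against the public npm downloads API. It assumes the `/downloads/point/{period}/{package}` endpoint on api.npmjs.org and a JSON response with a `downloads` field. The helper names are hypothetical, and this is not necessarily how the Actor queries the API.

```python
import json
from urllib.request import urlopen

DOWNLOADS_ENDPOINT = "https://api.npmjs.org/downloads/point"


def downloads_url(package: str, period: str = "last-week") -> str:
    """Build the URL for a package's download count over a period.

    `period` can be a keyword such as "last-week" or an explicit
    "YYYY-MM-DD:YYYY-MM-DD" range.
    """
    return f"{DOWNLOADS_ENDPOINT}/{period}/{package}"


def parse_download_count(payload: dict) -> int:
    """Read the count from a response body, treating a missing value as 0."""
    return int(payload.get("downloads") or 0)


def fetch_download_count(package: str, period: str = "last-week",
                         timeout: float = 10.0) -> int:
    """Fetch and parse the count (performs a network request)."""
    with urlopen(downloads_url(package, period), timeout=timeout) as response:
        return parse_download_count(json.load(response))


print(downloads_url("react"))
print(downloads_url("react", "2010-01-01:2099-12-31"))
```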


📊 Output

Each package record contains 14 fields. Download the dataset as CSV, Excel, JSON, or XML.

🧾 Schema

| Field | Type | Example |
| --- | --- | --- |
| 📦 name | string | "react" |
| 🏷 version | string | "19.2.6" |
| 📝 description | string | "React is a JavaScript library for building user interfaces." |
| 👤 author | string | "react-bot" |
| 📜 license | string | "MIT" |
| 🌐 homepage | string | "https://react.dev/" |
| 🗄 repository | string | "git+https://github.com/facebook/react.git" |
| 🏷 keywords | array | ["react"] |
| 📊 weeklyDownloads | integer | 134135283 |
| 📈 totalDownloads | integer | 4230521121 |
| 📅 lastPublished | string (ISO 8601) | "2026-05-06T16:16:47.653Z" |
| 🔗 url | string | "https://www.npmjs.com/package/react" |
| 🕒 scrapedAt | string (ISO 8601) | "2026-05-21T22:56:54.151Z" |
| error | string \| null | null |
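
For downstream code, the 14 fields can be mirrored in a typed structure. This is a sketch only: the `PackageRecord` class and `parse_record` helper are hypothetical names, and treating every field except `name` as optional is an assumption made for robustness, not something the schema states.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class PackageRecord:
    """One dataset item, using the field names from the schema table."""
    name: str
    version: Optional[str] = None
    description: Optional[str] = None
    author: Optional[str] = None
    license: Optional[str] = None
    homepage: Optional[str] = None
    repository: Optional[str] = None
    keywords: Optional[List[str]] = None
    weeklyDownloads: int = 0
    totalDownloads: int = 0
    lastPublished: Optional[str] = None
    url: Optional[str] = None
    scrapedAt: Optional[str] = None
    error: Optional[str] = None


def parse_record(item: dict) -> PackageRecord:
    """Build a record from a raw item, ignoring keys outside the schema."""
    known = {f.name for f in fields(PackageRecord)}
    return PackageRecord(**{k: v for k, v in item.items() if k in known})


record = parse_record({"name": "react", "version": "19.2.6", "license": "MIT",
                       "keywords": ["react"], "weeklyDownloads": 134135283})
print(record.name, record.version, record.weeklyDownloads)
```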

📋 Sample Records

Record 1 - react

{
  "name": "react",
  "version": "19.2.6",
  "description": "React is a JavaScript library for building user interfaces.",
  "author": "react-bot",
  "license": "MIT",
  "homepage": "https://react.dev/",
  "repository": "git+https://github.com/facebook/react.git",
  "keywords": ["react"],
  "weeklyDownloads": 134135283,
  "totalDownloads": 4230521121,
  "lastPublished": "2026-05-06T16:16:47.653Z",
  "url": "https://www.npmjs.com/package/react",
  "scrapedAt": "2026-05-21T22:56:54.151Z"
}

Record 2 - react-router

{
  "name": "react-router",
  "version": "7.15.1",
  "description": "Declarative routing for React",
  "author": "GitHub Actions",
  "license": "MIT",
  "homepage": "https://github.com/remix-run/react-router#readme",
  "repository": "git+https://github.com/remix-run/react-router.git",
  "keywords": ["react", "router", "route", "routing", "history", "link"],
  "weeklyDownloads": 50265371,
  "totalDownloads": 1661085982,
  "lastPublished": "2026-05-14T14:40:53.242Z",
  "url": "https://www.npmjs.com/package/react-router",
  "scrapedAt": "2026-05-21T22:56:54.151Z"
}

✨ Why choose this Actor

| Feature | Benefit |
| --- | --- |
| 📡 Direct registry API | Data straight from npm - no intermediaries, no stale cache |
| 📊 Download statistics | Weekly and all-time totals in every record |
| 🔀 Multiple sort modes | popularity, quality, maintenance, or optimal ranking |
| 🏷 License field | Every record includes the SPDX license identifier |
| 📦 Keywords array | Full tag list for downstream filtering and categorization |
| ⚡ Fast execution | Typical 100-item run completes in under 10 seconds |
| 🌐 No auth required | Zero credentials - the npm registry is fully public |
| 💾 Multi-format export | CSV, Excel, JSON, XML out of the box |

📈 How it compares to alternatives

Methodnpm Registry ScraperManual browsingnpm CLI (npm search)Custom script
Export to CSV/ExcelRequires work
Download statistics❌ partialMultiple API calls
Pagination support✅ up to 1M❌ limitedRequires work
Runs in the cloudN/AN/A
Multiple sort modes❌ limitedRequires work
No setup requiredRequires Node.jsRequires Node.js

🚀 How to use

  1. Create a free Apify account with $5 in credit (no credit card required for the trial).
  2. Open the npm Registry Scraper on the Apify Store.
  3. Enter your search term and select a sort mode.
  4. Set maxItems to the number of packages you need.
  5. Click Run - results appear in the Dataset tab within seconds.
  6. Click Export to download as CSV, Excel, JSON, or XML.

💼 Business use cases

🔐 Security and license compliance

Audit dependencies across a stack. Search for packages used in your projects and extract license fields in bulk. Flag packages with non-permissive licenses (GPL, AGPL) before they reach production. Combine with total download counts to prioritize which packages to review first.
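
The license audit described above can be sketched in a few lines over an exported dataset. The prefix list and the `flag_for_review` helper are hypothetical, and the package names in the usage example are made up for illustration. Compound expressions such as "(MIT OR GPL-3.0)" are not handled by this simple prefix check.

```python
from typing import Iterable, List, Tuple

# Illustrative prefixes only, taken from the licenses named in the text.
NON_PERMISSIVE_PREFIXES: Tuple[str, ...] = ("GPL", "AGPL")


def flag_for_review(records: Iterable[dict],
                    prefixes: Tuple[str, ...] = NON_PERMISSIVE_PREFIXES) -> List[dict]:
    """Return records whose license starts with a flagged prefix,
    ordered by totalDownloads so the most-used packages come first."""
    flagged = [
        r for r in records
        if (r.get("license") or "").upper().startswith(prefixes)
    ]
    return sorted(flagged, key=lambda r: r.get("totalDownloads") or 0, reverse=True)


# Hypothetical records shaped like the Actor's output.
sample = [
    {"name": "example-mit-lib", "license": "MIT", "totalDownloads": 900},
    {"name": "example-gpl-tool", "license": "GPL-3.0", "totalDownloads": 100},
    {"name": "example-agpl-server", "license": "AGPL-3.0-only", "totalDownloads": 500},
]
for r in flag_for_review(sample):
    print(r["name"], r["license"], r["totalDownloads"])
```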

📊 Competitive and market intelligence

Track the most popular packages in a given category (e.g. "date manipulation", "HTTP client", "state management"). Monitor weekly download trends to identify rising challengers or declining incumbents. Export to Excel and build charts showing ecosystem momentum.

🛠 Developer tooling and cataloging

Build internal package catalogs for engineering teams. Index packages by keyword, license, and author to power internal search tools. Use the repository field to cross-reference with GitHub metrics for combined scoring.
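
One way to power such an internal search tool is an inverted index from keyword to package names. The sketch below is an illustration under that assumption; `build_keyword_index` is a hypothetical helper, and the input uses the two sample records shown earlier.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def build_keyword_index(records: Iterable[dict]) -> Dict[str, List[str]]:
    """Map each lowercased keyword to the names of packages that use it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for record in records:
        # keywords may be null for packages that declare none
        for keyword in record.get("keywords") or []:
            index[keyword.lower()].append(record["name"])
    return dict(index)


records = [
    {"name": "react", "keywords": ["react"]},
    {"name": "react-router",
     "keywords": ["react", "router", "route", "routing", "history", "link"]},
]
index = build_keyword_index(records)
print(index["react"])
print(index["router"])
```

The same pattern works for `license` or `author` by swapping the field that supplies the index key.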

🎓 Research and education

Map the growth of JavaScript ecosystems over time. Identify the most downloaded packages per category for curriculum design. Analyze keyword co-occurrence patterns across package descriptions to understand how developers categorize their work.


🔌 Automating npm Registry Scraper

Connect this Actor to your existing tools with no code required:

  • Make (Integromat): Trigger a run on a schedule, then push results to Google Sheets, Airtable, or a database.
  • Zapier: Start a run when a GitHub issue is created (e.g. "evaluate package X"), then post results to Slack.
  • Slack: Receive a formatted weekly digest of the most downloaded packages in your tech stack.
  • Google Sheets: Append new packages to a tracking spreadsheet automatically each week.
  • Webhooks: Fire a POST to your own endpoint when a run completes, with the dataset URL in the payload.
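
On the receiving side of a webhook, the handler needs to locate the dataset. The sketch below assumes Apify's default webhook payload shape, with an `eventType` field and a `resource` object carrying `defaultDatasetId`. A custom payload template would need different parsing. The helper name and the placeholder ID are hypothetical.

```python
from typing import Optional

DATASET_ITEMS_TEMPLATE = "https://api.apify.com/v2/datasets/{dataset_id}/items?format=json"


def dataset_url_from_webhook(payload: dict) -> Optional[str]:
    """Return the dataset items URL for a succeeded run, else None."""
    # Assumption: only successful runs should trigger downstream processing.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    dataset_id = (payload.get("resource") or {}).get("defaultDatasetId")
    if not dataset_id:
        return None
    return DATASET_ITEMS_TEMPLATE.format(dataset_id=dataset_id)


# Placeholder payload for illustration, not a real run.
example_payload = {
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"defaultDatasetId": "DATASET_ID_PLACEHOLDER"},
}
print(dataset_url_from_webhook(example_payload))
```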

🌟 Beyond business use cases

🔬 Open-source research

Measure the growth rate of package categories over time. Which testing frameworks are gaining ground? Which bundlers are losing traction? Weekly download deltas tell the story quantitatively.
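
Weekly download deltas can be computed by comparing two exports of the same search taken at different times. This is a minimal sketch under that assumption. The `weekly_download_deltas` helper, the package names, and the numbers in the usage example are all made up for illustration.

```python
from typing import Iterable, List


def weekly_download_deltas(previous: Iterable[dict],
                           current: Iterable[dict]) -> List[dict]:
    """Compare weeklyDownloads between two snapshots, largest gain first.

    Packages missing from the earlier snapshot are skipped because
    there is no baseline to compare against.
    """
    baseline = {r["name"]: r.get("weeklyDownloads") or 0 for r in previous}
    rows = []
    for record in current:
        name = record["name"]
        if name not in baseline:
            continue
        before = baseline[name]
        after = record.get("weeklyDownloads") or 0
        change = after - before
        # Percentage change is undefined when the baseline is zero.
        pct = (change / before * 100) if before else None
        rows.append({"name": name, "change": change, "pctChange": pct})
    return sorted(rows, key=lambda row: row["change"], reverse=True)


# Hypothetical snapshots from two runs a week apart.
last_week = [{"name": "runner-a", "weeklyDownloads": 100},
             {"name": "runner-b", "weeklyDownloads": 400}]
this_week = [{"name": "runner-a", "weeklyDownloads": 150},
             {"name": "runner-b", "weeklyDownloads": 300}]
for row in weekly_download_deltas(last_week, this_week):
    print(row)
```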

🗺 Ecosystem mapping

Cross-reference npm keywords with GitHub topics and Stack Overflow tags to build a unified map of JavaScript sub-ecosystems. Identify gaps where popular problem domains lack well-maintained open-source solutions.

🎨 Creative exploration

Generate word clouds from keyword arrays across thousands of packages. Visualize the language developers use to describe their work - from "utility" and "parser" to "animation" and "cli".
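
A word cloud needs term frequencies as input. The sketch below counts keywords across records, using the two sample records shown earlier. The `keyword_frequencies` helper is a hypothetical name, and rendering the cloud itself is left to whichever visualization tool you prefer.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def keyword_frequencies(records: Iterable[dict],
                        top_n: int = 20) -> List[Tuple[str, int]]:
    """Count normalized keywords across all records, most frequent first."""
    counts = Counter(
        keyword.strip().lower()
        for record in records
        for keyword in (record.get("keywords") or [])  # keywords may be null
        if keyword.strip()
    )
    return counts.most_common(top_n)


records = [
    {"name": "react", "keywords": ["react"]},
    {"name": "react-router",
     "keywords": ["react", "router", "route", "routing", "history", "link"]},
]
print(keyword_frequencies(records, top_n=3))
```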

🤝 Non-profit and education

Build curated package lists for coding bootcamps or open curricula. Identify accessible, well-documented libraries appropriate for learners at different skill levels, using quality scores and weekly downloads as proxies.


🤖 Ask an AI assistant about this scraper

Not sure how to interpret your results? Paste a few sample records into ChatGPT or Claude and ask:

  • "Which of these packages would best fit a Next.js project?"
  • "Flag any packages with non-permissive licenses."
  • "Which packages have declining downloads and may be deprecated?"
  • "Group these packages by their primary use case."

The 14-field output gives AI assistants enough context to reason about packages, licenses, and ecosystem positioning without any additional lookup.


❓ Frequently Asked Questions

📦 Where does the data come from? Directly from the official npm registry API at registry.npmjs.org and download stats from api.npmjs.org. No scraping, no third-party intermediaries.

🔑 Do I need an API key? No. The npm registry is fully public and requires no authentication.

📊 What does "weeklyDownloads" count? The number of times the package tarball was downloaded from the npm CDN in the last 7 days. This includes CI/CD pipelines, automated installs, and human-initiated npm install commands.

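📈 What does "totalDownloads" count? The cumulative download count that the npm downloads API returns for the range 2010-01-01 to 2099-12-31, which is meant to cover every recorded download through the current date. The npm download-counts documentation notes that recorded data begins in January 2015 and that the span a single query can cover is limited, so treat this value as the API-reported total for that request rather than a guaranteed lifetime figure.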

🔀 What is the difference between sort modes? popularity ranks by raw download volume. quality uses signals like test coverage and documentation. maintenance considers commit frequency and issue close rate. optimal is npm's blended default that combines all three.

📦 How many packages can I get per search? Up to 1,000,000 on a paid plan. The scraper paginates through the registry API automatically. Free plan is capped at 10 packages per run.
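
Automatic pagination of this kind usually comes down to walking offset and size windows until the requested count is reached. The sketch below illustrates the idea, assuming a page size of 250 (the maximum `size` the registry search endpoint is understood to accept). The `page_windows` helper is hypothetical and not the Actor's actual code.

```python
from typing import Iterator, Tuple


def page_windows(max_items: int, page_size: int = 250) -> Iterator[Tuple[int, int]]:
    """Yield (offset, size) pairs that together cover max_items results."""
    offset = 0
    while offset < max_items:
        # The last page may be smaller than page_size.
        size = min(page_size, max_items - offset)
        yield offset, size
        offset += size


print(list(page_windows(600)))
```

A real loop would also stop early when the registry returns fewer results than requested for a page.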

🔍 What happens if my search term returns no results? The Actor logs a warning and exits cleanly with zero records. No error is pushed to the dataset.

🏷 What is in the keywords field? An array of tags the package author added in package.json. For packages that declare no keywords, the field is null.

🕒 How fresh is the data? The npm registry API is updated in real time. Every run fetches live data - there is no caching on our end.

💰 Is there a free tier? Yes. Free users get 10 packages per run. Upgrade to a paid plan to unlock up to 1,000,000 packages per run.


🔌 Integrate with any app

This Actor connects natively to every tool in the Apify ecosystem:

| Integration | How |
| --- | --- |
| Google Sheets | Use the Apify Google Sheets integration to write rows automatically |
| Airtable | Push JSON records to a base via the Apify Airtable integration |
| Make (Integromat) | Trigger runs and route results with the Apify Make module |
| Zapier | Start runs and parse datasets with the Apify Zapier app |
| Slack | Post dataset summaries to channels via webhook |
| REST API | GET /v2/datasets/{id}/items for direct programmatic access |
| Python / Node.js | Use the official Apify client SDK to run and read results |
| Webhooks | Configure a POST callback on run completion |
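
The REST API row can be exercised with the standard library alone. The sketch below assumes the `GET /v2/datasets/{id}/items` endpoint on api.apify.com with its `format`, `limit`, and `token` query parameters. The helper names are hypothetical, and the dataset ID in the usage line is a placeholder.

```python
import json
from typing import List, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, fmt: str = "json",
                      limit: Optional[int] = None,
                      token: Optional[str] = None) -> str:
    """Build the items URL for a dataset, adding optional parameters."""
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit
    if token:
        params["token"] = token  # only needed for non-public datasets
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


def fetch_items(dataset_id: str, limit: Optional[int] = None,
                token: Optional[str] = None) -> List[dict]:
    """Download items as JSON (performs a network request)."""
    url = dataset_items_url(dataset_id, "json", limit, token)
    with urlopen(url, timeout=30) as response:
        return json.load(response)


print(dataset_items_url("DATASET_ID_PLACEHOLDER", limit=50))
```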

| Actor | Description |
| --- | --- |
| parseforge/producthunt-scraper | Scrape Product Hunt launches, upvotes, makers, and comments |
| parseforge/upwork-scraper | Extract Upwork job listings with skills, budget, and client data |
| parseforge/remoteok-scraper | Collect remote job postings with tech stack and salary ranges |
| parseforge/openfda-scraper | Access FDA drug, device, and food safety event data |
| parseforge/wipo-brand-database-scraper | Search international trademark registrations from WIPO |

💡 Pro Tip: browse the complete ParseForge collection for 50+ production-ready data scrapers covering developer tools, job boards, public registries, and more.


This Actor uses the public npm registry API. All data is sourced from registry.npmjs.org and api.npmjs.org. ParseForge is not affiliated with npm, Inc. or GitHub. Use responsibly and in accordance with npm's terms of service.