NPM Package Scraper

Scrape NPM package data. Extract name, version, downloads, dependencies, maintainers, and more. Export to JSON, CSV, Excel.

Pricing: Pay per usage

Developer: Glass Ventures (Maintained by Community)

Scrape detailed package data from the NPM registry. Extract name, version, downloads, dependencies, maintainers, license, and more from any NPM package.

What does NPM Package Scraper do?

NPM Package Scraper extracts comprehensive metadata from the NPM registry for any package. It uses the public NPM registry API to gather package details including version history, download statistics, dependency counts, maintainer information, and more.

This actor is ideal for developers, researchers, and analysts who need bulk NPM package data without writing custom API integration code. It handles pagination, rate limiting, and data normalization automatically.

Whether you need to audit dependencies across your organization, research competing packages, or build a dataset of packages in a specific domain, this actor delivers clean, structured data ready for analysis.
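
To make the registry lookup described above more concrete, here is a minimal sketch of how package metadata could be addressed and reduced to a few of the output fields. The two base URLs are the public npm registry and downloads endpoints; whether the actor calls exactly these, and the helper names `registry_url`, `downloads_url`, and `summarize`, are assumptions made for illustration.

```python
from urllib.parse import quote

REGISTRY_BASE = "https://registry.npmjs.org"
DOWNLOADS_BASE = "https://api.npmjs.org/downloads/point"


def registry_url(package_name: str) -> str:
    """Metadata URL for a package; the slash in a scoped name is percent-encoded."""
    return f"{REGISTRY_BASE}/{quote(package_name, safe='@')}"


def downloads_url(package_name: str, period: str = "last-month") -> str:
    """Download-count URL for a period such as last-week or last-month."""
    return f"{DOWNLOADS_BASE}/{period}/{package_name}"


def summarize(packument: dict) -> dict:
    """Reduce a registry metadata document to a few fields (hypothetical helper)."""
    latest = packument.get("dist-tags", {}).get("latest")
    manifest = packument.get("versions", {}).get(latest, {})
    return {
        "name": packument.get("name"),
        "version": latest,
        "license": manifest.get("license"),
        "dependenciesCount": len(manifest.get("dependencies", {})),
        "maintainers": [m.get("name") for m in packument.get("maintainers", [])],
        "lastPublished": packument.get("time", {}).get(latest),
    }
```

The sketch only builds URLs and reshapes an already-downloaded document, so it makes no network calls; fetching, retries, and rate limiting are left out.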

Use Cases

  • DevOps engineers -- audit dependencies across projects, track package versions and licenses
  • Market researchers -- analyze NPM ecosystem trends, compare package popularity and adoption
  • Security analysts -- monitor packages for maintainer changes, dependency counts, and update frequency
  • Developers -- discover and compare packages by search terms, evaluate alternatives

Features

  • Scrape packages by direct URL, package name, or search query
  • Extract download statistics, dependency counts, and maintainer info
  • Optional README content extraction
  • Proxy support with automatic rotation
  • Handles pagination for search results automatically
  • Export to JSON, CSV, or Excel, or connect via the API

How much will it cost?

The NPM registry API is fully public and lightweight. This actor uses minimal compute resources.

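| Results | Estimated Cost |
|---------|----------------|
| 100     | ~$0.01         |
| 1,000   | ~$0.05         |
| 10,000  | ~$0.40         |

| Cost Component   | Per 1,000 Results |
|------------------|-------------------|
| Platform compute | ~$0.05            |
| Proxy (optional) | ~$0.00            |
| Total            | ~$0.05            |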
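
For run sizes between the listed rows, a rough figure can be interpolated from the estimates above. This is only a sketch: the three data points come from the table, while the piecewise-linear interpolation and the function name `estimate_cost` are assumptions, not a published pricing formula.

```python
from bisect import bisect_left

# (results, estimated cost in USD), copied from the estimates above
COST_POINTS = [(100, 0.01), (1_000, 0.05), (10_000, 0.40)]


def estimate_cost(results: int) -> float:
    """Interpolate linearly between the listed points; scale proportionally outside them."""
    if results <= 0:
        return 0.0
    first_x, first_y = COST_POINTS[0]
    last_x, last_y = COST_POINTS[-1]
    if results <= first_x:
        return round(first_y * results / first_x, 4)
    if results >= last_x:
        return round(last_y * results / last_x, 4)
    xs = [x for x, _ in COST_POINTS]
    i = bisect_left(xs, results)
    (x0, y0), (x1, y1) = COST_POINTS[i - 1], COST_POINTS[i]
    return round(y0 + (y1 - y0) * (results - x0) / (x1 - x0), 4)
```

Actual charges depend on the platform's usage metering, so treat the result as an order-of-magnitude guide.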

How to use

  1. Go to the NPM Package Scraper page on Apify Store
  2. Click "Start" or "Try for free"
  3. Enter package names, NPM URLs, or search terms
  4. Set the maximum number of items
  5. Click "Start" and wait for the results

Input parameters

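| Parameter     | Type    | Description                        | Default     |
|---------------|---------|------------------------------------|-------------|
| startUrls     | array   | NPM package page URLs to scrape    | -           |
| packageNames  | array   | Package names to look up directly  | -           |
| searchTerms   | array   | Search queries to find packages    | -           |
| includeReadme | boolean | Include full README in output      | false       |
| maxItems      | number  | Max results to return              | 100         |
| proxyConfig   | object  | Proxy settings                     | Apify Proxy |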
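
As an illustration of how these parameters fit together, the sketch below assembles a run input and rejects obviously empty or invalid combinations before a run is started. The parameter names and defaults come from the list above; the helper `build_input` and its validation rules (at least one source, positive `maxItems`) are assumptions, since this page does not say how the actor itself validates input.

```python
def build_input(package_names=None, search_terms=None, start_urls=None,
                include_readme=False, max_items=100):
    """Assemble an input dict for the actor (hypothetical helper)."""
    sources = {
        "packageNames": list(package_names or []),
        "searchTerms": list(search_terms or []),
        # The element shape of startUrls (plain string vs. object) is not
        # specified on this page, so entries are passed through unchanged.
        "startUrls": list(start_urls or []),
    }
    if not any(sources.values()):
        raise ValueError("Provide at least one package name, search term, or start URL")
    if max_items < 1:
        raise ValueError("maxItems must be a positive integer")

    # Leave out empty source lists so the input stays minimal.
    run_input = {key: value for key, value in sources.items() if value}
    run_input["includeReadme"] = bool(include_readme)
    run_input["maxItems"] = int(max_items)
    return run_input
```

The returned dict can be passed as `run_input` to the Python client call shown under API Example (Python).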

Output

The actor produces a dataset with the following fields:

{
  "url": "https://www.npmjs.com/package/express",
  "name": "express",
  "version": "4.18.2",
  "description": "Fast, unopinionated, minimalist web framework for node.",
  "author": "TJ Holowaychuk",
  "license": "MIT",
  "homepage": "http://expressjs.com/",
  "repository": "https://github.com/expressjs/express",
  "keywords": ["express", "framework", "sinatra", "web", "http"],
  "dependenciesCount": 31,
  "weeklyDownloads": 30000000,
  "lastPublished": "2023-10-11T17:00:00.000Z",
  "maintainers": ["dougwilson", "linusu", "wesleytodd"],
  "readme": null,
  "scrapedAt": "2024-01-15T10:30:00.000Z"
}

| Field             | Type    | Description                                  |
|-------------------|---------|----------------------------------------------|
| url               | string  | NPM package page URL                         |
| name              | string  | Package name                                 |
| version           | string  | Latest version                               |
| description       | string  | Package description                          |
| author            | string  | Package author                               |
| license           | string  | License type                                 |
| homepage          | string  | Project homepage URL                         |
| repository        | string  | Source repository URL                        |
| keywords          | array   | Package keywords                             |
| dependenciesCount | integer | Number of runtime dependencies               |
| weeklyDownloads   | integer | Downloads in the last month (weekly average) |
| lastPublished     | string  | ISO 8601 date of last publish                |
| maintainers       | array   | List of maintainer usernames                 |
| readme            | string  | Full README content (if enabled)             |
| scrapedAt         | string  | ISO 8601 scrape timestamp                    |
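
Once items are downloaded, the fields above lend themselves to simple audits. The following sketch counts licenses and lists packages whose last publish predates a cutoff; the field names match the output described here, while the helper names and the "UNKNOWN" bucket for missing licenses are illustrative assumptions.

```python
from collections import Counter
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2023-10-11T17:00:00.000Z."""
    # Older Python versions reject a trailing "Z", so spell out the UTC offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def license_counts(items: list) -> Counter:
    """Count packages per license; missing or null licenses go to UNKNOWN."""
    return Counter(item.get("license") or "UNKNOWN" for item in items)


def published_before(items: list, cutoff: str) -> list:
    """Names of packages whose lastPublished is earlier than the cutoff."""
    cutoff_dt = parse_timestamp(cutoff)
    return [
        item["name"]
        for item in items
        if item.get("lastPublished")
        and parse_timestamp(item["lastPublished"]) < cutoff_dt
    ]
```

Here `items` is the list returned by the dataset call in the API examples; items without a `lastPublished` value are skipped rather than treated as stale.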

Integrations

Connect NPM Package Scraper with other tools:

  • Apify API -- REST API for programmatic access
  • Webhooks -- get notified when a run finishes
  • Zapier / Make -- connect to 5,000+ apps
  • Google Sheets -- export directly to spreadsheets

API Example (Node.js)

import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token
const client = new ApifyClient({ token: 'YOUR_TOKEN' });
// Start the actor and wait for the run to finish
const run = await client.actor('YOUR_USERNAME/npm-package-scraper').call({
    packageNames: ['express', 'react', 'lodash'],
    maxItems: 100,
});
// Read the scraped packages from the run's default dataset
const { items } = await client.dataset(run.defaultDatasetId).listItems();

API Example (Python)

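from apify_client import ApifyClient

# Authenticate with your Apify API token
client = ApifyClient('YOUR_TOKEN')
# Start the actor and wait for the run to finish
run = client.actor('YOUR_USERNAME/npm-package-scraper').call(run_input={
    'packageNames': ['express', 'react', 'lodash'],
    'maxItems': 100,
})
# Read the scraped packages from the run's default dataset
items = client.dataset(run['defaultDatasetId']).list_items().items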

API Example (cURL)

curl "https://api.apify.com/v2/acts/YOUR_USERNAME~npm-package-scraper/runs" \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{"packageNames": ["express", "react", "lodash"], "maxItems": 100}'

Tips and tricks

  • Start with a small maxItems (10-20) to test before running large scrapes
  • Use packageNames for known packages -- it is faster than search
  • Search terms return up to 250 results per query from the NPM API
  • The NPM registry is public and rate-limits are generous, so proxies are optional

FAQ

Q: Does this actor require login credentials?
A: No. The NPM registry API is fully public and requires no authentication.

Q: How fast is the scraping?
A: Approximately 100-300 packages per minute depending on concurrency settings.

Q: What should I do if I get rate limited?
A: Enable proxy configuration in the settings. NPM rarely rate-limits, but proxies help for very large scrapes.

Q: Can I get download counts?
A: Yes. The actor fetches monthly download counts from the NPM downloads API and includes them as weeklyDownloads.
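
The throughput quoted above (roughly 100-300 packages per minute) translates into a run-time range with simple arithmetic. The sketch below assumes that range holds for the whole run; the function name `estimate_minutes` is hypothetical.

```python
import math


def estimate_minutes(package_count: int, slow_rate: int = 100, fast_rate: int = 300) -> tuple:
    """Return (best_case, worst_case) run time in whole minutes."""
    if package_count <= 0:
        return (0, 0)
    best = math.ceil(package_count / fast_rate)
    worst = math.ceil(package_count / slow_rate)
    return (best, worst)
```

For example, 10,000 packages would take somewhere between 34 and 100 minutes under these assumptions; actor start-up time and any rate limiting are not modeled.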

The NPM registry API is a public API designed for programmatic access. This actor only accesses publicly available package metadata through the official API endpoints. Always review and respect NPM's Terms of Service. For more information, see Apify's blog on web scraping legality.

Limitations

  • Download counts are monthly totals from the NPM downloads API (not real-time)
  • Search results are limited to 250 per search term (NPM API limit)
  • Private packages (including private scoped packages) that require authentication are not supported
  • README content can be large and significantly increases output size
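
Because each search term is capped at 250 results, one possible workaround is to run several narrower terms and merge the results afterwards. The same package can then show up under more than one term, so the sketch below keeps the first occurrence of each package name. Whether the actor already deduplicates across terms is not stated here, and `merge_unique` is a hypothetical helper.

```python
def merge_unique(*result_sets):
    """Merge lists of output items, keeping the first item seen per package name."""
    seen = {}
    for items in result_sets:
        for item in items:
            name = item.get("name")
            # Skip items without a name and names that were already collected.
            if name and name not in seen:
                seen[name] = item
    return list(seen.values())
```

Usage: pass the item lists from each run or search term, for example `merge_unique(items_for_term_a, items_for_term_b)`; the result preserves first-seen order.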

Changelog

  • v0.1 (2026-04-23) -- Initial release