NPM Package Scraper
Pricing
Pay per usage
Developer
Glass Ventures
Scrape detailed package data from the NPM registry. Extract name, version, downloads, dependencies, maintainers, license, and more from any NPM package.
What does NPM Package Scraper do?
NPM Package Scraper extracts comprehensive metadata from the NPM registry for any package. It uses the public NPM registry API to gather package details including version history, download statistics, dependency counts, maintainer information, and more.
This actor is ideal for developers, researchers, and analysts who need bulk NPM package data without writing custom API integration code. It handles pagination, rate limiting, and data normalization automatically.
Whether you need to audit dependencies across your organization, research competing packages, or build a dataset of packages in a specific domain, this actor delivers clean, structured data ready for analysis.
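As a rough illustration of what "the public NPM registry API" means in practice, the sketch below builds the two public URLs that hold package metadata and download counts. The helper names `metadata_url` and `downloads_url` are hypothetical, and which endpoints the actor actually calls is an assumption; only the URL formats themselves are the public NPM ones.

```python
from urllib.parse import quote

# Public NPM endpoints. Whether the actor uses exactly these is an assumption.
REGISTRY_BASE = "https://registry.npmjs.org"
DOWNLOADS_BASE = "https://api.npmjs.org/downloads/point"


def metadata_url(package_name: str) -> str:
    """Return the registry metadata URL for a package.

    The '/' in scoped names such as '@types/node' is percent-encoded,
    which is the form the registry accepts for scoped packages.
    """
    return f"{REGISTRY_BASE}/{quote(package_name, safe='@')}"


def downloads_url(package_name: str, period: str = "last-month") -> str:
    """Return the downloads API URL for a package and a named period."""
    return f"{DOWNLOADS_BASE}/{period}/{package_name}"


print(metadata_url("@types/node"))
print(downloads_url("express"))
```

Both endpoints return JSON and need no authentication, which matches the "fully public" description given in this document.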
Use Cases
- DevOps engineers -- audit dependencies across projects, track package versions and licenses
- Market researchers -- analyze NPM ecosystem trends, compare package popularity and adoption
- Security analysts -- monitor packages for maintainer changes, dependency counts, and update frequency
- Developers -- discover and compare packages by search terms, evaluate alternatives
Features
- Scrape packages by direct URL, package name, or search query
- Extract download statistics, dependency counts, and maintainer info
- Optional README content extraction
- Proxy support with automatic rotation
- Handles pagination for search results automatically
- Exports to JSON, CSV, Excel, or connect via API
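Accepting both direct URLs and package names implies that a package page URL has to be reduced to a package name at some point. A minimal sketch of that normalization is shown below; `package_name_from_url` is a hypothetical helper, not the actor's own code, and it only assumes the `https://www.npmjs.com/package/<name>` URL shape shown in the output example.

```python
from typing import Optional
from urllib.parse import unquote, urlparse


def package_name_from_url(url: str) -> Optional[str]:
    """Extract a package name from an npmjs.com package page URL.

    Returns None when the URL does not look like a package page.
    """
    parsed = urlparse(url)
    if parsed.netloc not in ("www.npmjs.com", "npmjs.com"):
        return None
    parts = [unquote(p) for p in parsed.path.split("/") if p]
    if len(parts) < 2 or parts[0] != "package":
        return None
    # Scoped packages occupy two path segments: @scope/name.
    if parts[1].startswith("@"):
        return "/".join(parts[1:3]) if len(parts) >= 3 else None
    return parts[1]


print(package_name_from_url("https://www.npmjs.com/package/@types/node"))
```

Trailing segments such as `/v/4.18.2` are ignored, so version-specific page URLs resolve to the same package name.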
How much will it cost?
The NPM registry API is fully public and lightweight. This actor uses minimal compute resources.
| Results | Estimated Cost |
|---|---|
| 100 | ~$0.01 |
| 1,000 | ~$0.05 |
| 10,000 | ~$0.40 |

| Cost Component | Per 1,000 Results |
|---|---|
| Platform compute | ~$0.05 |
| Proxy (optional) | ~$0.00 |
| Total | ~$0.05 |
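For run sizes that fall between the rows of the first table, a rough figure can be obtained by interpolating linearly between the listed points. The sketch below does only that; `estimate_cost` is a hypothetical helper, the numbers are the approximate ones from the table, and actual billing may differ.

```python
# (results, estimated cost in USD) pairs copied from the results table above.
COST_POINTS = [(100, 0.01), (1_000, 0.05), (10_000, 0.40)]


def estimate_cost(results: int) -> float:
    """Estimate run cost in USD by interpolating between the table rows."""
    if results <= 0:
        return 0.0
    first_n, first_cost = COST_POINTS[0]
    if results <= first_n:
        # Below the first row, scale the first row's cost proportionally.
        return round(first_cost * results / first_n, 4)
    for (n0, c0), (n1, c1) in zip(COST_POINTS, COST_POINTS[1:]):
        if results <= n1:
            fraction = (results - n0) / (n1 - n0)
            return round(c0 + fraction * (c1 - c0), 4)
    # Above the last row, extend the last segment's per-result rate.
    (n0, c0), (n1, c1) = COST_POINTS[-2:]
    return round(c1 + (results - n1) * (c1 - c0) / (n1 - n0), 4)


for n in (100, 5_500, 10_000):
    print(n, estimate_cost(n))
```

Treat the result as an order-of-magnitude guide only, since the table values themselves are approximate.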
How to use
- Go to the NPM Package Scraper page on Apify Store
- Click "Start" or "Try for free"
- Enter package names, NPM URLs, or search terms
- Set the maximum number of items
- Click "Start" and wait for the results
Input parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| startUrls | array | NPM package page URLs to scrape | - |
| packageNames | array | Package names to look up directly | - |
| searchTerms | array | Search queries to find packages | - |
| includeReadme | boolean | Include full README in output | false |
| maxItems | number | Max results to return | 100 |
| proxyConfig | object | Proxy settings | Apify Proxy |
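The parameters above can be assembled into a run input programmatically. The sketch below applies the defaults from the table and omits empty source lists. `build_run_input` is a hypothetical helper; the check that at least one source is provided is an assumption (the table does not say whether one is required), and `startUrls` entries are passed through unchanged because the table does not specify their element shape.

```python
from typing import Any, Optional


def build_run_input(
    package_names: Optional[list] = None,
    start_urls: Optional[list] = None,
    search_terms: Optional[list] = None,
    include_readme: bool = False,  # default from the parameter table
    max_items: int = 100,          # default from the parameter table
) -> dict:
    """Build an actor input dict using the parameter names from the table."""
    if max_items <= 0:
        raise ValueError("max_items must be a positive number")
    sources = {
        "packageNames": package_names,
        "startUrls": start_urls,
        "searchTerms": search_terms,
    }
    # Assumption: a run needs at least one source of packages to be useful.
    if not any(sources.values()):
        raise ValueError("provide package names, start URLs, or search terms")
    run_input: dict[str, Any] = {
        "includeReadme": include_readme,
        "maxItems": max_items,
    }
    for key, value in sources.items():
        if value:
            run_input[key] = list(value)
    return run_input


print(build_run_input(package_names=["express", "react"], max_items=20))
```

`proxyConfig` is left out on purpose so that the actor's own default applies.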
Output
The actor produces a dataset with the following fields:
    {
      "url": "https://www.npmjs.com/package/express",
      "name": "express",
      "version": "4.18.2",
      "description": "Fast, unopinionated, minimalist web framework for node.",
      "author": "TJ Holowaychuk",
      "license": "MIT",
      "homepage": "http://expressjs.com/",
      "repository": "https://github.com/expressjs/express",
      "keywords": ["express", "framework", "sinatra", "web", "http"],
      "dependenciesCount": 31,
      "weeklyDownloads": 30000000,
      "lastPublished": "2023-10-11T17:00:00.000Z",
      "maintainers": ["dougwilson", "linusu", "wesleytodd"],
      "readme": null,
      "scrapedAt": "2024-01-15T10:30:00.000Z"
    }
| Field | Type | Description |
|---|---|---|
| url | string | NPM package page URL |
| name | string | Package name |
| version | string | Latest version |
| description | string | Package description |
| author | string | Package author |
| license | string | License type |
| homepage | string | Project homepage URL |
| repository | string | Source repository URL |
| keywords | array | Package keywords |
| dependenciesCount | integer | Number of runtime dependencies |
| weeklyDownloads | integer | Weekly average of downloads over the last month |
| lastPublished | string | ISO 8601 date of last publish |
| maintainers | array | List of maintainer usernames |
| readme | string | Full README content (if enabled) |
| scrapedAt | string | ISO 8601 scrape timestamp |
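When consuming the dataset downstream, it can help to check each record against the field table before analysis. The sketch below is one way to do that. `FIELD_TYPES` and `validate_record` are hypothetical names, and the type mapping assumes that only `readme` can be null, which is what the sample record shows; other fields may also be empty for some packages.

```python
# Expected Python type(s) per field, following the field table above.
# `readme` is null in the sample record, so None is accepted for it.
FIELD_TYPES = {
    "url": str,
    "name": str,
    "version": str,
    "description": str,
    "author": str,
    "license": str,
    "homepage": str,
    "repository": str,
    "keywords": list,
    "dependenciesCount": int,
    "weeklyDownloads": int,
    "lastPublished": str,
    "maintainers": list,
    "readme": (str, type(None)),
    "scrapedAt": str,
}


def validate_record(record: dict) -> list:
    """Return a list of problems; an empty list means the record matches."""
    problems = []
    for field, expected in FIELD_TYPES.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            problems.append(f"wrong type for {field}: {actual}")
    return problems
```

Returning a list of problems instead of raising on the first one makes it easier to log every issue in a large dataset in a single pass.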
Integrations
Connect NPM Package Scraper with other tools:
- Apify API -- REST API for programmatic access
- Webhooks -- get notified when a run finishes
- Zapier / Make -- connect to 5,000+ apps
- Google Sheets -- export directly to spreadsheets
API Example (Node.js)
    import { ApifyClient } from 'apify-client';

    const client = new ApifyClient({ token: 'YOUR_TOKEN' });

    const run = await client.actor('YOUR_USERNAME/npm-package-scraper').call({
        packageNames: ['express', 'react', 'lodash'],
        maxItems: 100,
    });

    const { items } = await client.dataset(run.defaultDatasetId).listItems();
API Example (Python)
    from apify_client import ApifyClient

    client = ApifyClient('YOUR_TOKEN')

    run = client.actor('YOUR_USERNAME/npm-package-scraper').call(run_input={
        'packageNames': ['express', 'react', 'lodash'],
        'maxItems': 100,
    })

    items = client.dataset(run['defaultDatasetId']).list_items().items
API Example (cURL)
    curl "https://api.apify.com/v2/acts/YOUR_USERNAME~npm-package-scraper/runs" \
      -X POST \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer YOUR_TOKEN" \
      -d '{"packageNames": ["express", "react", "lodash"], "maxItems": 100}'
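The `items` list returned by the Node.js and Python examples above is a list of plain records, so it can also be flattened to CSV locally. The sketch below uses only the standard library; `items_to_csv` is a hypothetical helper, and joining array fields with `;` is an arbitrary choice made for this example, not the format of the platform's own CSV export.

```python
import csv
import io


def items_to_csv(items: list, columns: list) -> str:
    """Serialize dataset items to CSV text, joining list values with ';'."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for item in items:
        row = {}
        for column in columns:
            value = item.get(column)
            if isinstance(value, list):
                # Arrays such as keywords or maintainers become one cell.
                value = ";".join(str(v) for v in value)
            row[column] = "" if value is None else value
        writer.writerow(row)
    return buffer.getvalue()


example = [{"name": "express", "version": "4.18.2", "maintainers": ["dougwilson", "linusu"]}]
print(items_to_csv(example, ["name", "version", "maintainers"]))
```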
Tips and tricks
- Start with a small maxItems (10-20) to test before running large scrapes
- Use packageNames for known packages -- it is faster than search
- Search terms return up to 250 results per query from the NPM API
- The NPM registry is public and rate-limits are generous, so proxies are optional
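To make the 250-result limit and the automatic pagination more concrete, the sketch below plans the page requests for one search term. It assumes the public registry search endpoint `/-/v1/search` with its `text`, `from`, and `size` query parameters; the page size of 100 and the helper names are hypothetical, and the actor's real paging strategy may differ.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://registry.npmjs.org/-/v1/search"
MAX_PER_TERM = 250  # per-term limit stated in this document


def plan_search_pages(max_items: int, page_size: int = 100) -> list:
    """Return (offset, size) pairs covering up to max_items results."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    total = max(0, min(max_items, MAX_PER_TERM))
    pages = []
    offset = 0
    while offset < total:
        size = min(page_size, total - offset)
        pages.append((offset, size))
        offset += size
    return pages


def search_url(term: str, offset: int, size: int) -> str:
    """Build the search URL for one page of results."""
    query = urlencode({"text": term, "from": offset, "size": size})
    return f"{SEARCH_URL}?{query}"


for offset, size in plan_search_pages(250):
    print(search_url("web framework", offset, size))
```

Under these assumptions, a `maxItems` value above 250 yields no extra results for a single term, so broad queries are better split into several narrower search terms.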
FAQ
Q: Does this actor require login credentials?
A: No. The NPM registry API is fully public and requires no authentication.
Q: How fast is the scraping?
A: Approximately 100-300 packages per minute, depending on concurrency settings.
Q: What should I do if I get rate limited?
A: Enable proxy configuration in the settings. NPM rarely rate-limits, but proxies help for very large scrapes.
Q: Can I get download counts?
A: Yes. The actor fetches last-month download counts from the NPM downloads API and reports them in the weeklyDownloads field.
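The output table describes weeklyDownloads as a weekly average over the last month, while the downloads API reports a total for the period. One plausible conversion is sketched below. This is an assumption for illustration only: the actor's actual formula is not documented here, and the 30-day period length and the name `weekly_average` are hypothetical.

```python
def weekly_average(monthly_total: int, days_in_period: int = 30) -> int:
    """Convert a period total into an average per 7 days.

    Assumption: the period covers `days_in_period` days (30 by default).
    """
    if days_in_period <= 0:
        raise ValueError("days_in_period must be positive")
    return round(monthly_total * 7 / days_in_period)


print(weekly_average(3_000))
```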
Is it legal to scrape NPM?
The NPM registry API is a public API designed for programmatic access. This actor only accesses publicly available package metadata through the official API endpoints. Always review and respect NPM's Terms of Service. For more information, see Apify's blog on web scraping legality.
Limitations
- Download counts are derived from last-month totals reported by the NPM downloads API and are not real-time
- Search results are limited to 250 per search term (NPM API limit)
- Private packages (including private scoped packages) that require authentication are not supported
- README content can be large and significantly increases output size
Changelog
- v0.1 (2026-04-23) -- Initial release