NPM Scraper

3 days trial then $10.00/month - No credit card required now

epctex/npm-scraper

One of the most important JavaScript communities is at your fingertips. Extract package information directly from NPM: title, maintainers, readme, downloads per version, dependent libraries, and much more. No limits! Get the results as JSON, Excel, XML, and more!

Actor - NPM Scraper

Since NPM doesn't provide a good, free API, this actor helps you retrieve data from it.

The NPM data scraper supports the following features:

  • Search any keyword - Search for any keyword and retrieve all the matching packages. Search also supports NPM's special search syntax, such as keywords:front-end.

  • Scrape package detail - Extremely detailed and structured data for any package at your service. Get versions, names, maintainers, all the historical data, and much more.

  • Get packages of a user - If you are looking for all the packages of a user, this is a one-stop shop for you!
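
The three features above correspond to three kinds of NPM URL. As a minimal sketch, the helpers below build such URLs in Python. The function names are hypothetical, and the URL shapes are assumed from the sample URLs in the input example of this README rather than from any official specification.

```python
from urllib.parse import quote

NPM_BASE = "https://www.npmjs.com"


def search_url(query: str) -> str:
    """Build a search URL; ':' is kept so qualifiers like keywords:front-end stay readable."""
    return f"{NPM_BASE}/search?q={quote(query, safe=':')}"


def user_url(username: str) -> str:
    """Build the URL of a user's page, which lists that user's packages."""
    return f"{NPM_BASE}/~{quote(username)}"


def package_url(name: str) -> str:
    """Build a package detail URL; '@' and '/' are kept for scoped names."""
    return f"{NPM_BASE}/package/{quote(name, safe='@/')}"


if __name__ == "__main__":
    print(search_url("keywords:front-end"))
    print(user_url("ljharb"))
    print(package_url("lodash"))
```

Any of the resulting strings can be placed in the startUrls array.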

Bugs, fixes, updates, and changelog

This scraper is under active development. If you have a feature request, you can create an issue for it.

Input Parameters

The input of this scraper should be JSON containing the list of pages on NPM that should be visited. Possible fields are:

  • search: (Optional) (String) Keyword that you want to search on npmjs.

  • startUrls: (Optional) (Array) List of NPM URLs. You should only provide search, list, user detail, or package detail URLs.

  • endPage: (Optional) (Number) The last page number that you want to scrape. The default is infinite (no limit). This applies to each search request and each startUrls entry individually.

  • maxItems: (Optional) (Number) Limits the number of scraped items. This is useful when you scrape big lists or large search results.

  • proxy: (Required) (Proxy Object) Proxy configuration.

  • customMapFunction: (Optional) (String) A function, passed as a string, that takes each scraped object as an argument and returns the object after the function has been applied to it.

This solution requires proxy servers: you can use either your own proxy servers or Apify Proxy.
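
The parameter list above can be turned into a small input builder. This is only a sketch: build_input is a hypothetical helper, and the checks are assumptions derived from the descriptions above (proxy is required, the other fields are optional and typed). Requiring endPage and maxItems to be positive is also an assumption; the actor's own validation may differ.

```python
def build_input(proxy, search=None, start_urls=None, end_page=None, max_items=None):
    """Assemble an input dict using the field names documented above."""
    # proxy is the only field marked as required.
    if not isinstance(proxy, dict):
        raise ValueError("proxy is required and must be an object")
    run_input = {"proxy": proxy}

    if search is not None:
        if not isinstance(search, str):
            raise ValueError("search must be a string")
        run_input["search"] = search

    if start_urls is not None:
        if not isinstance(start_urls, list):
            raise ValueError("startUrls must be an array")
        run_input["startUrls"] = list(start_urls)

    # Assumption: both numeric limits must be positive integers.
    for key, value in (("endPage", end_page), ("maxItems", max_items)):
        if value is not None:
            if isinstance(value, bool) or not isinstance(value, int) or value < 1:
                raise ValueError(f"{key} must be a positive number")
            run_input[key] = value

    return run_input


if __name__ == "__main__":
    print(build_input({"useApifyProxy": True}, search="axios", max_items=100))
```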

Tip

When you want to scrape a specific list URL, just copy and paste the link as one of the startUrls.

If you would like to scrape only the first page of a list, provide the link for that page and set endPage to 1.

With the same approach you can also fetch any interval of pages. If you provide the 5th page of a list and set the endPage parameter to 6, you'll get only the 5th and 6th pages.
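
The page-interval behavior described in this tip can be modeled as follows. This is a sketch of the documented rule, not the actor's code: pages_scraped and last_available are hypothetical names, pages are counted from 1 as in the tip, and returning an empty list when endPage is smaller than the starting page is an assumption.

```python
from typing import List, Optional


def pages_scraped(start_page: int, end_page: Optional[int], last_available: int) -> List[int]:
    """Return the page numbers visited for one start URL.

    start_page:     the page the start URL points at (1 = first page of the list)
    end_page:       the endPage input, or None for no limit
    last_available: hypothetical number of pages the list actually has
    """
    last = last_available if end_page is None else min(end_page, last_available)
    return list(range(start_page, last + 1))


if __name__ == "__main__":
    print(pages_scraped(5, 6, 20))     # start at the 5th page, endPage = 6
    print(pages_scraped(1, 1, 20))     # only the first page
    print(pages_scraped(1, None, 3))   # no endPage: every available page
```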

Compute Unit Consumption

The actor is optimized to run fast and scrape as many items as possible. To do so, it prioritizes the detail requests. If the actor is not blocked very often, it will scrape 100 listings in less than 1 minute, using ~0.01-0.03 compute units.
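
For rough budgeting, the figure above can be extrapolated to larger runs. The sketch below assumes that consumption scales linearly with the number of listings, which this README does not state; blocking and retries would raise the real number. estimate_compute_units is a hypothetical helper.

```python
def estimate_compute_units(items, cu_per_100=(0.01, 0.03)):
    """Return a (low, high) compute-unit estimate for `items` listings.

    cu_per_100 is the ~0.01-0.03 range quoted above for 100 listings.
    Linear scaling is an assumption, not a documented guarantee.
    """
    low, high = cu_per_100
    scale = items / 100
    return (round(low * scale, 4), round(high * scale, 4))


if __name__ == "__main__":
    print(estimate_compute_units(100))
    print(estimate_compute_units(5000))
```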

NPM Scraper Input example

{
  "startUrls": [
    "https://www.npmjs.com/package/lodash",
    "https://www.npmjs.com/~ljharb",
    "https://www.npmjs.com/search?q=keywords:front-end&page=0&ranking=optimal",
    "https://www.npmjs.com/search?q=keywords:modules",
    "https://www.npmjs.com/search?q=axios"
  ],
  "endPage": 5,
  "maxItems": 100,
  "proxy": {
    "useApifyProxy": true
  }
}

During the Run

During the run, the actor outputs messages that let you know what is going on. Each message contains a short label specifying which page from the provided list is currently being processed. When items are loaded from a page, you should see a message about this event with the loaded item count and the total item count for that page.

If you provide incorrect input to the actor, it will immediately stop with a failure state and output an explanation of what is wrong.

NPM Export

During the run, the actor stores results in a dataset. Each scraped package is stored as a separate item in the dataset.

You can manage the results in any language (Python, PHP, Node.js/NPM). See the FAQ or our API reference to learn more about getting results from this NPM actor.
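
As one way to get the results from Python, the sketch below uses the apify-client package to start a run and read its dataset, and also builds a direct export URL for the Apify API. fetch_items and dataset_items_url are hypothetical helper names, the token is a placeholder you must supply, and the actor ID is taken from this page. Running fetch_items needs network access and the third-party package, so treat it as an illustration rather than a tested integration.

```python
from urllib.parse import urlencode


def dataset_items_url(dataset_id: str, fmt: str = "json") -> str:
    """Build an Apify API URL that exports a dataset's items in the given format."""
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode({'format': fmt})}"


def fetch_items(token: str, run_input: dict, actor_id: str = "epctex/npm-scraper") -> list:
    """Run the actor, wait for it to finish, and return its dataset items."""
    # Imported lazily so the rest of this sketch works without the package.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    # "abc123" is a made-up dataset ID used only to show the URL shape.
    print(dataset_items_url("abc123", "xlsx"))
```

The export URL is convenient when you only need a file (for example Excel or XML), while the client is more convenient when you want to process items in code.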

Scraped NPM Properties

The structure of each package in NPM looks like this:

Package Detail

{
  "url": "https://www.npmjs.com/package/lodash",
  "name": "lodash",
  "description": "Lodash modular utilities.",
  "maintainers": [
    {
      "name": "mathias",
      "avatars": {
        "small": "/npm-avatar/eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhdmF0YXJVUkwiOiJodHRwczovL3MuZ3JhdmF0YXIuY29tL2F2YXRhci8yNGUwOGE5ZWE4NGRlYjE3YWUxMjEwNzRkMGYxNzEyNT9zaXplPTUwJmRlZmF1bHQ9cmV0cm8ifQ.1nyQBg2LJRuQRWzQgT_g8Hru5FIUsz2mCZ3yqtIGbPQ",
        "medium": "/npm-avatar/eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhdmF0YXJVUkwiOiJodHRwczovL3MuZ3JhdmF0YXIuY29tL2F2YXRhci8yNGUwOGE5ZWE4NGRlYjE3YWUxMjEwNzRkMGYxNzEyNT9zaXplPTEwMCZkZWZhdWx0PXJldHJvIn0.8O30NcKyPpUc911dhXJuyBnrSQx-tLyHhFYFPIj5VcQ",
        "large": "/npm-avatar/eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhdmF0YXJVUkwiOiJodHRwczovL3MuZ3JhdmF0YXIuY29tL2F2YXRhci8yNGUwOGE5ZWE4NGRlYjE3YWUxMjEwNzRkMGYxNzEyNT9zaXplPTQ5NiZkZWZhdWx0PXJldHJvIn0.XJZ0JP62pAmmumlu-AeTCji3D_2wleGxXGc86Sl5f4A"
      }
    }
  ],
  "dist-tags": {
    "latest": "4.17.21"
  },
  "lastPublish": {
    "maintainer": "bnjmnt4n",
    "time": "2021-02-20T15:42:16.891Z"
  },
  "types": {
    "typescript": {
      "package": "@types/lodash"
    }
  },
  "dependents": {
    "dependentsCount": 171614,
    "dependentsTruncated": [
      "feebs-cli",
      "fashionista",
      "bless-brunch"
    ]
  },
  "downloads": [
    {
      "downloads": 38796856,
      "label": "2022-04-12 to 2022-04-18"
    }
  ],
  "author": {
    "name": "John-David Dalton",
    "avatars": {
      "small": "/npm-avatar/eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhdmF0YXJVUkwiOiJodHRwczovL3MuZ3JhdmF0YXIuY29tL2F2YXRhci8yOTlhM2Q4OTFmZjE5MjBiNjljMzY0ZDA2MTAwNzA0Mz9zaXplPTUwJmRlZmF1bHQ9cmV0cm8ifQ.B9Vy-ZZRADWO6SYXG9debouY1FRb7AMhNqg8rO76fOw",
      "medium": "/npm-avatar/eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhdmF0YXJVUkwiOiJodHRwczovL3MuZ3JhdmF0YXIuY29tL2F2YXRhci8yOTlhM2Q4OTFmZjE5MjBiNjljMzY0ZDA2MTAwNzA0Mz9zaXplPTEwMCZkZWZhdWx0PXJldHJvIn0.c1-LKdixh9shjLtVzpFB2qCaSFIyeMSRg89KtYrscAw",
      "large": "/npm-avatar/eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhdmF0YXJVUkwiOiJodHRwczovL3MuZ3JhdmF0YXIuY29tL2F2YXRhci8yOTlhM2Q4OTFmZjE5MjBiNjljMzY0ZDA2MTAwNzA0Mz9zaXplPTQ5NiZkZWZhdWx0PXJldHJvIn0.wTHgJF3dJNHDR2eKg8YGLK6neCgefwB7LCWcRPgQErw"
    }
  },
  "homepage": "https://lodash.com/",
  "keywords": [
    "modules",
    "stdlib",
    "util"
  ],
  "license": "MIT",
  "repository": "https://github.com/lodash/lodash",
  "versions": [
    {
      "version": "4.17.21",
      "date": {
        "ts": 1613835736891,
        "rel": "2 years ago"
      },
      "dist": {
        "integrity": "sha512-v2kDEe57lecTulaDIuNTPy3Ry4gLGJ6Z1O3vE1krgXZNrsQ+LFTGHVxVjcXPs17LhbZVGedAJv8XZ1tvj5FvSg==",
        "shasum": "679591c564c3bffaae8454cf0b3df370c3d6911c",
        "tarball": "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz",
        "fileCount": 1054,
        "unpackedSize": 1412415,
        "npm-signature": "-----BEGIN PGP SIGNATURE-----\r\nVersion: OpenPGP.js v3.0.13\r\nComment: https://openpgpjs.org\r\n\r\nwsFcBAEBCAAQBQJgMS3ZCRA9TVsSAnZWagAA8+4P/jx+SJ6Ue5oAJjz0L7gw\nLDD5YvP8aoliFq4GYkwUXfVQvOwomIPfa+U5Kao/hDfuwFQ/Bq5D5nSsl2bj\nrjJgvlKXna0SId8AgDgY2fB7zSfninuJvalY4iTWMN8DFSpG0XE2QFfoKpd3\njDmuzcNtgr79QV6DgjOVkHiP1IGNDlLTc1QEKiwo/5CdGQi1q/iCj6dViQMJ\nByuuuV2Qzi3f/FI25cG797WZar1MHhhlcnB50HiVBGp54IZOyuqdqWPduZQo\nvhONtonxPGBm3/J+uAkeUSSyL3Ud+FzLvdg8WEI9gDL0yvU4k0FcsnOONEYn\nngLaKEsw2xAnPBYW3Lf73Jnpwx6FAT3k49kgzxiNYSxEo7x4wiuNtBoDMyNw\nEKj6SZ0bUNmaJgiMfDnnDjCKjI3JrO1hho8z6CkwuvxuWLlW9wSsVayggzAI\nEhfeTeISugVHh332oDY2MI/Ysu8MnVN8fGmqeYQBBFj3aWatuA2NvVjACnX/\n54G7FtCU8TxZpm9shFRSopBx8PeI3r+icx1CT8YVFypY416PLnidHyqtME1G\neuRd1nWEz18hvVUAEHmuvHo+EPP3tITmTTUPQcZGMdBcZC+4UBmPMWX466HE\nbHw4aOnUWMa0sWfsERC5xzRZAb4lgMPEoTOnZyN4usMy7x9TzGZKZvU24HUE\nmpae\r\n=NOmG\r\n-----END PGP SIGNATURE-----\r\n",
        "signatures": [
          {
            "keyid": "SHA256:jl3bwswu80PjjokCgh0o2w5c2U4LhQAE57gj9cz1kzA",
            "sig": "MEUCIF3Yithbtmy1aEBNlfNWbLswAfPIyQUuNUGARD3Ex2t4AiEA6TlN2ZKJCUpS/Sf2Z6MduF1BNSvayHIpu5wAcICcKXw="
          }
        ]
      }
    }
  ],
  "readme": "<h1>\n<a id=\"user-content-lodash-v41721\" class=\"anchor\" href=\"#lodash-v41721\" aria-hidden=\"true\"><span aria-hidden=\"true\" class=\"octicon octicon-link\"></span></a>lodash v4.17.21</h1>\n<p>The <a href=\"https://lodash.com/\" rel=\"nofollow\">Lodash</a> library exported as <a href=\"https://nodejs.org/\" rel=\"nofollow\">Node.js</a> modules.</p>\n<h2>\n<a id=\"user-content-installation\" class=\"anchor\" href=\"#installation\" aria-hidden=\"true\"><span aria-hidden=\"true\" class=\"octicon octicon-link\"></span></a>Installation</h2>\n<p>Using npm:</p>\n<div class=\"highlight highlight-source-shell\"><pre>$ npm i -g npm\n$ npm i --save lodash</pre></div>\n<p>In Node.js:</p>\n<div class=\"highlight highlight-source-js\"><pre><span class=\"pl-c\">// Load the full build.</span>\n<span class=\"pl-k\">var</span> <span class=\"pl-s1\">_</span> <span class=\"pl-c1\">=</span> <span class=\"pl-en\">require</span><span class=\"pl-kos\">(</span><span class=\"pl-s\">'lodash'</span><span class=\"pl-kos\">)</span><span class=\"pl-kos\">;</span>\n<span class=\"pl-c\">// Load the core build.</span>\n<span class=\"pl-k\">var</span> <span class=\"pl-s1\">_</span> <span class=\"pl-c1\">=</span> <span class=\"pl-en\">require</span><span class=\"pl-kos\">(</span><span class=\"pl-s\">'lodash/core'</span><span class=\"pl-kos\">)</span><span class=\"pl-kos\">;</span>\n<span class=\"pl-c\">// Load the FP build for immutable auto-curried iteratee-first data-last methods.</span>\n<span class=\"pl-k\">var</span> <span class=\"pl-s1\">fp</span> <span class=\"pl-c1\">=</span> <span class=\"pl-en\">require</span><span class=\"pl-kos\">(</span><span class=\"pl-s\">'lodash/fp'</span><span class=\"pl-kos\">)</span><span class=\"pl-kos\">;</span>\n\n<span class=\"pl-c\">// Load method categories.</span>\n<span class=\"pl-k\">var</span> <span class=\"pl-s1\">array</span> <span class=\"pl-c1\">=</span> <span class=\"pl-en\">require</span><span class=\"pl-kos\">(</span><span class=\"pl-s\">'lodash/array'</span><span class=\"pl-kos\">)</span><span class=\"pl-kos\">;</span>\n<span class=\"pl-k\">var</span> <span class=\"pl-s1\">object</span> <span class=\"pl-c1\">=</span> <span class=\"pl-en\">require</span><span class=\"pl-kos\">(</span><span class=\"pl-s\">'lodash/fp/object'</span><span class=\"pl-kos\">)</span><span class=\"pl-kos\">;</span>\n\n<span class=\"pl-c\">// Cherry-pick methods for smaller browserify/rollup/webpack bundles.</span>\n<span class=\"pl-k\">var</span> <span class=\"pl-s1\">at</span> <span class=\"pl-c1\">=</span> <span class=\"pl-en\">require</span><span class=\"pl-kos\">(</span><span class=\"pl-s\">'lodash/at'</span><span class=\"pl-kos\">)</span><span class=\"pl-kos\">;</span>\n<span class=\"pl-k\">var</span> <span class=\"pl-s1\">curryN</span> <span class=\"pl-c1\">=</span> <span class=\"pl-en\">require</span><span class=\"pl-kos\">(</span><span class=\"pl-s\">'lodash/fp/curryN'</span><span class=\"pl-kos\">)</span><span class=\"pl-kos\">;</span></pre></div>\n<p>See the <a href=\"https://github.com/lodash/lodash/tree/4.17.21-npm\">package source</a> for more details.</p>\n<p><strong>Note:</strong><br>\nInstall <a href=\"https://www.npmjs.com/package/n_\" rel=\"nofollow\">n_</a> for Lodash use in the Node.js &lt; 6 REPL.</p>\n<h2>\n<a id=\"user-content-support\" class=\"anchor\" href=\"#support\" aria-hidden=\"true\"><span aria-hidden=\"true\" class=\"octicon octicon-link\"></span></a>Support</h2>\n<p>Tested in Chrome 74-75, Firefox 66-67, IE 11, Edge 18, Safari 11-12, &amp; Node.js 8-12.<br>\nAutomated <a href=\"https://saucelabs.com/u/lodash\" rel=\"nofollow\">browser</a> &amp; <a href=\"https://travis-ci.org/lodash/lodash/\" rel=\"nofollow\">CI</a> test runs are available.</p>\n"
}
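
A package item like the one above is fairly large, mostly because of the readme HTML. As a minimal sketch, the Python helpers below reduce an item to a few headline fields and a plain-text readme excerpt. summarize and readme_text are hypothetical names, and the field names are taken from the sample item; real items may omit some of them, so missing keys are tolerated.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def readme_text(html: str) -> str:
    """Strip tags from the readme HTML and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def summarize(item: dict) -> dict:
    """Pick a few fields from a scraped package item."""
    return {
        "name": item.get("name"),
        "latest": item.get("dist-tags", {}).get("latest"),
        "maintainers": [m.get("name") for m in item.get("maintainers", [])],
        "dependents": item.get("dependents", {}).get("dependentsCount"),
        "downloads_by_period": {d["label"]: d["downloads"] for d in item.get("downloads", [])},
        "readme_excerpt": readme_text(item.get("readme", ""))[:80],
    }
```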

Contact

Please visit epctex.com to see all the products that are available to you. If you are looking for a custom integration, please reach out to us through the chat box on epctex.com. Need support? devops@epctex.com is at your service.

Developer
Maintained by Community

Actor Metrics

  • 2 monthly users

  • 1 star

  • 97% runs succeeded

  • Created in Apr 2023

  • Modified 11 hours ago