Pricing

from $8.00 / 1,000 results

🛡️ npm Dependency Tree Scraper

Scrape npm to map transitive dependency trees up to three levels deep. Extract exact license types, deprecation warnings, and active maintainer counts.

Rating

0.0

(0)

Developer

太郎山田

Last modified

a month ago

📜 Open-Source License & Dependency Audit API

Modern software security demands strict oversight of supply chain vulnerabilities, making it essential to audit transitive dependency trees before deployment. This specialized npm dependency scraper evaluates packages directly from the web, allowing DevSecOps teams to search and map modules up to three levels deep without building complex custom web scraping tools. Instead of manually checking the registry website for deprecated modules, run this data extraction tool to schedule weekly compliance audits and catch hazardous copyleft licenses before they merge into your production branch.

Engineering leads use this scraper to monitor dependency health and extract maintainer signals. For each package it reports the exact license type, deprecation status, and active maintainer count, and you can feed these outputs into custom security policies that flag risky dependencies. The browser automation produces one structured summary row per package, so your team can see the risk profile of every module your application requires. Supply up to 200 package names per run; the tool walks the nested dependency tree and returns structured data you can use to clear compliance reviews.

Store Quickstart

Run this actor with your target input. Results appear in the Apify Dataset and can be piped to webhooks for real-time delivery. Use dryRun to validate before committing to a schedule.

Key Features

• License classification — Categorize as permissive, weak-copyleft, strong-copyleft, or unknown
• Configurable policy — permissive (strict), copyleft-ok (tolerant), or custom allow/deny lists
• Transitive dependency crawl — Audit nested deps up to 3 levels deep
• Risk scoring — 0-100 score with A-F grades per package
• Maintainer signals — Maintainer count, last publish date, deprecation status
• Summary-first — One row per package with aggregate transitive risk
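
The license classification and policy features above can be pictured with a small sketch. This is not the actor's implementation: the `LICENSE_FAMILIES` table, the function names, and the handling of unknown licenses are assumptions made for illustration. The "medium" result under `copyleft-ok` is a reading of the truncated `licensePolicy` description in the Input table.

```python
# Illustrative SPDX-to-family table; the actor's real mapping is not documented here.
LICENSE_FAMILIES = {
    "MIT": "permissive",
    "ISC": "permissive",
    "Apache-2.0": "permissive",
    "BSD-3-Clause": "permissive",
    "MPL-2.0": "weak-copyleft",
    "LGPL-3.0-only": "weak-copyleft",
    "GPL-3.0-only": "strong-copyleft",
    "AGPL-3.0-only": "strong-copyleft",
}


def classify_license(spdx_id):
    """Return the license family for an SPDX identifier, or 'unknown'."""
    if not spdx_id:
        return "unknown"
    return LICENSE_FAMILIES.get(spdx_id.strip(), "unknown")


def risk_level(spdx_id, policy="permissive", allow_list=(), deny_list=()):
    """Map a license to low/medium/high under one of the three policies."""
    if policy == "custom":
        if spdx_id in deny_list:
            return "high"
        if spdx_id in allow_list:
            return "low"
        return "medium"  # assumption: unlisted licenses need manual review
    family = classify_license(spdx_id)
    if family == "permissive":
        return "low"
    if family == "unknown":
        return "medium"  # assumption: unknown licenses need manual review
    # Weak and strong copyleft are treated alike in this sketch.
    return "high" if policy == "permissive" else "medium"
```

With this table, `risk_level("GPL-3.0-only")` is high under the default policy and medium under `copyleft-ok`.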

Use Cases

Who	Why
Developers	Automate recurring data fetches without building custom scrapers
Data teams	Pipe structured output into analytics warehouses
Ops teams	Monitor changes via webhook alerts
Product managers	Track competitor/market signals without engineering time

Input

Field	Type	Default	Description
packages	array	prefilled	npm package names to audit (max 200).
licensePolicy	string	`"permissive"`	Which license policy to apply: 'permissive' flags copyleft licenses as high risk, 'copyleft-ok' treats copyleft as mediu
allowList	array	`[]`	SPDX identifiers to treat as approved (only used when licensePolicy=custom).
denyList	array	`[]`	SPDX identifiers to treat as denied (only used when licensePolicy=custom).
maxDepth	integer	`1`	Maximum transitive dependency depth to crawl (0 = direct only, 1 = one level deep, etc.).
includeDevDeps	boolean	`false`	Also audit devDependencies of each package.
concurrency	integer	`5`	Number of parallel requests
timeoutMs	integer	`15000`	Request timeout in milliseconds
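
To make the `maxDepth` setting concrete, here is a minimal sketch of a depth-limited breadth-first crawl. It assumes that depth 0 covers only the direct dependencies of the audited package and that each extra level follows one more hop. `SAMPLE_GRAPH` and `crawl_dependencies` are hypothetical names, and the in-memory graph stands in for whatever lookups the actor actually performs.

```python
from collections import deque

# Hypothetical dependency graph: package name -> direct dependency names.
SAMPLE_GRAPH = {
    "app": ["a", "b"],
    "a": ["c"],
    "b": ["c", "d"],
    "c": ["e"],
    "d": [],
    "e": [],
}


def crawl_dependencies(root, graph, max_depth):
    """Return {dependency: level}, where level 0 is a direct dependency of root."""
    levels = {}
    queue = deque((dep, 0) for dep in graph.get(root, []))
    while queue:
        name, level = queue.popleft()
        if name == root or name in levels:
            continue  # skip cycles and packages already reached at a shallower level
        levels[name] = level
        if level < max_depth:
            # Only expand this package if the depth budget allows one more hop.
            queue.extend((dep, level + 1) for dep in graph.get(name, []))
    return levels
```

Breadth-first order means each package is recorded at the shallowest level where it appears, so shared dependencies are visited once.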

Input Example

{
  "packages": [
    "express",
    "react",
    "lodash",
    "axios"
  ],
  "licensePolicy": "permissive",
  "allowList": [],
  "denyList": [],
  "maxDepth": 1,
  "includeDevDeps": false,
  "concurrency": 5,
  "timeoutMs": 15000,
  "delivery": "dataset",
  "dryRun": false
}
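
The example above uses the default policy. For a `custom` policy, a small helper can assemble the input and catch obvious mistakes before a run. The helper name `build_custom_input` is hypothetical, and the check for identifiers that appear in both lists is an assumption; how the actor itself resolves such a conflict is not documented here.

```python
MAX_PACKAGES = 200  # limit stated in the Input table


def build_custom_input(packages, allow_list, deny_list, max_depth=1):
    """Build a run input that uses licensePolicy="custom"."""
    if not packages:
        raise ValueError("at least one package name is required")
    if len(packages) > MAX_PACKAGES:
        raise ValueError(f"at most {MAX_PACKAGES} packages per run")
    overlap = set(allow_list) & set(deny_list)
    if overlap:
        # Assumption: a license in both lists is treated as a configuration error.
        raise ValueError(f"licenses in both lists: {sorted(overlap)}")
    return {
        "packages": list(packages),
        "licensePolicy": "custom",
        "allowList": sorted(allow_list),
        "denyList": sorted(deny_list),
        "maxDepth": max_depth,
        "includeDevDeps": False,
        "concurrency": 5,
        "timeoutMs": 15000,
    }
```

The returned dictionary can be passed as `run_input` in the Python example under API Usage.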

Output

Field	Type	Description
`meta`	object
`results`	array
`results[].package`	string
`results[].version`	string
`results[].license`	string
`results[].licenseFamily`	string
`results[].riskLevel`	string
`results[].description`	string
`results[].author`	null
`results[].homepage`	string (url)
`results[].repository`	string (url)
`results[].maintainerCount`	number
`results[].lastPublish`	timestamp
`results[].daysSincePublish`	number
`results[].deprecated`	null
`results[].directDeps`	number
`results[].devDeps`	number
`results[].transitiveDeps`	number
`results[].transitiveRiskSummary`	object
`results[].score`	object
`results[].auditedAt`	timestamp
`results[].error`	null
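
As a rough illustration of how these fields might be consumed, the sketch below scans a parsed output object and lists packages that deserve a closer look. The function name `flag_packages` and the `stale_after_days` threshold are assumptions for illustration, not part of the actor.

```python
def flag_packages(output, stale_after_days=365):
    """Return (package, reasons) pairs for rows that need review."""
    flagged = []
    for row in output.get("results", []):
        reasons = []
        if row.get("error"):
            reasons.append("audit error")
        if row.get("riskLevel") == "high":
            reasons.append("high license risk")
        if row.get("deprecated"):
            reasons.append("deprecated")
        days = row.get("daysSincePublish")
        if days is not None and days > stale_after_days:
            # Hypothetical staleness rule; tune the threshold to your own policy.
            reasons.append(f"no publish in over {stale_after_days} days")
        if reasons:
            flagged.append((row.get("package"), reasons))
    return flagged
```

Rows with no reasons are omitted, so an empty list means nothing in the run tripped these checks.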

Output Example

{
  "meta": {
    "generatedAt": "2026-06-15T12:00:00.000Z",
    "policy": "permissive",
    "maxDepth": 1,
    "totals": {
      "audited": 3,
      "errors": 0,
      "highRisk": 0,
      "mediumRisk": 0,
      "lowRisk": 3,
      "gradeA": 2,
      "gradeB": 1,
      "gradeC": 0,
      "gradeD": 0,
      "gradeF": 0,
      "deprecated": 0
    }
  },
  "results": [
    {
      "package": "express",
      "version": "4.21.0",
      "license": "MIT",
      "licenseFamily": "permissive",
      "riskLevel": "low",
      "description": "Fast, unopinionated, minimalist web framework",
      "author": null,
      "homepage": "http://expressjs.com/",
      "repository": "https://github.com/expressjs/express",
      "maintainerCount": 4,
      "lastPublish": "2024-09-11T00:00:00.000Z",
      "daysSincePublish": 45,
      "deprecated": null,
      "directDeps": 30,
      "devDeps": 0,
      "transitiveDeps": 48,
      "transitiveRiskSummary": {
        "total": 48,
        "high": 0,

API Usage

Run this actor programmatically using the Apify API. Replace YOUR_API_TOKEN with your token from Apify Console → Settings → Integrations.

cURL

curl -X POST "https://api.apify.com/v2/acts/taroyamada~open-source-license-dependency-audit/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "packages": [ "express", "react", "lodash", "axios" ], "licensePolicy": "permissive", "allowList": [], "denyList": [], "maxDepth": 1, "includeDevDeps": false, "concurrency": 5, "timeoutMs": 15000, "delivery": "dataset", "dryRun": false }'

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("taroyamada/open-source-license-dependency-audit").call(run_input={
  "packages": [
    "express",
    "react",
    "lodash",
    "axios"
  ],
  "licensePolicy": "permissive",
  "allowList": [],
  "denyList": [],
  "maxDepth": 1,
  "includeDevDeps": False,
  "concurrency": 5,
  "timeoutMs": 15000,
  "delivery": "dataset",
  "dryRun": False
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

JavaScript / Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });
const run = await client.actor('taroyamada/open-source-license-dependency-audit').call({
  "packages": [
    "express",
    "react",
    "lodash",
    "axios"
  ],
  "licensePolicy": "permissive",
  "allowList": [],
  "denyList": [],
  "maxDepth": 1,
  "includeDevDeps": false,
  "concurrency": 5,
  "timeoutMs": 15000,
  "delivery": "dataset",
  "dryRun": false
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Tips & Limitations

• Schedule weekly runs against your production domains to catch config drift.
• Use webhook delivery to pipe findings into your SIEM (Splunk, Datadog, Elastic).
• For CI integration, block releases on critical severity findings using exit codes.
• Combine with ssl-certificate-monitor for layered cert + headers coverage.
• Findings include links to official remediation docs — share with dev teams via the webhook payload.
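
The CI tip can be sketched as a small gate that turns audit rows into a process exit code. This is a minimal sketch, assuming that a `riskLevel` of `"high"` is what should block a release; the function name `exit_code_for` and the `blocking_levels` parameter are hypothetical.

```python
import sys


def exit_code_for(results, blocking_levels=("high",)):
    """Return 1 if any row has a blocking riskLevel, otherwise 0."""
    blocking = [
        row.get("package")
        for row in results
        if row.get("riskLevel") in blocking_levels
    ]
    for name in blocking:
        # Report each offender on stderr so it shows up in CI logs.
        print(f"blocked by license policy: {name}", file=sys.stderr)
    return 1 if blocking else 0
```

In a pipeline step, the dataset items from a run would be passed in and the return value handed to `sys.exit`, so a non-zero code fails the job.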

FAQ

Is running this against a third-party site legal?

Passive public-header scanning is generally permitted, but follow your own compliance policies. Only scan sites you have authorization for.

How often should I scan?

Weekly for production domains; daily if you have high config-change velocity.

Can I export to a compliance tool?

Use webhook delivery or Dataset API — formats map well to Drata, Vanta, OneTrust import templates.

Is this a penetration test?

No — this actor performs passive compliance scanning only. No exploitation, fuzzing, or auth bypass.

Does this qualify as a SOC2 control?

This actor produces evidence artifacts suitable for SOC2 CC7.1 (continuous monitoring). It is not itself a SOC2 certification.

Security & Compliance cluster — explore related Apify tools:

Privacy & Cookie Compliance Scanner | GDPR / CCPA Banner Audit — Scan public privacy pages and cookie banners for GDPR/CCPA compliance signals.
Security Headers Checker API | OWASP Audit — Bulk-audit websites for OWASP security headers, grade each response, and monitor header changes between runs.
SSL Certificate Monitor API | Expiry + Issuer Changes — Check SSL/TLS certificates in bulk, detect expiry and issuer changes, and emit alert-ready rows for ops and SEO teams.
DNS / SPF / DKIM / DMARC Audit API — Bulk-audit domains for SPF, DKIM, DMARC, MX, and email-auth posture with grades and fix-ready recommendations.
robots.txt AI Policy Monitor | GPTBot ClaudeBot — Detect GPTBot, ClaudeBot, Google-Extended, and other AI crawler policies in robots.
Data Breach Disclosure Monitor | HIPAA Breach Watch — Monitor the HHS OCR Breach Portal for new HIPAA data breach disclosures.
WCAG Accessibility Checker API | ADA & EAA Compliance Audit — Audit websites for WCAG 2.
Trust Center & Subprocessor Monitor API — Monitor vendor trust centers, subprocessor lists, DPA updates, and security posture changes.

Cost

Pay Per Event:

actor-start: $0.01 (flat fee per run)
dataset-item: $0.003 per output item

Example: 1,000 items = $0.01 + (1,000 × $0.003) = $3.01
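
The same arithmetic can be written as a small estimator using the two fees listed above. The function name `estimate_cost` and the `runs` parameter are illustrative; `Decimal` is used so the cents add up exactly.

```python
from decimal import Decimal

ACTOR_START_FEE = Decimal("0.01")  # flat fee per run
PER_ITEM_FEE = Decimal("0.003")    # fee per output item


def estimate_cost(items, runs=1):
    """Estimate the total USD cost for `items` output rows spread over `runs` runs."""
    if items < 0 or runs < 1:
        raise ValueError("items must be >= 0 and runs must be >= 1")
    return ACTOR_START_FEE * runs + PER_ITEM_FEE * items
```

For a single run producing 1,000 items this matches the $3.01 worked example.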

No subscription required — you only pay for what you use.

⭐ Was this helpful?

If this actor saved you time, please leave a ★ rating on Apify Store. It takes 10 seconds, helps other developers discover it, and keeps updates free.

Bug report or feature request? Open an issue on the Issues tab of this actor.
