🛡️ npm Dependency Tree Scraper
Pricing
from $8.00 / 1,000 results
Scrape npm to map transitive dependency trees up to three levels deep. Extract exact license types, deprecation warnings, and active maintainer counts.
Rating
0.0
(0)
Developer
Taro Yamada
Actor stats
Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 8 days ago
📜 Open-Source License & Dependency Audit API
Modern software security demands strict oversight of supply chain vulnerabilities, making it essential to audit transitive dependency trees before deployment. This specialized npm dependency scraper evaluates packages directly from the web, allowing DevSecOps teams to search and map modules up to three levels deep without building complex custom web scraping tools. Instead of manually checking the registry website for deprecated modules, run this data extraction tool to schedule weekly compliance audits and catch hazardous copyleft licenses before they merge into your production branch.
Engineering leads rely on this scraper to continuously monitor repository health and extract specific maintainer signals. By scraping deep package details, the tool automatically outputs clean results including the exact license type, deprecation status, and active maintainer counts. Using these scraped outputs, you can enforce custom security policies that flag risky dependencies. The fast browser automation extracts a structured summary row per package, giving your team instant visibility into the exact risk profile of every module your application requires. Whether you input specific package names or batch search URLs, the extracted data helps teams confidently clear compliance reviews. Run the tool to parse complex nested trees and lock down your open-source pipeline with structured, reliable dependency data.
Store Quickstart
Run this actor with your target input. Results appear in the Apify Dataset and can be piped to webhooks for real-time delivery. Use dryRun to validate before committing to a schedule.
Key Features
- License classification — Categorize as permissive, weak-copyleft, strong-copyleft, or unknown
- Configurable policy — permissive (strict), copyleft-ok (tolerant), or custom allow/deny lists
- Transitive dependency crawl — Audit nested deps up to 3 levels deep
- Risk scoring — 0-100 score with A-F grades per package
- Maintainer signals — Maintainer count, last publish date, deprecation status
- Summary-first — One row per package with aggregate transitive risk
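The license classification and policy features can be sketched roughly as follows. This is an illustrative sketch only: the SPDX groupings, function names, and risk mapping are assumptions, not the actor's actual implementation.

```python
# Illustrative license-family classification and policy evaluation.
# The SPDX groupings below are examples, not the actor's exact tables.
PERMISSIVE = {"MIT", "ISC", "BSD-2-Clause", "BSD-3-Clause", "Apache-2.0"}
WEAK_COPYLEFT = {"LGPL-2.1", "LGPL-3.0", "MPL-2.0", "EPL-2.0"}
STRONG_COPYLEFT = {"GPL-2.0", "GPL-3.0", "AGPL-3.0"}

def license_family(spdx):
    """Map an SPDX identifier to one of the four documented families."""
    if spdx in PERMISSIVE:
        return "permissive"
    if spdx in WEAK_COPYLEFT:
        return "weak-copyleft"
    if spdx in STRONG_COPYLEFT:
        return "strong-copyleft"
    return "unknown"

def risk_level(spdx, policy="permissive", allow=(), deny=()):
    """Apply a license policy and return low / medium / high."""
    if policy == "custom":
        if spdx in deny:
            return "high"
        return "low" if spdx in allow else "medium"
    family = license_family(spdx)
    if family == "permissive":
        return "low"
    if family == "unknown":
        return "medium"
    # Copyleft: severity depends on the chosen policy.
    return "medium" if policy == "copyleft-ok" else "high"
```

Under the default permissive policy, `risk_level("GPL-3.0")` would come back high, while `copyleft-ok` downgrades it to medium.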
Use Cases
| Who | Why |
|---|---|
| Developers | Automate recurring data fetches without building custom scrapers |
| Data teams | Pipe structured output into analytics warehouses |
| Ops teams | Monitor changes via webhook alerts |
| Product managers | Track competitor/market signals without engineering time |
Input
| Field | Type | Default | Description |
|---|---|---|---|
| packages | array | prefilled | npm package names to audit (max 200). |
| licensePolicy | string | "permissive" | Which license policy to apply: 'permissive' flags copyleft licenses as high risk, 'copyleft-ok' treats copyleft as medium risk, and 'custom' applies the allowList/denyList fields. |
| allowList | array | [] | SPDX identifiers to treat as approved (only used when licensePolicy=custom). |
| denyList | array | [] | SPDX identifiers to treat as denied (only used when licensePolicy=custom). |
| maxDepth | integer | 1 | Maximum transitive dependency depth to crawl (0 = direct only, 1 = one level deep, etc.). |
| includeDevDeps | boolean | false | Also audit devDependencies of each package. |
| concurrency | integer | 5 | Number of parallel requests |
| timeoutMs | integer | 15000 | Request timeout in milliseconds |
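The maxDepth field bounds how far the crawler walks each package's dependency graph. A minimal breadth-first sketch over a toy in-memory graph (a real crawl would resolve dependencies from the npm registry; the graph and helper below are illustrative):

```python
from collections import deque

# Toy dependency graph standing in for npm registry lookups.
DEPS = {
    "express": ["body-parser", "cookie"],
    "body-parser": ["bytes"],
    "cookie": [],
    "bytes": [],
}

def transitive_deps(root, max_depth):
    """Collect dependencies up to max_depth levels below the direct deps.
    Depth 0 means direct dependencies only, matching the input docs."""
    seen = set()
    queue = deque((dep, 0) for dep in DEPS.get(root, []))
    while queue:
        name, depth = queue.popleft()
        if name in seen:
            continue  # avoid re-walking shared dependencies
        seen.add(name)
        if depth < max_depth:
            queue.extend((d, depth + 1) for d in DEPS.get(name, []))
    return seen
```

With maxDepth=0 only express's direct deps are returned; maxDepth=1 also pulls in bytes, one level further down.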
Input Example
{"packages": ["express","react","lodash","axios"],"licensePolicy": "permissive","allowList": [],"denyList": [],"maxDepth": 1,"includeDevDeps": false,"concurrency": 5,"timeoutMs": 15000,"delivery": "dataset","dryRun": false}
Output
| Field | Type | Description |
|---|---|---|
| meta | object | Run-level metadata: policy, maxDepth, and aggregate totals. |
| results | array | One summary row per audited package. |
| results[].package | string | Package name as published on npm. |
| results[].version | string | Resolved version that was audited. |
| results[].license | string | SPDX license identifier. |
| results[].licenseFamily | string | permissive, weak-copyleft, strong-copyleft, or unknown. |
| results[].riskLevel | string | low, medium, or high under the active policy. |
| results[].description | string | Package description from the registry. |
| results[].author | string or null | Package author, if published. |
| results[].homepage | string (url) | Project homepage URL. |
| results[].repository | string (url) | Source repository URL. |
| results[].maintainerCount | number | Count of active maintainers. |
| results[].lastPublish | timestamp | Date of the most recent publish. |
| results[].daysSincePublish | number | Days elapsed since the last publish. |
| results[].deprecated | string or null | Deprecation message, or null if not deprecated. |
| results[].directDeps | number | Count of direct dependencies. |
| results[].devDeps | number | Count of devDependencies. |
| results[].transitiveDeps | number | Count of transitive dependencies within maxDepth. |
| results[].transitiveRiskSummary | object | Aggregate risk counts across transitive deps. |
| results[].score | object | 0-100 risk score with A-F grade. |
| results[].auditedAt | timestamp | When the package was audited. |
| results[].error | string or null | Error message if the package could not be audited. |
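The score field pairs a 0-100 value with an A-F grade. A minimal sketch of such a mapping, with illustrative cut-offs (the actor's actual thresholds are not documented here):

```python
def grade(score):
    """Map a 0-100 score to a letter grade.
    The cut-offs below are assumptions for illustration only."""
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= cutoff:
            return letter
    return "F"
```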
Output Example
{"meta": {"generatedAt": "2026-06-15T12:00:00.000Z","policy": "permissive","maxDepth": 1,"totals": {"audited": 3,"errors": 0,"highRisk": 0,"mediumRisk": 0,"lowRisk": 3,"gradeA": 2,"gradeB": 1,"gradeC": 0,"gradeD": 0,"gradeF": 0,"deprecated": 0}},"results": [{"package": "express","version": "4.21.0","license": "MIT","licenseFamily": "permissive","riskLevel": "low","description": "Fast, unopinionated, minimalist web framework","author": null,"homepage": "http://expressjs.com/","repository": "https://github.com/expressjs/express","maintainerCount": 4,"lastPublish": "2024-09-11T00:00:00.000Z","daysSincePublish": 45,"deprecated": null,"directDeps": 30,"devDeps": 0,"transitiveDeps": 48,"transitiveRiskSummary": {"total": 48,"high": 0,
API Usage
Run this actor programmatically using the Apify API. Replace YOUR_API_TOKEN with your token from Apify Console → Settings → Integrations.
cURL
curl -X POST "https://api.apify.com/v2/acts/taroyamada~open-source-license-dependency-audit/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "packages": ["express", "react", "lodash", "axios"], "licensePolicy": "permissive", "allowList": [], "denyList": [], "maxDepth": 1, "includeDevDeps": false, "concurrency": 5, "timeoutMs": 15000, "delivery": "dataset", "dryRun": false }'
Python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("taroyamada/open-source-license-dependency-audit").call(run_input={
    "packages": ["express", "react", "lodash", "axios"],
    "licensePolicy": "permissive",
    "allowList": [],
    "denyList": [],
    "maxDepth": 1,
    "includeDevDeps": False,
    "concurrency": 5,
    "timeoutMs": 15000,
    "delivery": "dataset",
    "dryRun": False,
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
JavaScript / Node.js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });
const run = await client.actor('taroyamada/open-source-license-dependency-audit').call({
    "packages": ["express", "react", "lodash", "axios"],
    "licensePolicy": "permissive",
    "allowList": [],
    "denyList": [],
    "maxDepth": 1,
    "includeDevDeps": false,
    "concurrency": 5,
    "timeoutMs": 15000,
    "delivery": "dataset",
    "dryRun": false
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);
Tips & Limitations
- Schedule weekly runs against your production dependency manifests to catch drift.
- Use webhook delivery to pipe findings into your SIEM (Splunk, Datadog, Elastic).
- For CI integration, block releases on high-risk findings using exit codes.
- Combine with ssl-certificate-monitor for layered supply-chain and infrastructure coverage.
- Findings include links to official remediation docs — share with dev teams via the webhook payload.
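The CI-gating tip can be sketched as a small script that exits nonzero when the exported dataset contains high-risk or deprecated packages. The file format and field names assume the documented output rows; the script itself is illustrative, not shipped with the actor:

```python
import json
import sys

def ci_gate(path):
    """Return 1 (block the release) when the audit export at `path`
    contains high-risk or deprecated packages, else 0."""
    with open(path) as f:
        results = json.load(f)
    bad = [r["package"] for r in results
           if r.get("riskLevel") == "high" or r.get("deprecated")]
    if bad:
        print("Blocking release, risky packages:", ", ".join(bad))
        return 1
    return 0

if __name__ == "__main__" and len(sys.argv) > 1:
    # e.g. python ci_gate.py audit-results.json
    sys.exit(ci_gate(sys.argv[1]))
```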
FAQ
Is running this against a third-party registry legal?
Reading public package metadata from the npm registry is generally permitted, but follow your own compliance policies and the registry's terms of use.
How often should I scan?
Weekly for production dependencies; daily if your dependency set changes frequently.
Can I export to a compliance tool?
Use webhook delivery or Dataset API — formats map well to Drata, Vanta, OneTrust import templates.
Is this a penetration test?
No — this actor performs passive compliance scanning only. No exploitation, fuzzing, or auth bypass.
Does this qualify as a SOC2 control?
This actor produces evidence artifacts suitable for SOC2 CC7.1 (continuous monitoring). It is not itself a SOC2 certification.
Related Actors
Security & Compliance cluster — explore related Apify tools:
- Privacy & Cookie Compliance Scanner | GDPR / CCPA Banner Audit — Scan public privacy pages and cookie banners for GDPR/CCPA compliance signals.
- Security Headers Checker API | OWASP Audit — Bulk-audit websites for OWASP security headers, grade each response, and monitor header changes between runs.
- SSL Certificate Monitor API | Expiry + Issuer Changes — Check SSL/TLS certificates in bulk, detect expiry and issuer changes, and emit alert-ready rows for ops and SEO teams.
- DNS / SPF / DKIM / DMARC Audit API — Bulk-audit domains for SPF, DKIM, DMARC, MX, and email-auth posture with grades and fix-ready recommendations.
- robots.txt AI Policy Monitor | GPTBot ClaudeBot — Detect GPTBot, ClaudeBot, Google-Extended, and other AI crawler policies in robots.txt.
- Data Breach Disclosure Monitor | HIPAA Breach Watch — Monitor the HHS OCR Breach Portal for new HIPAA data breach disclosures.
- WCAG Accessibility Checker API | ADA & EAA Compliance Audit — Audit websites for WCAG 2 accessibility compliance.
- Trust Center & Subprocessor Monitor API — Monitor vendor trust centers, subprocessor lists, DPA updates, and security posture changes.
Cost
Pay Per Event:
- actor-start: $0.01 (flat fee per run)
- dataset-item: $0.003 per output item
Example: 1,000 items = $0.01 + (1,000 × $0.003) = $3.01
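The pay-per-event math can be expressed as a tiny estimator, with the fees taken from the pricing above:

```python
def estimated_cost(items, start_fee=0.01, per_item=0.003):
    """Pay-per-event cost: flat actor-start fee plus a fee per dataset item."""
    return round(start_fee + items * per_item, 2)
```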
No subscription required — you only pay for what you use.
⭐ Was this helpful?
If this actor saved you time, please leave a ★ rating on Apify Store. It takes 10 seconds, helps other developers discover it, and keeps updates free.
Bug report or feature request? Open an issue on the Issues tab of this actor.