🐳 Docker Hub Scraper: Images & Pull Counts

Pricing

from $20.00 / 1,000 results

Extract Docker Hub image data: pull counts, tags, descriptions, maintainers, version history. A Snyk, Anchore & Sysdig alternative for container intelligence, SBOMs, supply-chain audits and DevOps dashboards. Pay per image.

Rating: 0.0 (0)

Developer: Stephan Corbeil

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 0
  • Last modified: 3 days ago

🐳 Docker Hub Scraper: Images, Tags, Pull Counts & Vulnerability Signals

Pay-per-result Docker Hub scraper that extracts image metadata, tag lists, pull counts, last-push timestamps, vulnerabilities reported by Docker Scout, README text, and dependency-layer info. Built for container security teams, devtool marketers, and OSS-funded analytics as an alternative with no pull-rate limit for the end user, compared with Docker Hub's official API (anonymous 100 pulls/6h; authenticated 200; subscribers 5000+), Docker Scout enterprise tier ($21+/user/mo), Snyk Container ($25-58/user/mo), Aqua Trivy Cloud, and Anchore Enterprise.

Why Docker Hub Scraper Beats the Docker Hub API, Snyk & Docker Scout

| Feature | NexGenData Docker Hub Scraper | Docker Hub official API | Snyk Container | Docker Scout |
| --- | --- | --- | --- | --- |
| Cost | $0.002 / image, pay-per-result | Free + pull-rate-limited | $25-58 / user / month | $21+ / user / month |
| Pull-rate cap | None for end user | 100-5000 pulls / 6h | Plan-dependent | Plan-dependent |
| Auth | Apify token | Docker ID + plan | Account + plan | Docker subscription |
| Bulk image scan | Yes | Per-call REST + pagination | Yes | Yes |
| Vulnerability signals | Yes | Limited | Yes, deep CVE detail | Yes, Scout findings |
| Pull-count history | Yes | Per-image only | Limited | Yes |
| Free trial | Free Apify credits on signup | Free for low volume | 14-day trial | Free tier on Docker Personal |

Container-security teams and devtool competitive analysts pick this actor instead of wrestling with Docker Hub's pull-rate limits, especially the anonymous limit of 100 pulls per 6 hours, which rules out any bulk scan. It is an alternative to Docker Scout's enterprise tier when you need image metadata plus Scout-equivalent vulnerability counts but not the full Scout dashboard.

What You Get Per Image

Each dataset item is a flat JSON record:

  • namespace, repository, full_name, description, short_description
  • is_official, is_verified_publisher, is_automated
  • star_count, pull_count, last_updated
  • categories, architectures: amd64, arm64, ppc64le, s390x, riscv64, etc.
  • tags: array of {name, digest, size_bytes, last_pushed, architectures}
  • latest_tag_size_mb, tag_count, unique_digests
  • base_image, os_versions
  • vulnerability_summary: {critical, high, medium, low}
  • top_vulnerabilities: array of {cve_id, severity, package, fixed_in} (when Docker Scout data is public)
  • dependency_layers_count, total_layers
  • readme_text, dockerfile_url, source_repo_url, license
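
As a rough illustration of how these records might be consumed, the sketch below ranks images worst-first by their vulnerability_summary counts. The field names come from the list above; the sample values, the `SAMPLE_ITEMS` list, and the helper names are hypothetical, and treating a missing vulnerability_summary as all zeros is an assumption about how records without public Scout data look.

```python
from typing import Any

# Illustrative records only: field names follow the list above,
# but every value here is made up for the example.
SAMPLE_ITEMS: list[dict[str, Any]] = [
    {"full_name": "example/app-a", "pull_count": 1200,
     "vulnerability_summary": {"critical": 2, "high": 5, "medium": 1, "low": 0}},
    {"full_name": "example/app-b", "pull_count": 90000,
     "vulnerability_summary": {"critical": 0, "high": 7, "medium": 3, "low": 4}},
    {"full_name": "example/app-c", "pull_count": 50},  # assumed: no Scout data published
]

SEVERITIES = ("critical", "high", "medium", "low")


def severity_key(item: dict[str, Any]) -> tuple[int, ...]:
    """Return (critical, high, medium, low) counts, treating missing data as 0."""
    summary = item.get("vulnerability_summary") or {}
    return tuple(int(summary.get(level) or 0) for level in SEVERITIES)


def rank_by_severity(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Sort worst-first: most criticals, then most highs, and so on."""
    return sorted(items, key=severity_key, reverse=True)


if __name__ == "__main__":
    for item in rank_by_severity(SAMPLE_ITEMS):
        print(item["full_name"], severity_key(item))
```

Comparing the counts as a tuple keeps one critical finding ahead of any number of high findings; a weighted score would be a reasonable alternative if that ordering is too strict for your triage process.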

Use Cases

  • Container-security teams: bulk-scan your organization's full registry catalog and rank by CVE count
  • Devtool marketers: find images using your competitor's base image and pitch a migration
  • Open-source intelligence: track adoption velocity of new base images (Alpine vs Distroless vs Wolfi)
  • Engineering procurement: benchmark target-company image hygiene during diligence
  • Internal platform teams: audit your private registry's pull-count distribution
  • Newsletter / Substack content: automate "top 10 fastest-growing Docker images this month"
  • SBOM workflows: feed the dependency-layer data into your bill-of-materials pipeline

Quick Start

from apify_client import ApifyClient

# Authenticate with your Apify API token
client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the actor and wait for the run to finish
run = client.actor("nexgendata/dockerhub-scraper").call(run_input={
    "queries": ["nginx", "redis", "postgres"],
    "namespaces": ["library", "bitnami"],
    "includeVulnerabilities": True,
    "maxResults": 500,
})

# Read the results from the run's default dataset
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["full_name"], item["pull_count"], item["vulnerability_summary"])

Pricing

Pay-per-event: no Docker Hub subscription required, no monthly minimum.

  • Actor Start: $0.0001
  • Per image: $0.002
  • Per tag enrichment: $0.0005

A 500-image namespace audit with vulnerability data costs about $2-3. The same data via the Docker Hub API plus a Docker Scout subscription is gated behind a $21+/user/month tier.
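
To see how the $2-3 figure could arise from the event prices above, here is a small estimator. It assumes that tag enrichment is billed once per enriched tag and that each run incurs a single Actor Start charge; neither detail is stated explicitly on this page, and `estimate_cost` is a hypothetical helper, not part of the actor.

```python
ACTOR_START_USD = 0.0001          # per run, from the price list above
PER_IMAGE_USD = 0.002             # per image record
PER_TAG_ENRICHMENT_USD = 0.0005   # assumed: charged once per enriched tag


def estimate_cost(images: int, enriched_tags: int = 0, runs: int = 1) -> float:
    """Estimate the run cost in USD under the assumptions stated above."""
    total = (
        runs * ACTOR_START_USD
        + images * PER_IMAGE_USD
        + enriched_tags * PER_TAG_ENRICHMENT_USD
    )
    return round(total, 4)


if __name__ == "__main__":
    # 500 images with no tag enrichment, then with 3,000 enriched tags
    print(estimate_cost(500))
    print(estimate_cost(500, enriched_tags=3000))
```

Under these assumptions, 500 images alone come to about $1.00, so a $2-3 total would correspond to roughly 2,000-4,000 tag enrichments, or about 4-8 enriched tags per image.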

| Use case | Actor |
| --- | --- |
| GitHub repos + stars + contributors | GitHub Scraper |
| GitLab projects + MRs | GitLab Scraper |
| GitHub trending feed | GitHub Trending Scraper |
| npm package download stats | npm Package Stats |
| PyPI package download stats | PyPI Package Stats |
| Dev.to articles + dev audience | Dev.to Scraper |
| Developer Tools MCP Server | Developer Tools MCP Server |
| Tech-stack / Wappalyzer replacement | Wappalyzer Replacement |

FAQ

Q: How do you bypass Docker Hub's pull-rate cap? The cap applies to image pulls, and we never pull image binaries. We extract metadata, tag lists, and pull counts from Docker Hub's public web pages and the metadata API.

Q: Do you scan vulnerabilities yourself? We surface vulnerability counts that Docker Hub exposes publicly via Docker Scout. We do not run our own scanner, since that would require pulling the binary.

Q: Can I do this for private registries (ECR, GCR, ACR)? This actor is Docker Hub only. For private-registry scanning, layer on your cloud provider's native scanner.

Q: How fresh is pull-count data? Live per run. Docker Hub reports pull count as a cumulative total, and we capture that value at scan time.
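
Because each run captures a cumulative pull_count, growth over a period can be approximated by diffing two runs. The sketch below assumes you have kept the dataset items from an earlier and a later run as lists of records; the function names and sample values are hypothetical.

```python
from typing import Any

Record = dict[str, Any]


def pull_growth(earlier: list[Record], later: list[Record]) -> dict[str, int]:
    """Pull-count increase per image, for images present in both snapshots."""
    before = {item["full_name"]: item["pull_count"] for item in earlier}
    growth: dict[str, int] = {}
    for item in later:
        name = item["full_name"]
        if name in before:
            growth[name] = item["pull_count"] - before[name]
    return growth


def fastest_growing(
    earlier: list[Record], later: list[Record], top_n: int = 10
) -> list[tuple[str, int]]:
    """Top `top_n` images by absolute pull-count increase, largest first."""
    ranked = sorted(pull_growth(earlier, later).items(),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


# Made-up snapshots for illustration only.
RUN_1 = [{"full_name": "example/a", "pull_count": 1000},
         {"full_name": "example/b", "pull_count": 500}]
RUN_2 = [{"full_name": "example/a", "pull_count": 1100},
         {"full_name": "example/b", "pull_count": 900},
         {"full_name": "example/c", "pull_count": 40}]

if __name__ == "__main__":
    print(fastest_growing(RUN_1, RUN_2, top_n=2))
```

Images that appear only in the later snapshot are skipped here, since there is no baseline to compare against; whether to count them as new entrants is a design choice for your report.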

Q: Schema stability? Field names are versioned. We track Docker Hub's web-page DOM and ship parser updates within 24 hours of breaking changes.

Q: Multi-arch image support? Yes. The architectures field lists every CPU/OS combo for each tag. Multi-arch manifests are flattened into one row per logical tag.
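
One way to use the per-tag architectures field is to find tags that lack a given platform. The sketch below is hedged on the value format: it assumes each entry is a string such as "arm64" or "linux/arm64", which this page does not specify, and the helper names and sample record are hypothetical.

```python
from typing import Any


def supports_arch(tag: dict[str, Any], target: str) -> bool:
    """True if the tag lists `target`, written as "arm64" or "linux/arm64"."""
    return any(
        arch == target or arch.endswith("/" + target)
        for arch in tag.get("architectures", [])
    )


def tags_missing_arch(record: dict[str, Any], target: str) -> list[str]:
    """Names of tags in one dataset record that do not list `target`."""
    return [t["name"] for t in record.get("tags", []) if not supports_arch(t, target)]


# Made-up record for illustration; only the field names come from this page.
SAMPLE_RECORD = {
    "full_name": "example/app",
    "tags": [
        {"name": "1.0", "architectures": ["amd64"]},
        {"name": "1.1", "architectures": ["amd64", "arm64"]},
        {"name": "latest", "architectures": ["linux/amd64", "linux/arm64"]},
    ],
}

if __name__ == "__main__":
    print(tags_missing_arch(SAMPLE_RECORD, "arm64"))
```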

About NexGenData

NexGenData publishes 260+ buyer-intent actors covering SEC filings, YC alumni, lead generation, competitive intelligence, stock fundamentals across 30+ exchanges, and more. All pay-per-result. Browse the full catalog at https://apify.com/nexgendata?fpr=2ayu9b


How NexGenData Pricing Works

Every NexGenData actor uses pay-per-event pricing: you only pay for results that actually land in your dataset. No monthly minimum, no seat fees, no surprise overage bills.

  • Actor Start: a single-event charge each time you spin the actor up (scaled to memory size)
  • Result / item: charged per item written to the default dataset
  • No charge for retries, internal proxy rotation, or failed sub-requests; those are absorbed by the platform

Apify Platform Bonus

New to Apify? Sign up with the NexGenData referral link: you get free platform credits on signup (enough for several thousand free results) and you help fund the maintenance of this actor fleet.

Integration Surface

Every actor in the NexGenData catalog can be triggered from:

  • Apify console: point-and-click run
  • Apify API: REST + webhooks
  • Apify Python / JS SDKs: programmatic batch
  • Zapier, Make.com, n8n: official integrations
  • MCP: many actors are exposed as MCP tools for Claude / ChatGPT / Cursor agents
  • Schedules: built-in cron for daily / weekly / monthly runs
  • Webhooks: POST results to any HTTPS endpoint on dataset write
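
For the REST route, a run can be started with a plain HTTP POST instead of the client library. The sketch below uses only the Python standard library and targets Apify's run-actor endpoint, where the actor ID is written with "~" in place of "/". The input keys are copied from the Quick Start; `build_run_request` and `start_run` are hypothetical helper names, and the exact response fields should be checked against the Apify API reference.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "nexgendata~dockerhub-scraper"  # REST paths use "~" where the store URL uses "/"


def build_run_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts an actor run."""
    return urllib.request.Request(
        f"{API_BASE}/acts/{ACTOR_ID}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def start_run(token: str, run_input: dict) -> dict:
    """Send the request and return the run object from the response's "data" field."""
    request = build_run_request(token, run_input)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["data"]


if __name__ == "__main__":
    # Only builds the request; call start_run() with a real token to send it.
    req = build_run_request("YOUR_APIFY_TOKEN", {"queries": ["nginx"], "maxResults": 10})
    print(req.get_method(), req.full_url)
```

Splitting request construction from sending keeps the network call out of unit tests. Note that this endpoint returns as soon as the run is created, so results must be fetched from the run's dataset once it finishes.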

Support

NexGenData maintains 260+ Apify actors and ships updates regularly. Bug reports via the Apify console issues tab get a response within 24 hours. Roadmap requests are welcome, and high-demand features ship in the next version.

Home: thenextgennexus.com
Full catalog: apify.com/nexgendata