Dockerhub Scraper

Pricing

Pay per event

Search Docker Hub and extract repository data. Get image names, descriptions, star counts, pull counts, and official status. Search multiple keywords in one run.

Rating: 0.0 (0 reviews)

Developer: Stas Persiianenko (Maintained by Community)

Actor stats: 0 bookmarks · 2 total users · 1 monthly active user · last modified 3 days ago


Docker Hub Scraper

Scrape Docker Hub repositories from hub.docker.com. Search by keyword and get image names, descriptions, star counts, pull counts, and official status.

What does Docker Hub Scraper do?

Docker Hub Scraper uses the Docker Hub search API to find container image repositories. It extracts repository names, descriptions, star counts, pull counts, official image status, and automated build status. Search for specific technologies or browse popular images.

Why scrape Docker Hub?

Docker Hub is the world's largest container image registry with millions of repositories. It's the primary source for understanding container image adoption and popularity.

Key reasons to scrape it:

  • Infrastructure research — Find popular base images and tools for your stack
  • Adoption metrics — Track pull counts to gauge technology adoption
  • Competitive analysis — Monitor competing images in your technology space
  • Security research — Identify widely-used images for vulnerability assessment
  • DevOps intelligence — Track trends in containerized tools and services

Use cases

  • DevOps engineers finding the best base images for their infrastructure
  • Platform teams researching container ecosystem options
  • Security teams auditing widely-deployed container images
  • Technical writers finding popular images for container tutorials
  • Cloud architects evaluating technology stack options
  • Researchers studying container adoption patterns

How to scrape Docker Hub

  1. Go to Docker Hub Scraper on Apify Store
  2. Enter one or more search keywords
  3. Set result limits
  4. Click Start and wait for results
  5. Download data as JSON, CSV, or Excel

Input parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| searchQueries | string[] | (required) | Keywords to search for |
| maxResultsPerSearch | integer | 50 | Maximum repositories per keyword |
| maxPages | integer | 3 | Maximum pages to fetch (25 repositories per page) |

Input example

```json
{
    "searchQueries": ["nginx", "redis"],
    "maxResultsPerSearch": 25,
    "maxPages": 1
}
```

Output

Each repository in the dataset contains:

| Field | Type | Description |
| --- | --- | --- |
| name | string | Repository name |
| description | string | Short description |
| stars | number | Star count |
| pulls | number | Total pull count |
| isOfficial | boolean | Whether this is an official Docker image |
| isAutomated | boolean | Whether automated builds are enabled |
| dockerHubUrl | string | Docker Hub page URL |
| scrapedAt | string | ISO 8601 timestamp of extraction |

Output example

```json
{
    "name": "nginx",
    "description": "Official build of Nginx.",
    "stars": 21196,
    "pulls": 12827491090,
    "isOfficial": true,
    "isAutomated": false,
    "dockerHubUrl": "https://hub.docker.com/_/nginx",
    "scrapedAt": "2026-03-03T03:56:31.123Z"
}
```

Pricing

Docker Hub Scraper uses pay-per-event pricing:

| Event | Price |
| --- | --- |
| Run started | $0.001 |
| Repository extracted | $0.001 per repository |

Cost examples

| Scenario | Repositories | Cost |
| --- | --- | --- |
| Quick search | 25 | $0.026 |
| Category survey | 100 | $0.101 |
| Large analysis | 250 | $0.251 |
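Under this model, the cost of a run is a flat start fee plus a per-repository fee. A minimal sketch that reproduces the cost-example table above (event prices taken from the pricing table):

```python
def estimate_cost(repos: int, start_fee: float = 0.001, per_repo: float = 0.001) -> float:
    """Estimate run cost in USD: one start event plus one event per repository."""
    return round(start_fee + repos * per_repo, 3)

print(estimate_cost(25))   # 0.026 — quick search
print(estimate_cost(100))  # 0.101 — category survey
print(estimate_cost(250))  # 0.251 — large analysis
```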

Platform costs are negligible — typically under $0.001 per run.

Using Docker Hub Scraper with the Apify API

Node.js

```js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('automation-lab/dockerhub-scraper').call({
    searchQueries: ['nginx'],
    maxResultsPerSearch: 25,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Found ${items.length} repositories`);

items.forEach((repo) => {
    const official = repo.isOfficial ? '[Official]' : '';
    console.log(`${official} ${repo.name} (${repo.pulls.toLocaleString()} pulls, ${repo.stars} stars)`);
});
```

Python

```python
from apify_client import ApifyClient

client = ApifyClient('YOUR_API_TOKEN')

run = client.actor('automation-lab/dockerhub-scraper').call(run_input={
    'searchQueries': ['nginx'],
    'maxResultsPerSearch': 25,
})

dataset = client.dataset(run['defaultDatasetId']).list_items().items
print(f'Found {len(dataset)} repositories')

for repo in dataset:
    official = '[Official]' if repo['isOfficial'] else ''
    print(f"{official} {repo['name']} ({repo['pulls']:,} pulls, {repo['stars']} stars)")
```

Integrations

Docker Hub Scraper works with all Apify integrations:

  • Scheduled runs — Track image popularity trends over time
  • Webhooks — Get notified when a scrape completes
  • API — Trigger runs and fetch results programmatically
  • Google Sheets — Export repository data to a spreadsheet
  • Slack — Share popular images with your team

Connect to Zapier, Make, or Google Sheets for automated workflows.

Tips

  • Filter by isOfficial: true to find Docker-maintained base images
  • Sort by pull count to identify the most widely adopted images
  • Compare star counts to gauge community engagement
  • Search for specific technologies (e.g. "postgres", "node") to find relevant images
  • Track pull counts over time with scheduled runs to spot adoption trends
  • Multiple keywords let you compare adoption across technology categories
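The filtering and sorting tips above take a couple of lines of Python on a downloaded dataset. The records below are hypothetical examples following the output schema documented earlier:

```python
# Hypothetical records in the documented output schema
repos = [
    {"name": "nginx", "stars": 21196, "pulls": 12_827_491_090, "isOfficial": True},
    {"name": "bitnami/nginx", "stars": 200, "pulls": 900_000_000, "isOfficial": False},
    {"name": "redis", "stars": 13000, "pulls": 6_000_000_000, "isOfficial": True},
]

# Tip: filter by isOfficial to keep Docker-maintained base images
official = [r for r in repos if r["isOfficial"]]

# Tip: sort by pull count to find the most widely adopted images
by_adoption = sorted(repos, key=lambda r: r["pulls"], reverse=True)

print([r["name"] for r in official])  # ['nginx', 'redis']
print(by_adoption[0]["name"])         # 'nginx'
```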

FAQ

How many repositories can I scrape per keyword? Each search page returns 25 repositories. With pagination, you can fetch hundreds per keyword.
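Since each page holds 25 repositories, the number of pages needed for a target result count is a ceiling division. A quick sketch (the maxPages input parameter caps this):

```python
import math

PAGE_SIZE = 25  # repositories per Docker Hub search page

def pages_needed(max_results: int) -> int:
    """Pages required to collect max_results repositories."""
    return math.ceil(max_results / PAGE_SIZE)

print(pages_needed(25))  # 1
print(pages_needed(50))  # 2
print(pages_needed(60))  # 3
```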

Does it include tag/version information? The search API returns repository-level metadata. For individual tags and versions, you'd need to query the tag-specific endpoints.

Are private repositories included? No — only public repositories appear in Docker Hub search results.

What do pull counts represent? Pull counts reflect the total number of times an image has been pulled (downloaded) from Docker Hub across all time.

How often are pull counts updated? Pull counts are updated in near real-time as images are downloaded.