# Docker Hub Scraper - Images, Stars & Pulls (`benthepythondev/dockerhub-scraper`) Actor

Search Docker Hub for container images by keyword: name, namespace, description, stars, pull count, official/verified flags and URL. Fast and reliable via Docker Hub's public search API. For DevOps research, image discovery and supply-chain analysis.

- **URL**: https://apify.com/benthepythondev/dockerhub-scraper.md
- **Developed by:** [ben](https://apify.com/benthepythondev) (community)
- **Categories:** Developer tools, Business, Other
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $1.50 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
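For budgeting, the listed price can be turned into a rough upper bound on result charges. The sketch below assumes the listed floor of $1.50 per 1,000 results applies to every returned image and that no other events are billed. Both are assumptions, and `estimate_max_cost_usd` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal

# Assumption: the listed "from" price applies to every returned result.
PRICE_PER_1000_RESULTS_USD = Decimal("1.50")


def estimate_max_cost_usd(num_terms: int, max_per_term: int = 30) -> Decimal:
    """Upper bound on result charges if every term returns max_per_term images."""
    if num_terms < 0 or max_per_term < 1:
        raise ValueError("num_terms must be >= 0 and max_per_term must be >= 1")
    max_results = num_terms * max_per_term
    return PRICE_PER_1000_RESULTS_USD * max_results / 1000


# Example: two search terms capped at 50 images each (at most 100 results).
print(estimate_max_cost_usd(2, 50))
```

The real charge may be lower when a term returns fewer images than its cap, and it may differ if the price on your plan is not the listed floor.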

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## 🐳 Docker Hub Scraper

Search **Docker Hub** for container images by keyword and get clean, structured data — name, namespace, description, stars, pull count, official/verified flags and URL. Powered by Docker Hub's public search API, so it's fast and reliable: no browser, no login, no API key.

Built for DevOps research, image discovery, security/supply-chain analysis and building image catalogs. Export to JSON/CSV/Excel, run on a schedule, call via API, or connect to Make, Zapier or n8n.

### 🔎 What is the Docker Hub Scraper?

Give it keywords (e.g. "nginx", "postgres") and it returns matching images as structured rows, sorted by relevance — with stars and pull counts so you can gauge popularity and trust.

#### What data does it extract?

- **Name** and **namespace** (owner)
- **Description**
- **Stars** and **pull count**
- **Official** and **automated** flags
- **Image URL** on Docker Hub

### ⬇️ Input

| Field | Type | Description |
|-------|------|-------------|
| `searchTerms` | array | Keywords to search, e.g. `nginx`. |
| `maxPerTerm` | integer | Max images per term, from `1` to `250`. Default `30`. |

#### Example input

```json
{
  "searchTerms": ["postgres", "redis"],
  "maxPerTerm": 50
}
````

### ⬆️ Output

One record per image:

```json
{
  "name": "library/nginx",
  "namespace": "library",
  "image": "nginx",
  "description": "Official build of Nginx.",
  "stars": 20150,
  "pulls": 10000000000,
  "is_official": true,
  "is_automated": false,
  "url": "https://hub.docker.com/_/nginx",
  "query": "nginx"
}
```
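If you consume these records in Python, a small typed wrapper can make downstream code easier to read. The following is a minimal sketch built only from the fields in the sample record above. `ImageRecord` and `from_item` are hypothetical names, and the fallback values for missing fields are an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class ImageRecord:
    """One Docker Hub image, mirroring the fields of the sample output record."""

    name: str
    namespace: str
    image: str
    description: str
    stars: int
    pulls: int
    is_official: bool
    is_automated: bool
    url: str
    query: str

    @classmethod
    def from_item(cls, item: Mapping[str, Any]) -> "ImageRecord":
        """Build a record from one dataset item; missing fields fall back to empty values."""
        return cls(
            name=str(item.get("name") or ""),
            namespace=str(item.get("namespace") or ""),
            image=str(item.get("image") or ""),
            description=str(item.get("description") or ""),
            stars=int(item.get("stars") or 0),
            pulls=int(item.get("pulls") or 0),
            is_official=bool(item.get("is_official")),
            is_automated=bool(item.get("is_automated")),
            url=str(item.get("url") or ""),
            query=str(item.get("query") or ""),
        )
```

Each dictionary yielded by the dataset iterator in the Python example can be passed to `ImageRecord.from_item`.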

### 💡 Use cases

- 🔧 **DevOps research** — find the most popular and trusted images.
- 🔎 **Image discovery** — surface alternatives for a tool.
- 🛡️ **Supply-chain analysis** — flag unofficial or low-trust images.
- 🤖 **LLM / app pipelines** — feed structured image metadata into your tools.
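As an illustration of the supply-chain use case, the sketch below labels a record with simple trust flags based on `is_official`, `pulls` and `stars`. The function name `trust_flags` and both thresholds are hypothetical; pick cut-offs that match your own risk policy.

```python
from typing import Any, List, Mapping


def trust_flags(
    item: Mapping[str, Any],
    min_pulls: int = 100_000,  # assumed threshold, not a Docker Hub or Apify rule
    min_stars: int = 10,       # assumed threshold, not a Docker Hub or Apify rule
) -> List[str]:
    """Return the reasons an image might deserve a closer look (empty list if none)."""
    flags: List[str] = []
    if not item.get("is_official", False):
        flags.append("unofficial")
    if int(item.get("pulls") or 0) < min_pulls:
        flags.append("low_pulls")
    if int(item.get("stars") or 0) < min_stars:
        flags.append("low_stars")
    return flags
```

An empty list means none of the checks fired. It does not mean the image is safe, only that it is official and above the chosen popularity thresholds.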

### ❓ FAQ

**Do I need an API key or login?** No — it uses Docker Hub's public search API.

**Can I search multiple terms?** Yes — pass several in `searchTerms`.

**Does it include pull counts?** Yes — plus stars and the official flag.

**How can I tell official images?** The `is_official` flag marks them.

**How many can I get?** Set `maxPerTerm` (1 to 250 per term, default 30); it paginates automatically.

**How does pricing work?** Pay per image returned. No subscription.

**Is it legal?** It uses Docker Hub's public search API. Use responsibly and within Docker's terms.

### ⚙️ How it works

The scraper calls Docker Hub's search API directly and returns clean rows, with no browser and no API key. It paginates through results, de-duplicating as it goes, and normalizes each image into consistent fields. Runs are fast and dependable, which is why the Actor keeps passing its daily health check. The same input shape works for a quick top-10 or a deep multi-term sweep; only `maxPerTerm` changes.
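The paginate, de-duplicate and cap loop described above can be sketched as follows. This is not the Actor's source code: `collect_images` and the injected `fetch_page` callable are hypothetical, and the sketch assumes each page is a list of already-normalized records with a `name` field, where an empty page means there are no more results.

```python
from typing import Callable, Dict, Iterable, List, Sequence, Set

Record = Dict[str, object]
# Hypothetical fetcher: (search term, 1-based page number) -> records on that page.
FetchPage = Callable[[str, int], Sequence[Record]]


def collect_images(
    terms: Iterable[str], max_per_term: int, fetch_page: FetchPage
) -> List[Record]:
    """Page through results per term, skip names already seen, stop at the cap."""
    seen: Set[object] = set()  # shared across terms: de-duplication is per run
    results: List[Record] = []
    for term in terms:
        kept = 0
        page = 1
        while kept < max_per_term:
            batch = fetch_page(term, page)
            if not batch:  # empty page: this term is exhausted
                break
            for record in batch:
                name = record["name"]
                if name in seen:
                    continue
                seen.add(name)
                results.append({**record, "query": term})
                kept += 1
                if kept >= max_per_term:
                    break
            page += 1
    return results
```

Injecting `fetch_page` keeps the loop testable without network access; a real fetcher would wrap the HTTP call to the search API.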

### 👥 Who uses Docker Hub data?

Image data is valuable to DevOps engineers, platform teams, security analysts and founders. An engineer compares images before adopting one; a security team audits which images are official and trusted; an analyst tracks the popularity of tools by pulls; a product feeds structured image metadata into a dashboard. Because every record is plain JSON with consistent fields, it drops straight into a spreadsheet, database, BI tool or LLM pipeline with no custom parsing.

### 📤 Export, schedule & integrate

Every run is saved to a dataset you can export to **JSON, CSV, Excel, XML or RSS**, or pull through the **Apify API**. Wire it into **Make, Zapier, n8n, Google Sheets, Slack** or your **own database**, run it on a **schedule** (hourly, daily or weekly) to keep your data fresh, and call it from AI agents through the **Apify MCP server**.
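For pulling an export through the Apify API, the sketch below builds the URL of the dataset items endpoint with a `format` query parameter. It assumes the endpoint shape `GET /v2/datasets/{datasetId}/items`; `dataset_export_url` and the `EXPORT_FORMATS` mapping are hypothetical helpers, and authentication (for example an `Authorization: Bearer` header) is left to the caller.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://api.apify.com/v2"

# Format names used in this README -> assumed `format` query values.
EXPORT_FORMATS = {
    "JSON": "json",
    "CSV": "csv",
    "Excel": "xlsx",
    "XML": "xml",
    "RSS": "rss",
}


def dataset_export_url(dataset_id: str, export_format: str = "JSON") -> str:
    """Return the dataset items URL for the requested export format."""
    if export_format not in EXPORT_FORMATS:
        raise ValueError(f"Unsupported export format: {export_format!r}")
    query = urlencode({"format": EXPORT_FORMATS[export_format]})
    return f"{API_BASE}/datasets/{quote(dataset_id, safe='')}/items?{query}"
```

The `dataset_id` is the `defaultDatasetId` of a finished run, as used in the API examples in this document.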

### 💡 Tips for best results

- Search a tool name to compare official vs community images.
- Sort by stars/pulls in your spreadsheet to rank trust and popularity.
- Schedule recurring runs and diff the output to track image growth.
- Combine several related terms in one run for a category overview.
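The tip about diffing scheduled runs can be sketched as a comparison of pull counts between two exports. `pull_growth` is a hypothetical helper; it assumes both inputs are lists of records with the `name` and `pulls` fields shown in the output section, and it ignores images that appear in only one of the two runs.

```python
from typing import Any, Dict, Iterable, Mapping


def pull_growth(
    previous: Iterable[Mapping[str, Any]], current: Iterable[Mapping[str, Any]]
) -> Dict[str, int]:
    """Pull-count change per image name, largest increase first."""
    previous_pulls = {item["name"]: int(item["pulls"]) for item in previous}
    growth: Dict[str, int] = {}
    for item in current:
        name = item["name"]
        if name in previous_pulls:
            growth[name] = int(item["pulls"]) - previous_pulls[name]
    # Dicts preserve insertion order, so the result stays sorted by growth.
    return dict(sorted(growth.items(), key=lambda pair: pair[1], reverse=True))
```

Running this against yesterday's and today's dataset gives a simple ranking of which images gained the most pulls between the two runs.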

### ❓ More FAQ

**How fresh is the data?** It is fetched live on each run — schedule runs to keep it current.

**Can I run it automatically?** Yes — use Apify Schedules (cron).

**Are duplicates removed?** Yes — images are de-duplicated within each run by name.

**Which export formats?** JSON, CSV, Excel, XML and RSS, plus the Apify API.

**Can AI agents use it?** Yes — via the Apify API and MCP server.

### 🔗 You might also like

- [GitHub Repository Scraper](https://apify.com/benthepythondev/github-repository-scraper) — repos, stars & topics.
- [PyPI Package Scraper](https://apify.com/benthepythondev/pypi-package-scraper) — Python package data.
- [Hugging Face Models Scraper](https://apify.com/benthepythondev/huggingface-models-scraper) — AI/ML models.

***

**Keywords:** docker hub scraper, dockerhub api, docker images, container images, devops research, image discovery, docker stars, docker pulls, official images, supply chain, container registry, docker data, image catalog, devops tools

# Actor input Schema

## `searchTerms` (type: `array`):

Keywords to search images, e.g. 'nginx', 'postgres'.

## `maxPerTerm` (type: `integer`):

Max images per search term. Default `30`; allowed range `1` to `250`.

## Actor input object example

```json
{
  "searchTerms": [
    "nginx"
  ],
  "maxPerTerm": 30
}
```
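Before starting a run, it can help to validate the input on the client side. The sketch below checks an input object against the constraints in this document (`maxPerTerm` is an integer from 1 to 250, defaulting to 30). `normalize_input` is a hypothetical helper, and requiring at least one non-empty search term is an assumption, since the schema does not mark `searchTerms` as required.

```python
from typing import Any, Dict, Mapping

DEFAULT_MAX_PER_TERM = 30
MIN_MAX_PER_TERM = 1
MAX_MAX_PER_TERM = 250


def normalize_input(raw: Mapping[str, Any]) -> Dict[str, Any]:
    """Validate an Actor input object and fill in the default for maxPerTerm."""
    terms = raw.get("searchTerms")
    if not isinstance(terms, list) or not all(isinstance(t, str) for t in terms):
        raise ValueError("searchTerms must be a list of strings")
    cleaned = [t.strip() for t in terms if t.strip()]
    if not cleaned:  # assumption: an empty search is treated as an error
        raise ValueError("searchTerms must contain at least one non-empty keyword")

    max_per_term = raw.get("maxPerTerm", DEFAULT_MAX_PER_TERM)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_per_term, bool) or not isinstance(max_per_term, int):
        raise ValueError("maxPerTerm must be an integer")
    if not MIN_MAX_PER_TERM <= max_per_term <= MAX_MAX_PER_TERM:
        raise ValueError("maxPerTerm must be between 1 and 250")

    return {"searchTerms": cleaned, "maxPerTerm": max_per_term}
```

The returned dictionary has the same shape as the input object example above and can be passed as the run input.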

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "searchTerms": [
        "nginx"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("benthepythondev/dockerhub-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "searchTerms": ["nginx"] }

# Run the Actor and wait for it to finish
run = client.actor("benthepythondev/dockerhub-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "searchTerms": [
    "nginx"
  ]
}' |
apify call benthepythondev/dockerhub-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=benthepythondev/dockerhub-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Docker Hub Scraper - Images, Stars & Pulls",
        "description": "Search Docker Hub for container images by keyword: name, namespace, description, stars, pull count, official/verified flags and URL. Fast and reliable via Docker Hub's public search API. For DevOps research, image discovery and supply-chain analysis.",
        "version": "1.0",
        "x-build-id": "qyXDNYUL9RK3waf70"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/benthepythondev~dockerhub-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-benthepythondev-dockerhub-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/benthepythondev~dockerhub-scraper/runs": {
            "post": {
                "operationId": "runs-sync-benthepythondev-dockerhub-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/benthepythondev~dockerhub-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-benthepythondev-dockerhub-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "searchTerms": {
                        "title": "Search terms",
                        "type": "array",
                        "description": "Keywords to search images, e.g. 'nginx', 'postgres'.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxPerTerm": {
                        "title": "Max per term",
                        "minimum": 1,
                        "maximum": 250,
                        "type": "integer",
                        "description": "Max images per search term.",
                        "default": 30
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
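For projects that call the REST API directly, the `run-sync-get-dataset-items` path in the specification above can be exercised with the standard library alone. The sketch below only builds the request, following the spec's `token` query parameter and JSON request body. `build_run_sync_request` is a hypothetical helper, and nothing is sent until you pass the result to `urllib.request.urlopen`.

```python
import json
from typing import List
from urllib.parse import urlencode
from urllib.request import Request

RUN_SYNC_URL = (
    "https://api.apify.com/v2/acts/"
    "benthepythondev~dockerhub-scraper/run-sync-get-dataset-items"
)


def build_run_sync_request(
    token: str, search_terms: List[str], max_per_term: int = 30
) -> Request:
    """Build (but do not send) a POST request that runs the Actor synchronously."""
    url = f"{RUN_SYNC_URL}?{urlencode({'token': token})}"
    body = json.dumps(
        {"searchTerms": search_terms, "maxPerTerm": max_per_term}
    ).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To execute it, call `urllib.request.urlopen(build_run_sync_request(token, ["nginx"]))` and parse the response body with `json.load`. Because the token travels in the URL, avoid logging the full request URL.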
