Docker Hub Scraper
Pricing
from $3.00 / 1,000 results
Scrape Docker Hub: container image search, pull counts, star counts, publisher and verified-publisher data, tags, architectures, OS support, categories, and user/org profiles. Pure HTTP, no auth required.
Developer: Crawler Bros (maintained by Community)
Scrape Docker Hub — the world's largest container image registry. Search images, get pull counts, star counts, publisher info, architectures, OS support, categories, tags, and user/org profiles. Pure HTTP via the public hub.docker.com REST API. No auth, no cookies, no proxy required.
What this actor does
- Six modes: search, byImages, userProfile, repoTags, topByCategory, namespaceRepos
- Real-time pull and star counts for every public image
- Architecture / OS support — amd64, arm64, arm, 386, ppc64le, s390x, riscv64, mips64le; Linux & Windows
- Categories — 16 official Docker Hub categories (Databases, Web Servers, Languages, Operating Systems, etc.)
- Publisher metadata — verified publishers, Docker Official Images, sponsored open-source
- Tags — per-tag size, digest, push date, platform manifest list
- Profiles — user and organization profile data including verified-publisher badges
- Rich filtering — pull range, star range, official-only, verified-only, architecture, OS, keyword
- Empty fields are omitted from output records
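The "empty fields are omitted" behaviour can be pictured with a small sketch. What the actor treats as "empty" is not documented here, so the rule below (drop `None`, empty strings, empty lists, and empty objects, but keep `0` and `false`) is an assumption, and `drop_empty` is a hypothetical helper name.

```python
from typing import Any


def drop_empty(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of record without None, "", [] or {} values.

    Assumption: 0 and False are meaningful (e.g. starCount, isOfficial)
    and are therefore kept.
    """
    cleaned: dict[str, Any] = {}
    for key, value in record.items():
        if isinstance(value, dict):
            # Clean nested objects first, so an object that ends up empty is dropped too.
            value = drop_empty(value)
        if value is None or value == "" or value == [] or value == {}:
            continue
        cleaned[key] = value
    return cleaned
```

Downstream code should therefore read optional fields with `record.get("field")` rather than `record["field"]`.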
Output per image
- namespace, name, fullName: e.g. library/postgres
- shortDescription, description
- type: image / plugin
- isOfficial, isVerifiedPublisher, isAutomated, isArchived, isPrivate
- pullCount (numeric), pullCountDisplay (e.g. 1B+)
- starCount
- lastUpdated, lastPulled, lastModified, dateRegistered
- status: active / archived
- publisher: { name, id, isOfficial?, isVerified? }
- categories[]: e.g. Databases & storage, Web servers
- architectures[]: e.g. amd64, arm64, s390x
- operatingSystems[]: linux, windows
- mediaTypes[], contentTypes[]
- logoUrl: CDN-hosted publisher logo
- repoUrl: canonical hub.docker.com URL
- source: store (official), verified_publisher, etc.
- storageSize (bytes, where available)
- recordType: "image", scrapedAt
Output per tag (mode=repoTags)
- namespace, name, fullName, tagName
- fullSize, lastPushed, lastPulled
- lastUpdaterUsername, digest
- architecture, os: primary platform
- platforms[]: { architecture, os, variant?, size, digest, status } per manifest
- mediaType, contentType
- repoUrl, baseRepoUrl
- recordType: "tag", scrapedAt
Output per user/org (mode=userProfile)
- username, fullName, type (User / Organization)
- company, location, profileUrl, dateJoined
- avatarUrl (Gravatar)
- badge: verified_publisher / official / open_source
- isVerifiedPublisher, isOfficial, isActive
- dockerHubUrl
- recordType: "userProfile", scrapedAt
Input
| Field | Type | Default | Description |
|---|---|---|---|
| mode | enum | search | search / byImages / userProfile / repoTags / topByCategory / namespaceRepos |
| searchQuery | string | postgres | Free-text query (mode=search) |
| imageNames | array | – | namespace/repo strings (mode=byImages) |
| namespace | string | – | User / org slug (mode=userProfile, repoTags, namespaceRepos) |
| repository | string | – | Repository name (mode=repoTags) |
| category | enum | – | Category slug (mode=topByCategory or as a filter) |
| architectures | array enum | – | Filter to images supporting any selected arch |
| operatingSystems | array enum | – | Filter to images supporting any selected OS |
| isOfficial | bool | false | Only Docker Official Images |
| isVerifiedPublisher | bool | false | Only Verified Publishers |
| minStarCount | int | – | Drop images with fewer stars |
| maxStarCount | int | – | Drop images with more stars |
| minPullCount | int | – | Drop images with fewer pulls |
| maxPullCount | int | – | Drop images with more pulls |
| sortBy | enum | relevance | pull_count / star_count / updated_at / name |
| containsKeyword | string | – | Substring filter on description/name (case-insensitive) |
| includeUserRepos | bool | true | Also enumerate a user/org's repos in userProfile mode |
| maxItems | int | 50 | Hard cap (1–1000) |
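Before starting a run it can help to sanity-check the input against the table above. The sketch below is one way to do that, not the actor's own validation: `check_input` is a hypothetical helper, and treating the mode-specific fields as strictly required is an assumption read off the Description column.

```python
MODES = {"search", "byImages", "userProfile", "repoTags", "topByCategory", "namespaceRepos"}

# Assumption: each mode needs the fields its table rows mention.
# search is left empty because searchQuery has a default ("postgres").
REQUIRED = {
    "search": [],
    "byImages": ["imageNames"],
    "userProfile": ["namespace"],
    "repoTags": ["namespace", "repository"],
    "topByCategory": ["category"],
    "namespaceRepos": ["namespace"],
}


def check_input(run_input: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the input looks fine."""
    mode = run_input.get("mode", "search")
    if mode not in MODES:
        return [f"unknown mode: {mode!r}"]
    problems = [f"mode={mode} needs {field}" for field in REQUIRED[mode] if not run_input.get(field)]
    max_items = run_input.get("maxItems", 50)
    if not isinstance(max_items, int) or not 1 <= max_items <= 1000:
        problems.append("maxItems must be an integer from 1 to 1000")
    for low, high in (("minStarCount", "maxStarCount"), ("minPullCount", "maxPullCount")):
        if low in run_input and high in run_input and run_input[low] > run_input[high]:
            problems.append(f"{low} is greater than {high}")
    return problems
```

For example, `check_input({"mode": "repoTags", "namespace": "library"})` reports that `repository` is missing.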
Example: search PostgreSQL images, official only
{"mode": "search", "searchQuery": "postgres", "isOfficial": true, "sortBy": "pull_count", "maxItems": 20}
Example: lookup a list of specific images
{"mode": "byImages", "imageNames": ["library/nginx", "bitnami/redis", "library/postgres", "https://hub.docker.com/r/jenkins/jenkins"]}
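The byImages example mixes plain namespace/repo strings with a full hub.docker.com URL. A minimal sketch of how such values could be normalized to one form follows. It is not the actor's actual parser: `normalize_image_name` is a hypothetical helper, and mapping a bare name such as `nginx` to `library/nginx` is an assumption.

```python
from urllib.parse import urlparse


def normalize_image_name(value: str) -> str:
    """Turn a namespace/repo string or a hub.docker.com URL into namespace/repo."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        parts = [p for p in urlparse(value).path.split("/") if p]
        # Regular repositories live under /r/<namespace>/<repo>.
        if len(parts) >= 3 and parts[0] == "r":
            return f"{parts[1]}/{parts[2]}"
        # Official images use the short /_/<repo> form and belong to library/.
        if len(parts) >= 2 and parts[0] == "_":
            return f"library/{parts[1]}"
        raise ValueError(f"unrecognized Docker Hub URL: {value}")
    if "/" not in value:
        # Assumption: a bare name refers to a Docker Official Image.
        return f"library/{value}"
    return value
```

Normalizing up front also makes it easy to de-duplicate the list before spending results on repeated lookups.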
Example: get all tags for a repository
{"mode": "repoTags", "namespace": "library", "repository": "postgres", "maxItems": 100}
Example: top databases by pull count
{"mode": "topByCategory", "category": "databases-and-storage", "sortBy": "pull_count", "maxItems": 30}
Example: user / organization profile + their repos
{"mode": "userProfile", "namespace": "bitnami", "includeUserRepos": true, "maxItems": 50}
Example: ARM64-only images for IoT deployments
{"mode": "search", "searchQuery": "alpine", "architectures": ["arm64"], "minStarCount": 100}
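The filter fields used in these examples can also be applied client-side to records you have already exported. The sketch below mirrors the documented semantics (range bounds, "any selected" matching for architectures and OS, case-insensitive keyword match); `matches_filters` is a hypothetical helper, and the exact set of fields the actor searches for `containsKeyword` is an assumption.

```python
def _in_range(value, low, high) -> bool:
    """True when value lies within the optional inclusive bounds."""
    return (low is None or value >= low) and (high is None or value <= high)


def matches_filters(image: dict, filters: dict) -> bool:
    """Check one image record against the actor-style filter fields."""
    if not _in_range(image.get("starCount", 0), filters.get("minStarCount"), filters.get("maxStarCount")):
        return False
    if not _in_range(image.get("pullCount", 0), filters.get("minPullCount"), filters.get("maxPullCount")):
        return False
    # Architecture and OS filters pass when the image supports ANY selected value.
    for filter_key in ("architectures", "operatingSystems"):
        wanted = filters.get(filter_key)
        if wanted and not set(wanted) & set(image.get(filter_key, [])):
            return False
    # Boolean filters only restrict results when set to true.
    for flag in ("isOfficial", "isVerifiedPublisher"):
        if filters.get(flag) and not image.get(flag):
            return False
    keyword = filters.get("containsKeyword")
    if keyword:
        # Assumption: the keyword is matched against name and both description fields.
        haystack = " ".join(str(image.get(k, "")) for k in ("name", "shortDescription", "description"))
        if keyword.lower() not in haystack.lower():
            return False
    return True
```

Usage: `[img for img in records if matches_filters(img, {"architectures": ["arm64"], "minStarCount": 100})]`.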
Use cases
- DevOps intelligence — discover production-ready images for your stack
- Security scanning — bulk-export verified publisher images for compliance review
- Container marketplaces — feed Docker Hub categories and metadata into your catalog
- Migration planning — find ARM64 / RISC-V replacements for amd64-only images
- Open-source analytics — track pull counts and stars to gauge ecosystem trends
- Competitive analysis — benchmark image popularity across alternative publishers
- Compliance — verify image provenance, last-updated dates, and publisher status
- Build pipelines — enumerate tag manifests for reproducible base-image pinning
FAQ
Do I need a Docker Hub account to use this actor? No. Docker Hub's public REST API does not require authentication for read access to public images, users, and orgs.
What's the difference between pullCount and pullCountDisplay? Docker Hub returns pull counts as a display string (1B+, 500M+) for popular images. The actor parses this into a numeric pullCount for sorting/filtering while keeping the display value in pullCountDisplay.
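To illustrate the conversion, here is a minimal sketch that turns a display string into a number. It is not the actor's own parser: `parse_pull_display` is a hypothetical helper, and the supported suffixes (K, M, B) are an assumption based on the examples above.

```python
import re

# Assumption: display strings use K / M / B suffixes with an optional trailing "+".
_MULTIPLIERS = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}
_PATTERN = re.compile(r"\s*(\d+(?:\.\d+)?)\s*([KMB]?)\+?\s*", re.IGNORECASE)


def parse_pull_display(display: str) -> int:
    """Convert a string such as '1B+' or '500M+' into an integer lower bound."""
    match = _PATTERN.fullmatch(display)
    if match is None:
        raise ValueError(f"unrecognized pull count: {display!r}")
    number, suffix = match.groups()
    return int(float(number) * _MULTIPLIERS[suffix.upper()])
```

Because a value like `1B+` means "at least one billion", the parsed number is a lower bound, which is fine for sorting and threshold filters but not for exact arithmetic.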
What are Docker Official Images? Images in the library/ namespace, curated and maintained by Docker Inc. They appear as library/postgres, library/nginx, etc. and are accessible via the https://hub.docker.com/_/postgres short URL.
What's a Verified Publisher? A company or open-source project that has been verified by Docker Inc. as the official source of an image (e.g. bitnami, jenkins, hashicorp). Verified images carry a verified_publisher badge.
Why do some architectures[] arrays lack entries such as riscv64? Only architectures actually built for that image are listed. Most images target amd64 + arm64; only a few (e.g. some library/* official images) are built for the full set.
Can I get all tags for a repository? Yes, use mode: "repoTags". The actor paginates through all tags up to maxItems.
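The pagination can be sketched as a loop that follows each page's link to the next one until the cap is reached. This assumes the tags endpoint returns pages shaped like `{"results": [...], "next": "<url or null>"}`; `iter_tags` and the injected `fetch_json` callable are hypothetical names, and the actor's real implementation may differ.

```python
from typing import Callable, Iterator, Optional


def iter_tags(first_url: str, fetch_json: Callable[[str], dict], max_items: int) -> Iterator[dict]:
    """Yield tag objects page by page, stopping after max_items."""
    url: Optional[str] = first_url
    emitted = 0
    while url and emitted < max_items:
        page = fetch_json(url)  # one HTTP GET per page, supplied by the caller
        for tag in page.get("results", []):
            if emitted >= max_items:
                return
            yield tag
            emitted += 1
        url = page.get("next")  # None on the last page ends the loop
```

Passing the fetch function in keeps the loop independent of any particular HTTP client and makes it easy to exercise with canned pages.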
What does source: "store" mean? It indicates that the image comes from the Docker Official Images "store" (i.e. the library/* namespace). Other sources include verified_publisher, community, and open_source.
How fresh is the data? Real-time — every request hits Docker Hub directly. Pull counts update continuously; tag push dates are accurate to the second.
Why doesn't repoTags mode return a recordType: "image" record for the parent repo? By design — repoTags emits one record per tag for fine-grained processing. Combine with byImages if you need parent-repo metadata.
Is there a rate limit? Docker Hub's public read endpoints have generous limits. The actor inserts small polite delays between requests and retries with exponential backoff on 429 / 5xx.
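A retry loop of the kind described might look like the sketch below. The attempt count, base delay, and doubling schedule are assumptions chosen for illustration, since the actor's actual settings are not documented; `call_with_backoff` and the injected `send` callable are hypothetical names.

```python
import time
from typing import Any, Callable, Tuple

# Status codes worth retrying: rate limiting (429) and server errors (5xx).
RETRYABLE = {429} | set(range(500, 600))


def call_with_backoff(
    send: Callable[[], Tuple[int, Any]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    return status, body  # last retryable response, left for the caller to handle
```

Adding random jitter to each delay is a common refinement when many clients retry at once.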
Data Source
This actor uses Docker Hub's public REST API at https://hub.docker.com:
- /api/search/v3/catalog/search: full-text image search with filters
- /v2/repositories/{namespace}/{repo}/: repository details
- /v2/repositories/{namespace}/{repo}/tags/: tag manifests
- /v2/repositories/{namespace}/: all repos in a namespace
- /v2/users/{username}/ and /v2/orgs/{org}/: user and organization profiles
- /v2/categories: official category taxonomy
- /v2/search/repositories/: legacy search endpoint (fallback)
No authentication is required for any of these endpoints when reading public data.
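If you want to spot-check a record against the source, the endpoint paths listed in this section can be assembled with a few helpers. This is a sketch with hypothetical function names; the `page_size` query parameter on the tags endpoint is an assumption and is not stated in this document.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://hub.docker.com"


def repo_url(namespace: str, repo: str) -> str:
    """Repository details endpoint for namespace/repo."""
    return f"{BASE_URL}/v2/repositories/{quote(namespace, safe='')}/{quote(repo, safe='')}/"


def tags_url(namespace: str, repo: str, page_size: int = 100) -> str:
    """Tag listing endpoint; page_size is an assumed query parameter."""
    return repo_url(namespace, repo) + "tags/?" + urlencode({"page_size": page_size})


def profile_url(name: str, is_org: bool = False) -> str:
    """User or organization profile endpoint."""
    kind = "orgs" if is_org else "users"
    return f"{BASE_URL}/v2/{kind}/{quote(name, safe='')}/"
```

The resulting URLs can be opened in a browser or fetched with any HTTP client to compare against the actor's output.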