Bulk Image Downloader
Pricing
from $0.70 / 1,000 results
Download every image from any webpage or direct image URL, at scale. Smart srcset handling picks the highest-resolution variant. Optional SHA-256 dedup, EXIF stripping for privacy, and minimum size/width filters.
Developer
Thirdwatch
Save every image from any webpage at scale — full files, dimensions, format, file size, and SHA-256 hash. Paste page URLs or direct image links.
What you get
Give it a list of URLs and get back every image plus rich metadata: width, height, format, content type, byte size, and a SHA-256 hash for dedup. The actor handles both modes — point it at a webpage and it parses the HTML for <img>, <picture>, srcset, and Open Graph tags; or pass a direct image URL and it downloads it straight. Auto-mode picks the right path per URL by inspecting Content-Type. Highest-resolution variants from srcset are picked automatically. Files are saved to the run's key-value store and each dataset row carries a ready-to-use download URL.
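The srcset selection described above can be sketched roughly as follows. This is an illustration of the idea, not the actor's actual source; `pick_largest_from_srcset` is a hypothetical helper name.

```python
# Sketch: reduce a srcset attribute to its highest-resolution candidate.
# Candidates without a width descriptor (e.g. "2x") are treated as width 0,
# so an explicit width always wins.

def pick_largest_from_srcset(srcset: str) -> str:
    """Return the candidate URL with the largest 'w' width descriptor."""
    best_url, best_width = "", -1
    for candidate in srcset.split(","):
        parts = candidate.strip().split()
        if not parts:
            continue
        url = parts[0]
        width = 0
        if len(parts) > 1 and parts[1].endswith("w"):
            try:
                width = int(parts[1][:-1])
            except ValueError:
                width = 0
        if width > best_width:
            best_url, best_width = url, width
    return best_url

srcset = "cat-320.jpg 320w, cat-640.jpg 640w, cat-1280.jpg 1280w"
print(pick_largest_from_srcset(srcset))  # cat-1280.jpg
```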
Bulk download images from any URL
A single tool for collecting images at scale across hundreds or thousands of pages. No browser, no JavaScript rendering — fast HTTP fetch with up to 10 concurrent connections per run. Supports webpages, direct image URLs, responsive srcset images, and Open Graph / Twitter card images.
Image dataset builder for ML
Built for teams assembling training corpora. Use dedupByHash to skip identical images across pages, minWidth and minSizeBytes to filter out tracking pixels and tiny thumbnails, and maxImagesPerUrl to keep first runs cheap. Every image lands in the run's key-value store with a stable kv_key and a direct download URL on each dataset row, so you can pipe results straight into a downstream training pipeline.
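The same dedup-and-filter logic can be reproduced client-side when post-processing dataset rows, for example to re-filter with stricter thresholds after a run. A minimal sketch, with made-up row data following the output fields listed below:

```python
# Client-side sketch of the dedupByHash / minWidth / minSizeBytes idea,
# applied to already-fetched dataset rows. Illustrative only.

def filter_rows(rows, min_width=0, min_size_bytes=0):
    """Yield rows passing the size filters, skipping repeated sha256 hashes."""
    seen = set()
    for row in rows:
        if row["width"] < min_width or row["size_bytes"] < min_size_bytes:
            continue  # tracking pixel / tiny thumbnail
        if row["sha256"] in seen:
            continue  # exact duplicate already kept
        seen.add(row["sha256"])
        yield row

rows = [
    {"image_url": "a.jpg", "width": 1280, "size_bytes": 40000, "sha256": "aa"},
    {"image_url": "pixel.gif", "width": 1, "size_bytes": 43, "sha256": "bb"},
    {"image_url": "a-copy.jpg", "width": 1280, "size_bytes": 40000, "sha256": "aa"},
]
kept = list(filter_rows(rows, min_width=100, min_size_bytes=1024))
print([r["image_url"] for r in kept])  # ['a.jpg']
```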
Output fields
| Field | Description |
|---|---|
| `source_url` | The input URL the image was discovered on |
| `image_url` | Absolute URL of the image file |
| `kv_store_key` | Key under which the image is stored in the run's key-value store |
| `kv_url` | Fully formed Apify API URL to download the image |
| `filename` | Suggested filename (sequence number + short hash + extension) |
| `content_type` | HTTP Content-Type (e.g., image/jpeg, image/png, image/webp) |
| `size_bytes` | File size in bytes |
| `sha256` | SHA-256 hash of the image bytes (used for dedup) |
| `width` | Image width in pixels |
| `height` | Image height in pixels |
| `format` | Decoded image format (JPEG, PNG, WEBP, GIF, etc.) |
| `from_srcset` | true if extracted from a responsive srcset |
| `downloaded_at` | ISO-8601 timestamp when the image was fetched |
Example output
```json
{
  "source_url": "https://en.wikipedia.org/wiki/Cat",
  "image_url": "https://upload.wikimedia.org/wikipedia/commons/3/3a/Cat03.jpg",
  "kv_store_key": "000007_3a1f9b2c.jpg",
  "kv_url": "https://api.apify.com/v2/key-value-stores/abc123/records/000007_3a1f9b2c.jpg",
  "filename": "000007_3a1f9b2c.jpg",
  "content_type": "image/jpeg",
  "size_bytes": 432109,
  "sha256": "3a1f9b2c4d5e6f7081929394a5b6c7d8e9f0a1b2c3d4e5f60718293a4b5c6d7e",
  "width": 1280,
  "height": 853,
  "format": "JPEG",
  "from_srcset": true,
  "downloaded_at": "2026-05-04T10:11:12.345678+00:00"
}
```
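Because each row carries a sha256 of the image bytes, downloads can be integrity-checked after the fact. A small sketch using Python's standard library (the bytes here are synthetic, not a real image):

```python
import hashlib

def verify_image(data: bytes, expected_sha256: str) -> bool:
    """Check downloaded image bytes against the sha256 recorded on the row."""
    return hashlib.sha256(data).hexdigest() == expected_sha256

data = b"fake image bytes"
row_hash = hashlib.sha256(data).hexdigest()
print(verify_image(data, row_hash))          # True
print(verify_image(b"tampered", row_hash))   # False
```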
Input parameters
| Parameter | Required | Description |
|---|---|---|
| `urls` | Yes | List of webpage URLs to scan for images, or direct image URLs to download. |
| `mode` | No | `auto` (default), `page`, or `direct`. Auto inspects Content-Type; page parses HTML; direct treats every URL as an image. |
| `includeSrcset` | No | Pull images from responsive srcset and <picture> elements. Default true. |
| `minWidth` | No | Skip images smaller than this width when dimensions are known. Default 0 (disabled). |
| `minSizeBytes` | No | Skip files smaller than this many bytes (filters tracking pixels). Default 0 (disabled). |
| `maxImagesPerUrl` | No | Cap on images downloaded per input URL. Default 100, max 1000. |
| `dedupByHash` | No | Skip duplicates whose SHA-256 hash matches one already downloaded. Default true. |
| `stripExif` | No | Re-encode JPEGs without EXIF metadata for privacy. Default false. |
| `proxyConfiguration` | No | Optional proxy. Most public sites do not require one. |
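Putting the parameters together, a typical run input might look like this (values are illustrative, not recommendations):

```json
{
  "urls": [
    "https://en.wikipedia.org/wiki/Cat",
    "https://upload.wikimedia.org/wikipedia/commons/3/3a/Cat03.jpg"
  ],
  "mode": "auto",
  "includeSrcset": true,
  "minWidth": 200,
  "minSizeBytes": 2048,
  "maxImagesPerUrl": 50,
  "dedupByHash": true,
  "stripExif": false
}
```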
Use cases
- ML engineers: build image datasets for training and fine-tuning by harvesting category, search, or gallery pages.
- Site auditors: measure image bloat across a domain — count files, total bytes, and average size per page.
- E-commerce teams: scrape product image catalogs for migration, redesign, or competitive research.
- Archivists: snapshot every image from a portfolio, gallery, or news site for offline backup.
- Privacy-conscious publishers: strip EXIF metadata from user-submitted images before re-publishing.
Limitations
- `stripExif` only re-encodes JPEGs (the most common EXIF carrier). Re-encoding is light-touch but technically lossy.
- Inline `data:` URIs are skipped; only fetchable URLs are downloaded.
- Sites that gate images behind a login or short-lived signed URLs cannot be scraped.
- Maximum image count per page is capped at 1,000 to protect runtime; use `maxImagesPerUrl` to keep first runs cheap.
- The actor uses HTTP fetch only; pages that render images via JavaScript after load (some SPAs) may not expose all images.
Compared to alternatives
- vs. generic Apify image downloaders: this actor returns full per-image metadata (width, height, format, hash, byte size) on the dataset row, plus a ready-to-use `kv_url` so you don't have to construct download URLs yourself. Built-in dedup, srcset handling, and EXIF stripping save a separate post-processing pass.
- vs. running `wget` or a custom script: hosted, retried, deduplicated, and observable in the Apify Console with no infrastructure to maintain.
FAQ
Where are the images stored?
Every image is saved as a record in the run's default key-value store. Each dataset row carries a kv_url you can fetch directly, or you can browse the run's "Storage" tab in the Apify Console.
Can I download images directly without scraping a webpage?
Yes — set mode: "direct", or leave mode: "auto" and pass image URLs. The actor will skip HTML parsing and download each URL as an image.
Will it follow links to other pages?
No. The actor only downloads images from the URLs you provide. To crawl, run a sitemap or link-extraction scraper first and feed the URLs in.
Does it handle responsive images correctly?
Yes. With includeSrcset: true (default), the highest-resolution variant from each srcset or <picture> element is selected.
Does it dedupe across pages?
Yes. With dedupByHash: true (default), an image already downloaded in this run is skipped on later pages.
How do I keep my first run cheap?
Lower maxImagesPerUrl (e.g., 10 or 20) and run on a handful of URLs first to validate output before scaling.
Last verified: 2026-05
More scrapers at thirdwatch.dev.