Pricing

from $0.67 / 1,000 images

Bulk Image Downloader: 22-Field Metadata, SHA-256 & ZIP

Download every image from any webpage or direct image URL. Smart srcset picks the highest-resolution variant. 22 metadata fields per image: width, height, format, SHA-256, dedup flag, EXIF, provenance. ZIP and S3 outputs, webhooks, MCP-ready. $2.00 per 1k.

Developer

GetAScraper

Last modified: 4 days ago

🖼️ Bulk Image Downloader: 22-Field Metadata, SHA-256 & ZIP

22 metadata fields per image, SHA-256 content hash, optional EXIF strip and WebP-to-PNG, ZIP and S3 outputs. $2.00 per 1,000 results. 70% cheaper than the top Store alternative. Download every image from any webpage or direct image URL in one call. 50 images per run are free.

This Actor is a generic image downloader. It works on any public URL. Pass it a list of webpages and it discovers every image via HTML <img>, <picture>, srcset, og:image, and twitter:image. Pass it a list of direct image URLs and it downloads them directly. It picks the highest-resolution variant from any srcset automatically, hashes every image body with SHA-256 for dedup, and strips EXIF or converts WebP to PNG on demand. Results export as a structured dataset, a ZIP archive, or an S3 upload. A run processes up to 10,000 URLs with up to 10 concurrent downloads.
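The srcset selection described above can be sketched as follows. This is a minimal illustration of the general technique, not the Actor's actual code: the function name `pick_largest_srcset` is hypothetical, and the parsing is simplified compared with the full HTML srcset algorithm (it assumes a srcset uses either `w` or `x` descriptors, not a mix).

```python
import re
from typing import Optional

# Split candidates on a comma followed by whitespace, or on a comma that
# directly follows a descriptor such as "400w" or "2x". This is a
# simplification of the HTML spec's parser.
_CANDIDATE_SPLIT = re.compile(r",\s+|(?<=\d[wx]),")
_DESCRIPTOR = re.compile(r"(\d+(?:\.\d+)?)([wx])")


def pick_largest_srcset(srcset: str) -> Optional[str]:
    """Return the URL of the candidate with the largest descriptor value."""
    best_url, best_score = None, -1.0
    for candidate in _CANDIDATE_SPLIT.split(srcset.strip()):
        parts = candidate.split()
        if not parts:
            continue
        url = parts[0]
        # A candidate without a descriptor counts as 1x.
        descriptor = parts[1] if len(parts) > 1 else "1x"
        match = _DESCRIPTOR.fullmatch(descriptor)
        if match is None:
            continue  # skip descriptors this sketch does not understand
        score = float(match.group(1))
        if score > best_score:
            best_url, best_score = url, score
    return best_url


print(pick_largest_srcset("small.jpg 400w, medium.jpg 800w, large.jpg 1600w"))
print(pick_largest_srcset("photo.jpg 1x,photo@2x.jpg 2x"))
```

The first call prints `large.jpg` and the second prints `photo@2x.jpg`.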

💡 What can you do with it?

You are building an AI training dataset. Pull thousands of product photos, real estate shots, or stock images for CLIP, DINOv2, or SigLIP. Auto-hash for dedup means you never train on the same image twice.
You are a scraper developer. Hand the Actor a list of image URLs returned by your catalog scraper (REI, IndiaMART, eBay, Poshmark) and get back a ZIP of the binaries plus a clean metadata dataset. One Actor replaces three.
You are an e-commerce operator. Mirror product image catalogs. Detect when a competitor swaps an image. Track pricing-page visual changes over time.
You are an archivist or you build newsroom tools. Grab every image from a story page in one call. Use the per-URL ZIP mode to keep sources separated.
You are a research analyst. Pull the full visual corpus of any public site for content analysis, brand tracking, or visual trend reports.
You are a builder integrating via webhook. The Actor POSTs a JSON summary on completion. Pipe the dataset URL into your BigQuery, Sheets, or n8n pipeline.

🚀 How to use it

Open the Actor in the Apify Store and click "Try for free".
Paste your URLs. Mix webpages (the Actor parses the HTML) and direct image links (it downloads straight) in a single list.
Pick your options. Turn on SHA-256 dedup, EXIF strip, format conversion, or ZIP output as needed.
Click Start. The Actor fetches each URL, discovers or downloads the images, and pushes metadata to the dataset and binaries to the key-value store.
Download your results. Pull the dataset as JSON, CSV, or Excel. Grab the image binaries from the key-value store (links in the dataset's kv_url column). Or use the single-click ZIP download.

📥 Input

Field	Type	Required	Description
`urls`	array	Yes	List of URLs. Each can be a webpage (HTML is parsed for images) or a direct image link. Mix freely.
`mode`	enum	No	`auto` (recommended, detects by extension), `page` (force HTML parse), or `direct` (force image URL).
`includeSrcset`	boolean	No	Discover images from `srcset`, `picture>source`, and lazy `data-src`. Default `true`.
`includeOgTags`	boolean	No	Discover Open Graph and Twitter Card images. Default `true`.
`minWidth`	integer	No	Skip images narrower than this. Default 0.
`minHeight`	integer	No	Skip images shorter than this. Default 0.
`minSizeBytes`	integer	No	Skip images smaller than this. Filters tracking pixels. Default 0.
`maxImagesPerUrl`	integer	No	Cap images per source URL. Default 1000.
`maxUrls`	integer	No	Cap total URLs processed. Default 10000.
`dedupByHash`	boolean	No	Compute SHA-256 of each image body and skip duplicates. Default `true`.
`stripExif`	boolean	No	Re-encode JPEGs without EXIF metadata. Default `false`.
`convertFormat`	enum	No	`none`, `webp-to-png`, or `png-to-jpg`. Default `none`.
`filenamePattern`	string	No	Templated filename using `{slug}`, `{hash}`, `{ext}`, `{idx}`, `{source}`. Default `{slug}-{hash}.{ext}`.
`outputFormat`	array	No	`dataset` (always), `kv-store` (binaries), `zip` (single archive), `zipPerUrl` (one ZIP per source), `s3` (upload to bucket), `webhook` (POST summary on completion).
`s3Bucket`	string	No	Required when `outputFormat` includes `s3`. Uses standard `AWS_*` env vars for credentials.
`webhookUrl`	string	No	URL to receive a JSON run summary on completion.
`maxConcurrency`	integer	No	Max parallel image downloads. Default 10.
`downloadTimeoutMs`	integer	No	Per-image fetch timeout. Default 15000.
`imageCheckMaxRetries`	integer	No	Retries per failed image. Default 3.
`proxyConfiguration`	object	No	Optional proxy. Default off. Use residential if source sites are hotlink-protected.
`failFast`	boolean	No	Stop on first error. Default `false`.
`debugLogging`	boolean	No	Verbose per-image tracing. Default `false`.
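As a rough illustration of how the fields above fit together, the sketch below assembles an input object and checks the two dependencies the table and FAQ mention (`s3Bucket` with `s3`, `webhookUrl` with `webhook`). The helper `build_input` is hypothetical, and it assumes `urls` takes plain strings; check the Actor's input schema before relying on it.

```python
import json


def build_input(urls, **options):
    """Assemble an input dict using the field names from the table above."""
    if not urls:
        raise ValueError("`urls` is required and must not be empty")
    run_input = {"urls": list(urls)}
    run_input.update(options)
    formats = run_input.get("outputFormat", [])
    # The table states that s3Bucket is required when outputFormat includes s3.
    if "s3" in formats and not run_input.get("s3Bucket"):
        raise ValueError("s3Bucket is required when outputFormat includes 's3'")
    # Assumption: a webhook output is only useful with a webhookUrl.
    if "webhook" in formats and not run_input.get("webhookUrl"):
        raise ValueError("webhookUrl is required when outputFormat includes 'webhook'")
    return run_input


run_input = build_input(
    ["https://example.com/gallery", "https://picsum.photos/800/600.jpg"],
    mode="auto",
    minWidth=400,          # skip thumbnails and avatars
    minSizeBytes=2000,     # skip tracking pixels
    dedupByHash=True,
    outputFormat=["dataset", "kv-store", "zip"],
)
print(json.dumps(run_input, indent=2))
```

The printed JSON can be pasted into the Actor's input editor or sent as the run input through the Apify API.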

📤 Output

The Actor pushes one row to the dataset per downloaded image. Binaries are written to the default key-value store under keys of the form IMG-{filename}, matching the kv_store_key field. Use the dataset's kv_url column to download each binary.

{
  "filename": "picsum-photos-800x600-a1b2c3d4e5f67890.jpg",
  "source_url": "https://example.com/gallery",
  "image_url": "https://picsum.photos/800/600.jpg",
  "kv_store_key": "IMG-picsum-photos-800x600-a1b2c3d4e5f67890.jpg",
  "kv_url": "https://api.apify.com/v2/key-value-stores/abc/records/IMG-picsum-photos-800x600-a1b2c3d4e5f67890.jpg",
  "content_type": "image/jpeg",
  "size_bytes": 54321,
  "width": 800,
  "height": 600,
  "format": "jpeg",
  "sha256": "a1b2c3d4e5f6789012345678901234567890abcdef1234567890abcdef123456",
  "is_duplicate": false,
  "exif_stripped": false,
  "from_srcset": true,
  "from_picture_source": false,
  "from_og_tag": false,
  "from_twitter_tag": false,
  "from_data_attr": false,
  "from_direct_url": false,
  "downloaded_at": "2026-06-20T12:34:56.000Z",
  "duration_ms": 423,
  "http_status": 200,
  "error": null
}

You can download the dataset in various formats such as JSON, HTML, CSV, or Excel.
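Once the dataset is exported as JSON, a few lines are enough to separate usable images from duplicates and failures. The sketch below is illustrative only: the first row is a trimmed copy of the sample above, the second row is made up for the example, and it assumes that failed images also produce a row (which the `error` field suggests but does not guarantee).

```python
from collections import defaultdict

# Row 1 mirrors the sample above (trimmed). Row 2 is a hypothetical failure.
rows = [
    {"source_url": "https://example.com/gallery",
     "image_url": "https://picsum.photos/800/600.jpg",
     "is_duplicate": False, "http_status": 200, "error": None},
    {"source_url": "https://example.com/gallery",
     "image_url": "https://example.com/missing.png",
     "is_duplicate": False, "http_status": 404, "error": "404"},
]


def split_rows(rows):
    """Separate usable images from duplicates and failed downloads."""
    ok, skipped = [], []
    for row in rows:
        if row.get("error") is None and not row.get("is_duplicate"):
            ok.append(row)
        else:
            skipped.append(row)
    return ok, skipped


def group_by_source(rows):
    """Map each source page to the image URLs found on it."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["source_url"]].append(row["image_url"])
    return dict(grouped)


ok, skipped = split_rows(rows)
print(group_by_source(ok))
print([(r["image_url"], r["error"]) for r in skipped])
```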

📋 Output data fields

Field	Description
`filename`	Final filename (per `filenamePattern`).
`source_url`	The page URL the image was discovered on (or its direct URL).
`image_url`	Final resolved image URL (after srcset expansion, redirects).
`kv_store_key`	Key in the run's key-value store (`IMG-...`).
`kv_url`	Signed download URL for the binary (24-hour default).
`content_type`	MIME type (e.g. `image/jpeg`, `image/webp`).
`size_bytes`	Downloaded size.
`width`	Image width in pixels (from sharp metadata).
`height`	Image height in pixels (from sharp metadata).
`format`	Normalized format: `jpeg`, `png`, `webp`, `gif`, `svg`, `avif`, `bmp`, `ico`, `other`.
`sha256`	Content hash (when `dedupByHash=true`).
`is_duplicate`	True if hash matched a previously-seen image in this run.
`exif_stripped`	True if JPEG was re-encoded to remove EXIF.
`from_srcset`	True if discovered via `srcset` / `picture` / `data-srcset`.
`from_picture_source`	True if discovered via `<picture><source>`.
`from_og_tag`	True if discovered via `<meta og:image>`.
`from_twitter_tag`	True if discovered via `<meta twitter:image>`.
`from_data_attr`	True if discovered via lazy `data-src` / `data-srcset`.
`from_direct_url`	True if the URL was treated as a direct image (mode=direct/auto).
`downloaded_at`	ISO timestamp of the download.
`duration_ms`	Time to fetch + process.
`http_status`	HTTP response code (0 on network error).
`error`	Per-image error string (`404`, `timeout`, `below-min-size-N`, etc.) or `null`.
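The `sha256` field makes it possible to check a downloaded binary on your side. The sketch below assumes the field is the lowercase hex SHA-256 digest of the exact bytes stored under `kv_url`, which the sample row suggests; whether the hash is taken before or after EXIF stripping or format conversion is not stated here, so treat a mismatch on converted images with care. The helper names are hypothetical.

```python
import hashlib
import urllib.request


def sha256_hex(body: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(body).hexdigest()


def matches_row(body: bytes, row: dict) -> bool:
    """Compare downloaded bytes against the row's sha256 field."""
    expected = row.get("sha256")
    if not expected:
        return False  # no hash recorded, e.g. dedupByHash was off
    return sha256_hex(body) == expected.lower()


def fetch_and_verify(row: dict, timeout: float = 15.0) -> bool:
    """Download the binary from kv_url and verify it (needs network access)."""
    with urllib.request.urlopen(row["kv_url"], timeout=timeout) as resp:
        return matches_row(resp.read(), row)


# Offline demonstration with stand-in bytes instead of a real image.
body = b"example image bytes"
row = {"sha256": sha256_hex(body)}
print(matches_row(body, row))         # True
print(matches_row(body + b"x", row))  # False
```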

💰 Pricing

$2.00 per 1,000 results. The first 50 results of every run are free. There is no monthly fee and no proxy surcharge.

Volume	What you pay
50 images (free trial)	$0.00
1,000 images	$2.00
10,000 images	$20.00
100,000 images	$200.00

For comparison, the next-most-popular bulk image downloader on the Store (onescales/bulk-image-downloader) charges $7.00 per 1,000 URLs and ships only the image bytes (no width, height, hash, or format). At $2.00 per 1,000 results, this Actor costs about 70% less and returns a much richer schema.

For scheduled or standby runs, pricing drops to $1.00 per 1,000 results (50% off). Volume runs of more than 50,000 images are eligible for $1.50 per 1,000.
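The arithmetic behind these figures is simple enough to sketch. The function below is an illustration, not a billing tool: `estimate_cost` is a hypothetical helper, and whether the 50 free results are subtracted from the billed total is exposed as a parameter because the table above lists round figures without that deduction.

```python
def estimate_cost(results: int, rate_per_1k: float = 2.00, free_results: int = 0) -> float:
    """Estimated charge in USD for a run with the given number of results."""
    billable = max(results - free_results, 0)
    return round(billable * rate_per_1k / 1000, 2)


print(estimate_cost(1_000))                       # 2.0, matches the table
print(estimate_cost(100_000))                     # 200.0, matches the table
print(estimate_cost(1_000, free_results=50))      # 1.9 if the free 50 are deducted
print(estimate_cost(10_000, rate_per_1k=1.00))    # 10.0 at the scheduled-run rate
print(estimate_cost(100_000, rate_per_1k=1.50))   # 150.0 at the volume rate

# Saving relative to a $7.00 per 1,000 alternative, as a percentage.
print(round((1 - 2.00 / 7.00) * 100, 1))          # 71.4
```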

🛠️ Tips and advanced options

Set includeSrcset to false if you only want the page's primary images. This skips lazy data-src and responsive variants, which is faster on heavy pages.
Use minSizeBytes to filter tracking pixels. A typical tracking pixel is under 1KB. Set minSizeBytes: 2000 to skip them.
Use minWidth and minHeight to focus on useful images. Set minWidth: 400 to skip thumbnails and avatars.
Pick the right output mode. zip for a single archive, zipPerUrl to keep source pages separated, s3 to push directly to your training bucket.
Pair with a catalog scraper. Run one of our catalog scrapers (REI, IndiaMART, eBay) first, then feed the image URLs to this Actor for a complete e-commerce dataset.
Schedule weekly runs to refresh your image corpus. Most product catalogs update slowly; daily is overkill.
Use SHA-256 dedup across runs. Hashes are stable, so a daily run that re-discovers the same images will mark them as is_duplicate: true and skip the KV write.

❓ FAQ

Is this Actor legal to use? The Actor downloads images that are publicly accessible. You are responsible for ensuring your use case complies with the source site's Terms of Service and applicable copyright laws. Do not use the Actor to bypass access controls, scrape private content, or violate copyright.

Why does it work on any site? The Actor is generic. It fetches the URL you give it, parses the HTML for image tags, and downloads the images it finds. There is no per-site configuration.

Does it execute JavaScript? No. Single-page apps that render images via React/Vue hydration will return an empty image list. If your target site is a SPA, use a Playwright-based scraper first to get the image URLs, then pass them to this Actor with mode: 'direct'.

Do I need a proxy? Usually not. Most public sites serve images to any client, so the default useApifyProxy: false is enough. If your source site is hotlink-protected, opt in to a residential proxy through the proxyConfiguration field.

What is the largest image it can handle? Sharp auto-streams, so peak memory is around 5x the size of the largest single image. A 50MB image is fine. A 500MB image may cause memory pressure on smaller container sizes.

Does the EXIF strip work on PNG or WebP? No, EXIF strip is JPEG-only. PNG metadata stripping is a v2 feature.

How does the free trial work? Every new Apify user gets $5 of platform credit. That is enough to run this Actor many times. The first 50 results of every run are free, so you can evaluate the data quality before spending anything.

Can I get a single ZIP of all images? Yes. Set outputFormat: ['dataset', 'kv-store', 'zip']. The ZIP is written to OUT-images.zip and is also linked in the dataset summary.

Can I push directly to S3? Yes. Set outputFormat: ['dataset', 's3'], fill in s3Bucket, and set AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_REGION as Apify Secrets. Each image uploads to s3://{bucket}/images/{filename}.

Can I get a webhook on completion? Yes. Set outputFormat: ['dataset', 'webhook'] and fill in webhookUrl. The Actor POSTs a JSON summary with run stats (counts, errors, total size) to the URL when the run finishes.

🛡️ Disclaimers and support

Disclaimer: This Actor retrieves publicly accessible images. Make sure your usage complies with the source site's terms of service and applicable copyright laws. The Actor is a generic utility and does not bypass authentication, paywalls, or access controls.
Support: Open an issue from the Issues tab for bug reports or feature requests. Custom scrapers and integration help are available on request.

🔗 Other actors

Google Lens OCR API: Sub-second Image to Text - extracts text from any image via Google Lens OCR.
Google Lens Search: Reverse Image Finder & OCR - runs reverse image search and visual matches through Google Lens.
Loom Video Downloader: MP4, transcript and metadata in one run - downloads Loom videos as MP4 with transcript and metadata.
Streamable Video Downloader: Direct MP4 links in seconds - resolves direct MP4 download links from Streamable.
Loom Transcript Scraper - pulls transcripts, captions, and metadata from public Loom videos.
