E-Commerce AI Training Dataset from Product Pages
Extract high-resolution images and metadata from product pages. Receive detailed datasets for AI training, including SHA-256 hashes and EXIF info.
Bulk Image Downloader: 22-Field Metadata, SHA-256 & ZIP
getascraper/bulk-image-downloader
Input
URLs to Process (required)
url: https://www.rei.com/product/248622/patagonia-mens-nano-puff-jacket (+2 more)
URL Mode: page
Include srcset / picture: true
Include og:image / twitter:image: true
Min Width (px): 400
Min Height (px): 400
Min Size (bytes): 10000
Max Images per URL: 1000
Max URLs to Process: 10000
Deduplicate by Content Hash: true
Strip EXIF Metadata (JPEG): true
Format Conversion: webp-to-png
Filename Pattern: {source}-{idx}-{hash}.{ext}
Output Format: dataset (+2 more)
Max Concurrency: 10
Download Timeout (ms): 15000
Max Retries per Image: 3
Proxy Configuration
Fail Fast: false
Verbose Debug Logs: false
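To make the size thresholds, content-hash deduplication, and filename pattern above more concrete, here is a minimal Python sketch. It is not the Actor's implementation. The `Candidate` type and `select_images` function are hypothetical, and the sketch assumes that `{idx}` is a zero-based counter of kept images and that `{hash}` is a 12-character prefix of the SHA-256 digest; the Actor may define both differently.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical in-memory image record; not the Actor's real data model."""
    source: str   # slug of the page the image came from
    ext: str      # file extension without the dot
    width: int
    height: int
    data: bytes   # raw image bytes


def select_images(candidates, min_width=400, min_height=400, min_bytes=10_000,
                  pattern="{source}-{idx}-{hash}.{ext}"):
    """Apply size thresholds, drop byte-identical duplicates, name the rest."""
    seen = set()   # SHA-256 digests of images already kept
    kept = []      # list of (filename, full digest) pairs
    for c in candidates:
        # Skip images below any of the configured minimums.
        if c.width < min_width or c.height < min_height or len(c.data) < min_bytes:
            continue
        digest = hashlib.sha256(c.data).hexdigest()
        # Identical bytes produce an identical digest, so this is a content duplicate.
        if digest in seen:
            continue
        seen.add(digest)
        # Assumption: zero-based index and a 12-character hash prefix.
        name = pattern.format(source=c.source, idx=len(kept),
                              hash=digest[:12], ext=c.ext)
        kept.append((name, digest))
    return kept
```

Hashing the bytes rather than comparing URLs means the same image served from two different URLs (for example, a srcset variant and an og:image) is only kept once.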
Output fields
Filename
Source Page
Image URL
Content-Type
Format
Width
Height
Size (KB)
Duplicate
EXIF Stripped
From srcset
From og:image
Download Binary
Error
Downloaded
01. Sign up on Apify
Create your Apify account to access the Bulk Image Downloader: 22-Field Metadata, SHA-256 & ZIP.
02. Start the run
The Actor starts running automatically with the configured input.
03. Receive the output
Monitor progress in real time. You will be notified as soon as your dataset is complete and ready for review.
04. Integrate into your workflow
The final output is delivered in JSON, CSV, or Excel format, ready to be plugged into your workflow.
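As one way of plugging a JSON export into a training pipeline, the Python sketch below filters the rows down to usable images. It assumes the export is a list of objects whose keys match the output field labels listed above (`Filename`, `Image URL`, `Width`, `Height`, `Duplicate`, `Error`); the actual export keys may differ, so check a sample record first. The function names and the manifest shape are hypothetical.

```python
import json


def build_manifest(records, min_side=400):
    """Keep rows with no error, not flagged as duplicate, and large enough."""
    manifest = []
    for row in records:
        # Drop failed downloads and content duplicates.
        if row.get("Error") or row.get("Duplicate"):
            continue
        # Treat missing dimensions as 0 so such rows are filtered out.
        shortest = min(row.get("Width") or 0, row.get("Height") or 0)
        if shortest < min_side:
            continue
        manifest.append({"file": row["Filename"], "url": row["Image URL"]})
    return manifest


def load_export(path):
    """Read a JSON export, assumed to be a list of row objects."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

A typical call would be `build_manifest(load_export("export.json"))`, where `export.json` stands in for wherever you saved the download.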

