Web Images Scraper
Pricing
from $6.30 / 1,000 processed image URLs
Extract image URLs from public webpages, domains, and direct image links. Get source pages, discovery methods, metadata, and optional saved files or ZIP archives.
Developer
Maxime Dupré
Maintained by Community
🖼️ What is Web Images Scraper?
Web Images Scraper extracts image URLs from public webpages, domains, and direct image links. Use it to find webpage images, responsive srcset candidates, Open Graph images, icons, CSS background images, and optional saved image files without opening each page by hand.
Start with the prefilled Wikimedia Commons example or paste your own public website URL. Keep downloads off for a quick metadata-only run, then enable image saving or ZIP archives when you need files in Apify key-value storage.
🔎 What can Web Images Scraper do?
- Extract images from `img` tags, `srcset`, page metadata, icons, inline styles, and linked stylesheets.
- Accept public webpage URLs, bare domains, and direct image URLs.
- Crawl same-domain links when you raise the crawl depth and page limit.
- Filter images by file extension, minimum known byte size, and URL text.
- Save one dataset item per accepted image as soon as it is accepted.
- Optionally download images and create one ZIP archive per input target.
- Export results through Apify datasets, API, webhooks, schedules, and integrations.
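To make the `img` and `srcset` discovery in the list above concrete, here is a minimal sketch using only the Python standard library. It is not the Actor's implementation: the `ImgCollector` class, the `collect_images` helper, and the `"srcset"` label are illustrative assumptions (only `img-src` appears in the sample output), and splitting `srcset` on commas is a simplification that would mis-handle URLs containing commas.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collect candidate image URLs from <img src> and <img srcset> (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (discovery_method, url) tuples, in document order

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        if attr.get("src"):
            self.found.append(("img-src", attr["src"]))
        # srcset is a comma-separated list of "<url> <descriptor>" entries.
        for candidate in (attr.get("srcset") or "").split(","):
            parts = candidate.split()
            if parts:
                self.found.append(("srcset", parts[0]))


def collect_images(html):
    """Return every (method, url) pair found in the given HTML string."""
    parser = ImgCollector()
    parser.feed(html)
    return parser.found
```

Calling `collect_images` on a page's HTML returns relative URLs exactly as written, so they would still need to be resolved against the page URL.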
📦 What data can it extract?
| Data point | What it means |
|---|---|
| Input URL | The webpage, domain, or direct image URL you submitted. |
| Source page URL | The page where the image was discovered. |
| Image URL | The original image URL found on the page. |
| Normalized image URL | A cleaned absolute URL for matching and exports. |
| Filename and extension | File naming details when they can be inferred. |
| Content type and file size | Metadata from the image response when available. |
| Alt and title text | Nearby accessibility and title text from the page. |
| Discovery method | Whether the image came from `img`, `srcset`, metadata, CSS, or a direct URL. |
| Crawl details | Page index, crawl depth, source order, and scrape timestamp. |
| Saved file links | Apify storage links when image downloads are enabled. |
| ZIP file links | Archive metadata when ZIP creation is enabled. |
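The table above mentions a normalized image URL plus a filename and extension. The Actor's exact normalization rules are not documented here, so the following is only a sketch of one plausible approach: resolve the URL against the source page, lowercase the scheme and host, and drop the fragment. The function names `normalize_image_url` and `filename_and_extension` are hypothetical.

```python
import posixpath
from urllib.parse import urljoin, urlsplit, urlunsplit


def normalize_image_url(page_url, raw_url):
    """Resolve raw_url against page_url and return a cleaned absolute URL (assumed rules)."""
    absolute = urljoin(page_url, raw_url.strip())
    parts = urlsplit(absolute)
    # Assumption: lowercase scheme and host, keep path and query, drop the fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, parts.query, "")
    )


def filename_and_extension(url):
    """Infer (filename, extension) from the URL path, or None where nothing can be inferred."""
    name = posixpath.basename(urlsplit(url).path)
    ext = posixpath.splitext(name)[1].lstrip(".").lower()
    return (name or None, ext or None)
```

A URL whose path ends in `/` has no inferable filename, which matches the table's "when they can be inferred" wording.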
🧭 How do I scrape images from a website?
- Add one or more public webpage URLs, domains, or direct image URLs.
- Choose how many images to keep per page.
- Keep `Crawl depth` at `0` for only the submitted page, or raise it to follow same-domain links.
- Leave discovery options enabled unless you want to exclude `srcset`, metadata, or CSS background images.
- Turn on `Save image files to Apify storage` if you need downloadable files.
- Turn on `Create ZIP files` only when downloads are enabled and you want one archive per input target.
- Run the Actor and export the dataset as JSON, CSV, Excel, XML, or through the Apify API.
⚙️ Input options
The main input is `Webpage or image URLs`. You can paste values such as:
```json
[
  { "url": "https://www.python.org/" },
  { "url": "https://www.python.org/static/img/python-logo.png" }
]
```
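If your targets live in a text file or spreadsheet column, a small helper can produce the list-of-objects shape shown above. This is a sketch: `build_targets` is a hypothetical helper, and skipping blank and repeated lines is a convenience choice, not something the Actor requires.

```python
import json


def build_targets(lines):
    """Turn raw text lines into a list of {"url": ...} objects."""
    targets, seen = [], set()
    for line in lines:
        value = line.strip()
        # Skip blank lines and exact repeats; bare domains are passed through unchanged.
        if not value or value in seen:
            continue
        seen.add(value)
        targets.append({"url": value})
    return targets


if __name__ == "__main__":
    sample = ["https://www.python.org/", "", "python.org", "https://www.python.org/"]
    print(json.dumps(build_targets(sample), indent=2))
```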
Use `Max images per page` to cap how many image rows are saved from each page. Use `Max pages per input` and `Crawl depth` together when you want a small same-domain crawl instead of a single-page extraction.
The filter options are useful for bulk image downloader workflows. For example, keep only `png` and `webp` extensions, require image URLs to contain `/uploads/`, or exclude URLs containing `logo` or `icon`. Images without a known byte size are kept unless another filter removes them.
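The filter behaviour described above can be sketched as a single predicate. The function `keep_image` and its parameter names are hypothetical, and case-insensitive matching is an assumption. The one rule taken directly from the text is that an unknown byte size (`None` here) never fails the size filter.

```python
import posixpath
from urllib.parse import urlsplit


def keep_image(url, size_bytes, *, extensions=None, min_bytes=0,
               must_contain=(), must_not_contain=()):
    """Return True if an image passes the extension, size, and URL-text filters."""
    path = urlsplit(url).path.lower()
    ext = posixpath.splitext(path)[1].lstrip(".")
    if extensions and ext not in extensions:
        return False
    # Unknown sizes are kept; only a known size below the minimum is rejected.
    if size_bytes is not None and size_bytes < min_bytes:
        return False
    lowered = url.lower()
    if must_contain and not any(text.lower() in lowered for text in must_contain):
        return False
    if any(text.lower() in lowered for text in must_not_contain):
        return False
    return True
```

With `extensions={"png", "webp"}`, `must_contain=["/uploads/"]`, and `must_not_contain=["logo", "icon"]`, an `/uploads/hero.webp` URL with no known size is kept, while `/uploads/site-logo.png` is dropped.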
🧾 Output example
Each dataset item is one accepted image:
```json
{
  "inputIndex": 0,
  "inputUrl": "https://www.python.org/",
  "sourcePageUrl": "https://www.python.org/",
  "imageUrl": "https://www.python.org/static/img/python-logo.png",
  "normalizedImageUrl": "https://www.python.org/static/img/python-logo.png",
  "filename": "python-logo.png",
  "extension": "png",
  "contentType": "image/png",
  "fileSizeBytes": 15782,
  "width": null,
  "height": null,
  "altText": "python logo",
  "titleText": null,
  "discoveryMethod": "img-src",
  "sourceOrder": 1,
  "pageIndex": 1,
  "crawlDepth": 0,
  "isDirectImage": false,
  "duplicateKey": "a1b2c3",
  "scrapedAt": "2026-06-04T00:00:00.000Z"
}
```
When downloads are enabled, rows also include `downloadUrl` and `savedFile`. When ZIP archives are enabled, rows also include `zipFile`.
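Once you have exported the dataset as JSON, a quick summary helps you review a small first run. The sketch below assumes the rows are already loaded as a list of dictionaries with the field names from the example item. `summarize` is a hypothetical helper, and treating a missing or empty `savedFile` as "not downloaded" is an assumption.

```python
from collections import Counter


def summarize(rows):
    """Summarize exported image rows by discovery method, known size, and saved files."""
    methods = Counter(row.get("discoveryMethod") for row in rows)
    known_sizes = [
        row["fileSizeBytes"] for row in rows if row.get("fileSizeBytes") is not None
    ]
    return {
        "images": len(rows),
        "byMethod": dict(methods),
        "knownBytes": sum(known_sizes),
        # savedFile is only present when downloads were enabled for the run.
        "saved": sum(1 for row in rows if row.get("savedFile")),
    }
```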
💳 How much does Web Images Scraper cost?
This Actor uses pay-per-event pricing. You are charged for each input webpage or direct image URL that is successfully processed for image extraction, not for every image row saved.
For lower-cost first tests, run one or two targets with downloads disabled. Enabling downloads and ZIP archives can use more storage and runtime because the Actor has to fetch and save each accepted image file.
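For a rough budget before a run, the listed starting price can be turned into a back-of-the-envelope estimate. This sketch assumes the "from $6.30 / 1,000" starting rate applies to every charged event. Your actual rate may differ, and storage or runtime costs from downloads are not included. `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Assumption: the listed starting price, in USD per 1,000 charged events.
STARTING_RATE_PER_1000 = Decimal("6.30")


def estimate_cost(charged_events, rate_per_1000=STARTING_RATE_PER_1000):
    """Return an estimated charge in USD, rounded to four decimal places."""
    cost = Decimal(charged_events) * rate_per_1000 / 1000
    return cost.quantize(Decimal("0.0001"))
```

Under that assumption, a two-target test run comes to about $0.0126 and 250 targets to about $1.575.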
⚠️ Limits and caveats
Web Images Scraper is for public webpages and public image URLs. It does not log in, use cookies, submit forms, bypass private content, or search Google Images. Some websites block automated traffic, return temporary errors, lazy-load images in ways that are not present in the page HTML, or hide assets behind scripts that are not exposed as normal image URLs.
The Actor uses Apify Proxy by default. Invalid, blocked, or empty targets are handled gracefully, so one difficult website does not have to fail the whole run.
❓ FAQ
Can it download all images from a website?
It can crawl same-domain links up to your selected depth and page limit, then save accepted images from those pages. Keep the first run small, review the output, and raise limits only when the results match what you need.
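To see how crawl depth and the page limit interact, here is a sketch of a breadth-first crawl plan over an in-memory link map. It is not the Actor's crawler: `plan_crawl`, the `links` dictionary, and treating "same-domain" as "same hostname" are all illustrative assumptions.

```python
from collections import deque
from urllib.parse import urlsplit


def plan_crawl(start_url, links, max_depth=0, max_pages=1):
    """Return the (url, depth) pairs a depth- and page-limited crawl would visit."""
    host = urlsplit(start_url).hostname
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth >= max_depth:
            continue  # do not follow links beyond the selected depth
        for nxt in links.get(url, []):
            # Assumption: only links on the same hostname are followed.
            if nxt not in seen and urlsplit(nxt).hostname == host:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return visited
```

With depth `0` only the submitted page is visited. Raising the depth widens the crawl one link hop at a time, and the page limit stops it early regardless of depth.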
Can I use direct image URLs?
Yes. A direct image URL is accepted as an input target and saved as one image row. If downloads are enabled, the file can also be saved to Apify storage.
Does it include CSS background images?
Yes, when `Include CSS background images` is enabled. It checks inline styles and linked stylesheets for image URLs.
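As an illustration of what checking styles for image URLs can involve, the sketch below pulls `url(...)` values out of a CSS string with a regular expression. It is a simplification and not the Actor's parser: `css_image_urls` is a hypothetical helper, skipping `data:` URIs is an assumption, and a regex will not handle every valid CSS construct.

```python
import re

# Matches url(...) with optional single or double quotes around the value.
CSS_URL = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""", re.IGNORECASE)


def css_image_urls(css_text):
    """Return the url(...) values found in a stylesheet or inline style string."""
    urls = []
    for _quote, value in CSS_URL.findall(css_text):
        value = value.strip()
        # Assumption: embedded data URIs are not useful as image URLs, so skip them.
        if value and not value.lower().startswith("data:"):
            urls.append(value)
    return urls
```

The same function works for an inline `style` attribute value and for the text of a linked stylesheet. Relative results still need to be resolved against the stylesheet or page URL.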
Does it preserve duplicate images?
The output includes a `duplicateKey` so equivalent image URLs can be recognized in exports. The Actor keeps one accepted row for each discovered image identity in the run.
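Because each run keeps one row per image identity, duplicates mostly matter when you combine exports from several runs. The sketch below merges row lists and keeps the first row seen for each `duplicateKey`. `merge_by_duplicate_key` is a hypothetical helper, and falling back to `normalizedImageUrl` when the key is missing is an assumption.

```python
def merge_by_duplicate_key(*exports):
    """Merge several lists of image rows, keeping the first row per duplicateKey."""
    merged = {}
    for rows in exports:
        for row in rows:
            # Assumption: fall back to the normalized URL if duplicateKey is absent.
            key = row.get("duplicateKey") or row.get("normalizedImageUrl")
            merged.setdefault(key, row)
    return list(merged.values())
```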
📝 Changelog
- 0.0: Initial release.
🆘 Support
For issues, questions, or feature requests, file a ticket and I'll fix or implement it in less than 24h 🫡
🔗 Other actors
- Website URL Crawler - crawl websites and export discovered URLs for audits or crawl planning.
- Unsplash Image Scraper - scrape Unsplash image search results by keyword.
- Twitter Media Scraper - extract public X/Twitter image, video, and GIF URLs.
- Facebook Media Downloader - download public Facebook videos and reels.
- Instagram Downloader API - extract media URLs from public Instagram posts and reels.
Made with ❤️ by Maxime Dupré