Website Image Scraper
Pricing
from $1.00 / 1,000 results
Rating: 5.0 (20)
Developer: Crawler Bros
Actor stats: 20 bookmarked · 4 total users · 1 monthly active user
Last modified: 8 days ago
Extract every image URL from a website. Crawls the start page (and optionally internal links up to a configurable depth), then parses `<img>` tags, `<picture>`/`<source>`, `srcset` candidates, `<link rel="icon">`, and CSS `background-image` declarations. HTTP-only — no browser, no proxy, no API key.
What it does
- Pull every image URL referenced on a page — `<img src>`, lazy-loaded `data-src`, `srcset` candidates, `<picture>` sources, favicons, and inline `style="background-image: url(...)"`.
- Crawl deeper — follow internal links up to `maxCrawlDepth` (same host only) to grab images from linked pages too.
- Filter by format — restrict to specific extensions (e.g. only SVG, only WebP/AVIF).
- Bounded — `maxImagesPerPage` and `maxTotalImages` keep runs cost-predictable on large galleries.
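The extraction steps above can be sketched with Python's standard `html.parser`. This is a minimal illustration of the technique, not the actor's actual implementation, and it only covers the cases listed:

```python
from html.parser import HTMLParser

class ImageURLExtractor(HTMLParser):
    """Collect image URL candidates from <img>, <source>, srcset, and inline styles."""
    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "img":
            for key in ("src", "data-src"):  # plain and lazy-loaded sources
                if a.get(key):
                    self.urls.append(a[key])
        if tag in ("img", "source") and a.get("srcset"):
            # Each srcset candidate is "URL [descriptor]", comma-separated
            for candidate in a["srcset"].split(","):
                parts = candidate.strip().split()
                if parts:
                    self.urls.append(parts[0])
        if tag == "link" and "icon" in (a.get("rel") or "") and a.get("href"):
            self.urls.append(a["href"])
        style = a.get("style") or ""
        if "background-image" in style and "url(" in style:
            # Crude pull of url(...) from inline CSS; real CSS parsing is messier
            start = style.find("url(") + 4
            end = style.find(")", start)
            self.urls.append(style[start:end].strip("'\""))

parser = ImageURLExtractor()
parser.feed('<img src="/hero.jpg" srcset="/hero-2x.jpg 2x">'
            '<div style="background-image: url(/bg.png)"></div>')
print(parser.urls)  # ['/hero.jpg', '/hero-2x.jpg', '/bg.png']
```

A real crawler would additionally resolve these candidates against the page URL to get absolute URLs before emitting them.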
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `startUrl` | string (required) | `https://apify.com` | Page to start crawling. Must be `http://` or `https://`. |
| `maxCrawlDepth` | integer | 1 (0–5) | 0 = only the start URL; 1–5 = follow internal links down to that depth (same host only). |
| `maxImagesPerPage` | integer | 200 (1–5000) | Cap per page — keeps pathological galleries bounded. |
| `maxTotalImages` | integer | 1000 (1–50000) | Hard cap on total images emitted across the whole run. |
| `imageExtensions` | array | `["jpg", "jpeg", "png", "gif", "webp", "svg", "avif", "bmp", "ico"]` | Only URLs whose path ends in one of these are kept. |
| `includeBackgroundImages` | boolean | `true` | Also extract from inline `style="background-image: url(...)"`. |
| `userAgent` | string | (Chrome 131) | Optional UA override. |
Example input
```json
{
  "startUrl": "https://apify.com",
  "maxCrawlDepth": 1,
  "maxImagesPerPage": 200,
  "maxTotalImages": 500,
  "imageExtensions": ["jpg", "png", "webp", "svg"],
  "includeBackgroundImages": true
}
```
Output
One record per unique image URL. Empty fields are omitted (no nulls).
```json
{
  "url": "https://apify.com/static/hero.jpg",
  "sourcePage": "https://apify.com/",
  "pageTitle": "Apify · The full-stack web-scraping & automation platform",
  "alt": "Apify hero image",
  "hasAltText": true,
  "title": "Apify",
  "width": 1200,
  "height": 600,
  "extension": "jpg",
  "discoveredVia": "img-tag",
  "mimeTypeHint": "image/jpeg",
  "crawlDepth": 0,
  "scrapedAt": "2024-12-16T14:23:11+00:00"
}
```
Output fields
- `url` — absolute URL of the image (`data:` URIs and `javascript:` pseudo-URLs are filtered out).
- `sourcePage` — the page where the image was discovered.
- `pageTitle` — `<title>` of the page where the image was found (handy for grouping the dataset by page name).
- `alt` — `alt` attribute of the `<img>` tag (when present).
- `hasAltText` — derived boolean: `true` when `alt` is present and non-empty. Lets you filter accessibility issues without testing for field presence.
- `title` — `title` attribute (when present).
- `width` / `height` — explicit pixel dimensions from the tag (only emitted when numeric).
- `extension` — lowercase file extension parsed from the URL path (e.g. `"jpg"`, `"svg"`, `"webp"`). Useful for format-bucket aggregations.
- `discoveredVia` — one of `img-tag`, `srcset`, `picture-source`, `link-icon`, `css-background`.
- `mimeTypeHint` — derived from the file extension (e.g. `image/png`, `image/svg+xml`).
- `crawlDepth` — depth at which the page was crawled (0 = `startUrl`).
- `scrapedAt` — ISO-8601 timestamp.
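The `extension` and `mimeTypeHint` derivation described above can be reproduced with Python's standard `mimetypes` module. This is an illustrative sketch: the actor's own extension-to-MIME mapping may differ.

```python
import posixpath
import mimetypes
from urllib.parse import urlparse

def mime_hint(url: str):
    """Derive the extension and a MIME-type hint from the URL path alone."""
    path = urlparse(url).path                      # ignores query strings like ?v=2
    ext = posixpath.splitext(path)[1].lstrip(".").lower()
    mime, _ = mimetypes.guess_type(path)           # hint only; no bytes are fetched
    return ext, mime

print(mime_hint("https://apify.com/static/hero.jpg?v=2"))  # ('jpg', 'image/jpeg')
print(mime_hint("https://apify.com/logo.svg"))             # ('svg', 'image/svg+xml')
```

Because the hint comes from the path, a server that lies about content types (or serves extensionless image URLs) will not be reflected here.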
Use cases
- Content audits — see every image a website serves up, broken down by source (img tag vs CSS background).
- Asset inventory — pull all logos, hero images, and icons from a competitor or brand site.
- Format migration — find every JPEG/PNG to convert to WebP/AVIF, or locate raster icons that could be replaced with SVG.
- SEO / accessibility — list images with `hasAltText: false` to flag missing alt text at a glance.
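Working from the output schema, a downstream audit over the dataset might look like this. The records here are hypothetical, included only to show the filtering and bucketing:

```python
from collections import Counter

# Hypothetical dataset records in the actor's output shape
records = [
    {"url": "https://example.com/a.jpg", "extension": "jpg", "hasAltText": True},
    {"url": "https://example.com/b.png", "extension": "png", "hasAltText": False},
    {"url": "https://example.com/c.png", "extension": "png", "hasAltText": False},
]

# Accessibility: images missing alt text
missing_alt = [r["url"] for r in records if not r.get("hasAltText")]

# Format migration: how many images of each format are in use
by_format = Counter(r["extension"] for r in records)

print(missing_alt)       # ['https://example.com/b.png', 'https://example.com/c.png']
print(by_format["png"])  # 2
```

Using `r.get("hasAltText")` rather than `r["hasAltText"]` also handles records where the field was omitted, since empty fields are not emitted as nulls.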
FAQ
Does it download the image binaries? No. The actor only collects URLs and metadata. Combine with a separate downloader (or pipe URLs into Apify's standard "URL list" actor) if you need the bytes.
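If you do need the bytes, a minimal downloader over the dataset's `url` field could look like the sketch below. The function names are hypothetical, and this is illustrative only, not part of the actor:

```python
import os
import urllib.request
from urllib.parse import urlparse

def filename_for(url: str) -> str:
    """Name the local file after the last segment of the URL path."""
    return os.path.basename(urlparse(url).path) or "index"

def download_images(urls, out_dir="images"):
    """Fetch each image URL and save the bytes under out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for url in urls:
        dest = os.path.join(out_dir, filename_for(url))
        urllib.request.urlretrieve(url, dest)  # blocking network call
        saved.append(dest)
    return saved
```

A production downloader would also handle name collisions, retries, and content-type checks; this sketch skips all three.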
Does it work on JavaScript-rendered pages? Mostly no. This scraper is HTTP-only — it sees the server-rendered HTML, not the DOM after JavaScript runs. If a site lazy-loads images via React/Vue, you may only see fallback or placeholder images. For SPA-rendered content, use a Playwright-based actor instead.
Can I limit it to a single page?
Set `maxCrawlDepth: 0`. Only the start URL is fetched.
Does it follow external links?
No. Internal-link crawling only follows links to the same host as `startUrl` to keep cost and scope bounded.
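The same-host rule described here can be expressed as a plain hostname comparison, sketched below. This is an assumption about the check, not the actor's exact logic (for example, it treats `www.` and bare hosts as different):

```python
from urllib.parse import urlparse

def is_internal(link: str, start_url: str) -> bool:
    """A link counts as internal only when its host matches the start URL's host."""
    return urlparse(link).hostname == urlparse(start_url).hostname

print(is_internal("https://apify.com/store", "https://apify.com"))  # True
print(is_internal("https://example.com/img", "https://apify.com"))  # False
```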
What if the site has no images at all?
You get a single sentinel record `{"type": "website_image_scraper_error", "reason": "no_images_found"}` so the dataset is non-empty. The run still completes successfully.
How does it deduplicate?
By absolute URL. The same image referenced from multiple pages produces one record (the first-seen page is recorded as `sourcePage`).
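The first-seen-wins behavior can be sketched as a dictionary keyed by the resolved absolute URL. This is illustrative; the actor's internals may differ:

```python
from urllib.parse import urljoin

seen = {}  # absolute URL -> record for the first page that referenced it

def record_image(page_url: str, raw_src: str):
    """Resolve the src against its page and keep only the first sighting."""
    absolute = urljoin(page_url, raw_src)
    if absolute not in seen:
        seen[absolute] = {"url": absolute, "sourcePage": page_url}
    return seen[absolute]

record_image("https://apify.com/", "/static/hero.jpg")
record_image("https://apify.com/store", "/static/hero.jpg")  # duplicate reference

print(len(seen))  # 1
print(seen["https://apify.com/static/hero.jpg"]["sourcePage"])  # https://apify.com/
```

Note that `urljoin` resolution means the same image reached via a relative and an absolute `src` still collapses to one record.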
