Website Image Scraper

Pricing

Pay per event

Extract all image URLs from any website — alt text, dimensions, srcset, and CSS background images. Works on both static and JavaScript-rendered pages.

Developer

BowTiedRaccoon

Maintained by Community

Extract every image URL from any website. Give it a URL, get back a list of images with alt text, dimensions, srcset candidates, and CSS background-image URLs. Works on both static and JavaScript-rendered pages.

What it does

The actor points a Playwright browser at your URL, lets the page fully render, then pulls every image it can find: <img> tags, <picture>/<source> elements, lazy-load data-src attributes, and background-image CSS rules. It returns one record per image, with the fields listed under Output.

Optionally follows internal links up to a configurable depth, so you can audit images across an entire section of a site rather than just one page.
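
As a rough illustration of the extraction step, here is a minimal Python sketch that works on static HTML only, using the standard library rather than Playwright. It is not the actor's code: ImageCollector is a hypothetical name, and CSS backgrounds and JavaScript-rendered content are out of scope here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Hypothetical helper: collects <img> and <source> candidates from static HTML."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("img", "source"):
            return
        a = {name: (value or "") for name, value in attrs}
        srcset = a.get("srcset", "")
        # Prefer src, then the lazy-load data-src, then the first srcset candidate.
        first_candidate = srcset.split(",")[0].split()
        src = a.get("src") or a.get("data-src") or (
            first_candidate[0] if first_candidate else ""
        )
        if not src or src.startswith("data:"):
            return  # nothing linkable on this tag
        self.records.append({
            "image_url": urljoin(self.page_url, src),  # resolve relative URLs
            "page_url": self.page_url,
            "alt_text": a.get("alt", ""),
            "width": a.get("width", ""),
            "height": a.get("height", ""),
            "srcset": srcset,
            "loading": a.get("loading", ""),
            "source_tag": tag,
        })
```

Feeding a page's HTML to `ImageCollector(page_url).feed(html)` fills `records` with one dictionary per image, using the same field names as the Output section.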

Output

Each record contains:

| Field | Description |
| --- | --- |
| image_url | Absolute URL of the image |
| page_url | URL of the page where the image was found |
| alt_text | Alt text (empty string if none) |
| width | Width attribute value (empty string if not set in HTML) |
| height | Height attribute value (empty string if not set in HTML) |
| srcset | Raw srcset attribute value |
| srcset_urls | Comma-separated absolute URLs parsed from srcset |
| loading | Loading attribute (lazy, eager, or empty string) |
| source_tag | Source element: img, source, or css-background |
| scraped_at | ISO 8601 timestamp |
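
One common use of these records is an accessibility audit. The sketch below groups images with empty alt text by page; it assumes the records have already been loaded as a list of dictionaries with the fields above, and missing_alt_by_page is a hypothetical helper name, not part of the actor.

```python
from collections import defaultdict


def missing_alt_by_page(records):
    """Group image URLs whose alt_text is empty by the page they were found on."""
    pages = defaultdict(list)
    for record in records:
        # alt_text is an empty string when the image has no alt attribute.
        if not record.get("alt_text", "").strip():
            pages[record["page_url"]].append(record["image_url"])
    return dict(pages)
```

The result maps each page_url to the image_url values that need alt text, which is usually easier to act on than a flat list.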

Input

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| url | string | | Website URL to extract images from |
| maxItems | integer | 200 | Maximum image records to return. Set to 0 for unlimited |
| crawlLinks | boolean | false | Follow internal links to scrape images from multiple pages |
| maxDepth | integer | 1 | Max depth for internal link crawling (1–3) |
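
If you assemble the input programmatically, a small helper can catch out-of-range values before a run starts. This is a sketch under assumptions: build_input is a hypothetical name, and the checks mirror the table above rather than the actor's own validation, which may differ.

```python
import json


def build_input(url, max_items=200, crawl_links=False, max_depth=1):
    """Build an input dictionary using the parameter names and defaults from the table."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must start with http:// or https://")
    if max_items < 0:
        raise ValueError("maxItems must be 0 (unlimited) or a positive integer")
    if not 1 <= max_depth <= 3:
        raise ValueError("maxDepth must be between 1 and 3")
    return {
        "url": url,
        "maxItems": max_items,
        "crawlLinks": crawl_links,
        "maxDepth": max_depth,
    }


if __name__ == "__main__":
    # Print the JSON you would paste into the input form's JSON editor.
    print(json.dumps(build_input("https://books.toscrape.com", crawl_links=True), indent=2))
```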

Usage notes

On JavaScript-rendered sites: The actor waits for the page network to idle before extracting, which catches most dynamic content, including images loaded via lazy-loading or through React, Vue, or Angular hydration.

On srcset: Multi-resolution images are captured as both the raw srcset string and a parsed comma-separated list of absolute URLs. The image_url field holds the primary src value.
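
The relationship between srcset and srcset_urls can be sketched as follows. This is an illustration rather than the actor's parser: parse_srcset is a hypothetical name, the bare comma separator is an assumption, and the naive split would mishandle URLs that themselves contain commas.

```python
from urllib.parse import urljoin


def parse_srcset(srcset, page_url):
    """Turn a raw srcset value into a comma-separated string of absolute URLs."""
    urls = []
    for candidate in srcset.split(","):
        parts = candidate.strip().split()
        if parts:
            # The first token is the URL; the rest is a descriptor such as 480w or 2x.
            urls.append(urljoin(page_url, parts[0]))
    return ",".join(urls)
```

For example, with a page at https://example.com/gallery/, the candidate `img-480.jpg 480w` resolves against the page URL, while `/img-800.jpg 800w` resolves against the site root.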

On CSS backgrounds: Computed background-image styles are walked across all visible elements. Inline data URIs are filtered out — the output contains only linkable image URLs.
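
To make the data URI filtering concrete, here is a minimal sketch that pulls linkable URLs out of a single computed background-image value. The regular expression and the name background_urls are assumptions for illustration; the actor itself reads computed styles in the browser.

```python
import re
from urllib.parse import urljoin

# Matches url(...) with optional single or double quotes around the target.
_URL_RE = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""", re.IGNORECASE)


def background_urls(css_value, page_url):
    """Return absolute URLs from a background-image value, skipping data URIs."""
    urls = []
    for _quote, raw in _URL_RE.findall(css_value):
        if not raw or raw.lower().startswith("data:"):
            continue  # inline data is not a linkable image URL
        urls.append(urljoin(page_url, raw))
    return urls
```

A value of `none` or a pure gradient yields an empty list, so elements without an image background contribute no records.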

On depth: With crawlLinks: true and maxDepth: 1, the actor crawls the start page plus any internal pages linked from it. Each page is only visited once. Fan-out is capped at 30 new links per page to avoid runaway crawls.
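
The crawl rules above (depth limit, one visit per page, fan-out cap of 30) can be sketched as a breadth-first traversal. This is a simplified model, not the actor's crawler: crawl_order is a hypothetical name, and links is an in-memory mapping standing in for the internal links actually discovered on each page.

```python
from collections import deque

MAX_NEW_LINKS_PER_PAGE = 30  # fan-out cap described in the usage notes


def crawl_order(start, links, max_depth=1):
    """Return pages in visit order for a depth-limited breadth-first crawl."""
    visited = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        order.append(page)
        if depth >= max_depth:
            continue  # do not follow links from pages at the depth limit
        new_links = 0
        for target in links.get(page, []):
            if target in visited:
                continue  # each page is visited only once
            if new_links >= MAX_NEW_LINKS_PER_PAGE:
                break
            visited.add(target)
            queue.append((target, depth + 1))
            new_links += 1
    return order
```

With max_depth set to 1, the traversal covers the start page and the pages it links to, but nothing those pages link to in turn.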

Example output

{
  "image_url": "https://books.toscrape.com/media/cache/fe/72/fe72f0532301ec28892ae79a629a293c.jpg",
  "page_url": "https://books.toscrape.com",
  "alt_text": "A Light in the Attic",
  "width": "",
  "height": "",
  "srcset": "",
  "srcset_urls": "",
  "loading": "",
  "source_tag": "img",
  "scraped_at": "2026-06-10T12:00:00.000Z"
}

Questions or issues?

Use the feedback fields in the input form or reach out at actor-support@orbtop.com.