Website Logo Product Image Banner Extractor
Pricing
$9.99/month + usage
Extracts logos and brand marks (including favicons); product images and catalog thumbnails; hero / banner images (headers, mastheads); team photos, avatars, and profile pictures; social media graphics (Open Graph, Twitter cards); and icon sets (SVG, PNG, touch icons).
Rating: 0.0 (0)
Developer: BotFlowTech
Actor stats
- Bookmarked: 0
- Total users: 2
- Monthly active users: 1
- Last modified: 8 days ago
🎨 Enhanced Visual Asset Extraction Suite
Extract all visual assets from any website with intelligent categorization and rich metadata, built as a Python Apify Actor for e‑commerce, design, and automation workflows.
🚀 What this actor does
This actor crawls a list of website URLs and extracts visual assets such as:
- 🏢 Logos & brand marks (including favicons)
- 🛍️ Product images and catalog thumbnails
- 🎯 Hero / banner images (headers, mastheads)
- 👥 Team photos, avatars, profile pictures
- 📱 Social media graphics (Open Graph, Twitter cards)
- 🎨 Icon sets (SVG, PNG, touch icons)
- 🖼️ Gallery / slider / carousel images
For each asset, it attempts to infer:
- Category (logo, product, hero, team, social, icon, gallery, other)
- Source (HTML tag / location where it was found)
- Format (SVG/PNG/JPG/WebP/AVIF/ICO/…)
- Optional dimensions (width, height) when enabled
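As a rough illustration of how format and category inference could work, the sketch below guesses the format from the URL's file extension and picks a category from keyword matches in the URL, alt text, and CSS class. The keyword table, `KNOWN_FORMATS`, and both function names are assumptions made for this example; the actor's actual rules are not documented here.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Hypothetical keyword rules (an assumption, not the actor's real rule set).
# Order matters: "logo" is checked before "icon" so that "favicon" counts as a logo.
# The "social" category is omitted because it would follow from the source
# (Open Graph / Twitter meta tags) rather than from keywords.
CATEGORY_KEYWORDS = {
    "logo": ("logo", "brand", "favicon"),
    "product": ("product", "catalog", "thumb"),
    "hero": ("hero", "banner", "masthead"),
    "team": ("team", "avatar", "profile"),
    "icon": ("icon", "sprite"),
    "gallery": ("gallery", "slider", "carousel"),
}

# Hypothetical set of recognized extensions.
KNOWN_FORMATS = {"svg", "png", "jpg", "jpeg", "webp", "avif", "ico", "gif"}


def infer_format(url: str) -> Optional[str]:
    """Guess the image format from the URL path extension, ignoring the query string."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower().lstrip(".")
    if suffix == "jpeg":
        return "jpg"
    return suffix if suffix in KNOWN_FORMATS else None


def infer_category(url: str, alt: str = "", css_class: str = "") -> str:
    """Return the first category whose keyword appears in the URL, alt text, or class."""
    haystack = " ".join((url, alt, css_class)).lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in haystack for keyword in keywords):
            return category
    return "other"
```

Substring matching like this is cheap but coarse. A real classifier would likely also weigh the surrounding DOM context, which this sketch ignores.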
🧠 How it works
The actor:
- Downloads the HTML of each URL.
- Parses the page and extracts images from:
  - `<img>` tags (including lazy‑loading attributes like `data-src`)
  - `<meta property="og:image">` and `name="twitter:image"`
  - `<link rel="icon">`, Apple touch icons, and the default `/favicon.ico`
  - `<picture>` elements and `srcset` attributes
  - Inline CSS `background-image: url(...)` styles
- Converts relative URLs to absolute.
- Applies a rule‑based classifier that looks at the image URL, `alt` text, CSS classes, and local DOM context to categorize each asset.
- Optionally fetches images and uses Pillow to read true width, height, and format.
- De‑duplicates assets by URL and outputs a structured JSON object per input URL.
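The extraction, URL resolution, and de‑duplication steps can be sketched with only the Python standard library. This is a minimal illustration, not the actor's implementation: the `AssetParser` class, the `extract_assets` function, and the source labels other than `img_tag` and `css_background` are assumptions. It also skips the default `/favicon.ico` fallback and the optional Pillow step.

```python
import re
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin

# Matches url(...) inside an inline style, with or without quotes.
CSS_URL_RE = re.compile(r"url\(\s*['\"]?([^'\")]+)['\"]?\s*\)")


class AssetParser(HTMLParser):
    """Collects image references from several of the locations listed above."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.assets = {}  # keyed by absolute URL, which de-duplicates by URL

    def _add(self, raw_url, source, **extra):
        if not raw_url:
            return
        # Resolve relative URLs against the page URL.
        absolute = urljoin(self.base_url, raw_url.strip())
        # Keep only the first occurrence of each URL.
        self.assets.setdefault(absolute, {"src": absolute, "source": source, **extra})

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "img":
            # Fall back to the lazy-loading attribute when src is missing.
            self._add(a.get("src") or a.get("data-src"), "img_tag", alt=a.get("alt") or "")
        if tag in ("img", "source"):
            # Each srcset candidate is "URL [descriptor]"; keep the URL part.
            for candidate in (a.get("srcset") or "").split(","):
                parts = candidate.split()
                if parts:
                    self._add(parts[0], "srcset")
        if tag == "meta" and (a.get("property") == "og:image" or a.get("name") == "twitter:image"):
            self._add(a.get("content"), "meta_tag")
        if tag == "link" and "icon" in (a.get("rel") or "").lower():
            self._add(a.get("href"), "link_icon")
        style = a.get("style") or ""
        if "background-image" in style:
            for match in CSS_URL_RE.findall(style):
                self._add(match, "css_background")


def extract_assets(html: str, base_url: str) -> List[dict]:
    """Parse one HTML document and return its unique assets in document order."""
    parser = AssetParser(base_url)
    parser.feed(html)
    return list(parser.assets.values())
```

Calling `extract_assets(html, "https://example.com/shop/")` would resolve a relative `images/lazy.jpg` to `https://example.com/shop/images/lazy.jpg` and return a repeated logo URL only once.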
⚙️ Input
The actor accepts a JSON object with the following fields:
```json
{
  "urls": ["https://www.apple.com", "https://www.nike.com"],
  "extractDimensions": true,
  "fetchDimensions": false,
  "maxConcurrency": 5
}
```

Input fields
- `urls` (array of strings, required): List of website URLs to extract visual assets from.
- `extractDimensions` (boolean, default: true): If true, detects image format (SVG/PNG/WebP/etc.) from the URL and enriches output with this information.
- `fetchDimensions` (boolean, default: false): If true, downloads each image (under 5 MB) and uses Pillow to read actual width and height. This is slower and uses more resources, but yields precise dimensions.
- `maxConcurrency` (integer, default: 5, min: 1, max: 20): Maximum number of URLs processed concurrently.

Example output (one object per input URL):

```json
{
  "url": "https://example.com",
  "totalAssets": 47,
  "categoryBreakdown": {
    "logo": 3,
    "product": 24,
    "hero": 2,
    "team": 8,
    "icon": 6,
    "social": 2,
    "other": 2
  },
  "assets": [
    {
      "src": "https://example.com/images/logo.svg",
      "category": "logo",
      "format": "svg",
      "source": "img_tag",
      "alt": "Company Logo",
      "class": "site-logo",
      "width": "200",
      "height": "60",
      "loading": "lazy"
    },
    {
      "src": "https://example.com/images/hero-banner.webp",
      "category": "hero",
      "format": "webp",
      "source": "css_background",
      "class": "homepage-hero",
      "width": 1920,
      "height": 1080
    }
  ]
}
```
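When consuming this output, note that in the sample `width` and `height` appear as strings on one asset and as integers on another. The sketch below shows one way a downstream script might normalize those values and recount assets per category. The function names are hypothetical, and the assumption that such mixed types occur in real runs rests only on the sample shown here.

```python
from collections import Counter


def as_int(value):
    """Coerce a width/height value to int; return None if missing or not numeric."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def normalize_assets(result: dict) -> list:
    """Return copies of the assets with width and height coerced to int or None."""
    normalized = []
    for asset in result.get("assets", []):
        item = dict(asset)  # copy so the original result is left untouched
        item["width"] = as_int(asset.get("width"))
        item["height"] = as_int(asset.get("height"))
        normalized.append(item)
    return normalized


def count_by_category(result: dict) -> dict:
    """Recount assets per category, e.g. to cross-check categoryBreakdown."""
    return dict(Counter(a.get("category", "other") for a in result.get("assets", [])))
```

Recounting from `assets` is a simple consistency check against the `categoryBreakdown` and `totalAssets` fields of a full result.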