Broken Image Checker
Pricing
from $10.00 / 1,000 URLs checked
Detect broken or missing images on any public webpage and get a clean, actionable report. Perfect for SEO professionals, webmasters, QA testers, and UX teams.
Developer: Mark Peterson
Last modified: 18 hours ago
Broken Image Checker - Find Missing & Broken Images Across Multiple Pages
Detect broken or missing images across multiple webpages in a single run and get clean, actionable reports. Perfect for SEO professionals, webmasters, QA testers, and UX teams running site-wide audits.
Key Benefits:
- Batch processing - Check hundreds of pages in one run
- Fast detection using HEAD requests (10x faster than GET)
- Sitemap integration - Works seamlessly with Sitemap Fetcher output
- Accurate status codes and error messages
- Aggregated reports with per-page breakdowns
- Proxy support for geo-restricted content
Why use this actor?
Broken images hurt your SEO rankings, user experience, and brand credibility. Manual checking is time-consuming and error-prone, especially on large websites with hundreds of pages and thousands of images.
Problems this solves:
- Site-wide SEO audits - Scan your entire website to identify broken images that harm search rankings and waste crawl budget
- Batch QA testing - Check hundreds of pages before production deployment to catch missing images from CMS migrations
- Site health monitoring - Check many pages in a single run to detect CDN failures and broken external image links
- UX optimization at scale - Find loading errors across your entire site that frustrate users and hurt conversions
Features
- Batch processing - Check images across multiple pages in a single run (up to 10,000 pages)
- Sitemap integration - Connect directly to Sitemap Fetcher output via dataset
- Fast parallel checking - All images on each page checked simultaneously
- HEAD requests first - 10x faster than GET, with automatic GET fallback
- Sequential page processing - Prevents memory overload on large batches
- Accurate error detection - HTTP status codes and detailed error messages
- Aggregated reporting - Per-page breakdowns plus summary statistics
- Graceful error handling - Continues processing even if individual pages fail
- Handles relative and absolute URLs - Automatically converts to absolute
- Proxy support - Works with Apify Proxy or custom proxies
- Progress tracking - Real-time logging of page processing status
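The HEAD-first check with GET fallback listed above can be sketched roughly as follows. This is a minimal illustration, not the actor's actual code: the names `check_image`, `is_broken`, and `urllib_requester` are hypothetical, and treating any status of 400 or above as broken is an assumption based on the examples (404, 500) in this listing.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional, Tuple

# A requester takes (method, url) and returns an HTTP status code,
# or raises an exception when the request itself fails (timeout, DNS, ...).
Requester = Callable[[str, str], int]


def check_image(url: str, request: Requester) -> Tuple[Optional[int], Optional[str]]:
    """Return (status, error) for one image, trying HEAD before GET."""
    last_error: Optional[str] = None
    for method in ("HEAD", "GET"):
        try:
            status = request(method, url)
        except Exception as exc:  # network-level failure
            last_error = str(exc)
            continue
        # Assumption: a HEAD status >= 400 is re-checked with GET,
        # because some servers reject HEAD but serve GET normally.
        if status < 400 or method == "GET":
            return status, None
    return None, last_error


def is_broken(status: Optional[int], error: Optional[str]) -> bool:
    """Assumption: broken means a failed request or a status >= 400."""
    return error is not None or status is None or status >= 400


def urllib_requester(method: str, url: str, timeout: float = 8.0) -> int:
    """One possible requester built on the standard library."""
    req = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses still carry a status code
```

Passing the requester in as a parameter keeps the fallback logic testable without network access; `check_image(url, urllib_requester)` would perform real requests.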
How it works
This actor processes your batch of URLs in the following steps:
- Load URLs: Reads your list of URLs (manual input, file upload, or dataset from Sitemap Fetcher)
- Process sequentially: Checks each page one at a time to avoid memory issues (respects maxPages limit)
- Fetch HTML: Downloads the webpage HTML from each URL
- Parse images: Extracts all `<img>` tags and converts relative URLs to absolute
- Check availability: Tests each image with HEAD request (faster), falls back to GET if needed
- Detect errors: Identifies broken images by HTTP status codes (404, 500, etc.) or request failures
- Aggregate results: Combines per-page results with summary statistics (total broken images, pages affected)
- Output report: Returns a comprehensive JSON report with per-page breakdowns and totals
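The "Parse images" step above can be approximated with the Python standard library. This is a sketch under assumptions, not the actor's implementation: `ImgSrcCollector` and `extract_image_urls` are hypothetical names, only the `src` attribute is read (no `srcset` or lazy-loading attributes), and a `<base>` tag in the page is ignored.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def extract_image_urls(html: str, page_url: str) -> List[str]:
    """Return absolute image URLs found in html, in document order."""
    collector = ImgSrcCollector()
    collector.feed(html)
    collector.close()
    # urljoin resolves relative, root-relative, and protocol-relative URLs
    # against the page URL and leaves absolute URLs unchanged.
    return [urljoin(page_url, src) for src in collector.sources]


if __name__ == "__main__":
    sample = '<img src="/logo.png"><img src="img/b.jpg"/><img alt="no src">'
    print(extract_image_urls(sample, "https://example.com/products"))
```

Each returned URL would then be passed to the availability check.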
Input
{
  "startUrls": [
    { "url": "https://example.com" },
    { "url": "https://example.com/products" },
    { "url": "https://example.com/about" }
  ],
  "maxPages": 100,
  "timeoutMs": 8000,
  "debugLog": false
}
Input parameters
| Field | Type | Description | Required | Default |
|---|---|---|---|---|
| startUrls | array | List of webpage URLs to scan for broken images. Supports manual entry, file upload, or dataset integration | Yes | - |
| maxPages | integer | Maximum number of pages to check (1-10,000). Controls cost and runtime | No | 100 |
| timeoutMs | integer | Timeout for HTTP requests in milliseconds (1000-30000) | No | 8000 |
| proxyConfiguration | object | Proxy settings for requests (Apify Proxy or custom) | No | - |
| debugLog | boolean | Enable detailed logging for troubleshooting | No | false |
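As an illustration of how the defaults and ranges in this table fit together, here is a small validation sketch. The function name `normalize_input` and the returned `urls` key are hypothetical, and rejecting out-of-range values (rather than clamping them) is an assumption; the listing does not say which the actor does.

```python
from typing import Any, Dict

# Defaults and allowed ranges as documented in the table above.
DEFAULTS: Dict[str, Any] = {"maxPages": 100, "timeoutMs": 8000, "debugLog": False}
RANGES = {"maxPages": (1, 10_000), "timeoutMs": (1_000, 30_000)}


def normalize_input(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Apply defaults, validate ranges, and cap the URL list at maxPages."""
    start_urls = raw.get("startUrls") or []
    if not start_urls:
        raise ValueError("startUrls is required and must contain at least one URL")

    config = dict(DEFAULTS)
    config.update({key: raw[key] for key in DEFAULTS if key in raw})

    for field, (low, high) in RANGES.items():
        value = config[field]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not low <= value <= high:
            raise ValueError(f"{field} must be an integer between {low} and {high}")

    # Only the first maxPages URLs are kept, mirroring the documented limit.
    config["urls"] = [entry["url"] for entry in start_urls][: config["maxPages"]]
    config["proxyConfiguration"] = raw.get("proxyConfiguration")
    return config
```

Validating before the run starts makes a typo such as `"timeoutMs": 80` fail fast instead of producing a report full of timeouts.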
Connecting to Sitemap Fetcher
You can pipe URLs directly from the Sitemap Fetcher actor:
- Run the Sitemap Fetcher actor to extract URLs from your sitemap
- In this actor's input, use the requestListSources editor
- Connect the Sitemap Fetcher's dataset as the source
- The actor will automatically extract URLs from the dataset
Output
The actor stores aggregated results in the default dataset:
{
  "pagesChecked": 3,
  "totalImages": 87,
  "brokenImagesByPage": [
    {
      "pageUrl": "https://example.com",
      "imageCount": 25,
      "brokenImages": [],
      "checkedAt": "2025-12-13T10:30:00.000Z",
      "error": null
    },
    {
      "pageUrl": "https://example.com/products",
      "imageCount": 42,
      "brokenImages": [
        { "src": "https://example.com/missing.jpg", "status": 404, "error": null },
        { "src": "https://cdn.example.com/timeout.png", "status": null, "error": "Request timeout" }
      ],
      "checkedAt": "2025-12-13T10:30:15.000Z",
      "error": null
    },
    {
      "pageUrl": "https://example.com/about",
      "imageCount": 20,
      "brokenImages": [],
      "checkedAt": "2025-12-13T10:30:22.000Z",
      "error": null
    }
  ],
  "summary": {
    "totalBrokenImages": 2,
    "pagesWithBrokenImages": 1
  },
  "checkedAt": "2025-12-13T10:30:22.000Z"
}
Output fields
| Field | Type | Description |
|---|---|---|
| pagesChecked | integer | Total number of pages successfully checked |
| totalImages | integer | Total images found across all pages |
| brokenImagesByPage | array | Per-page results with broken images |
| brokenImagesByPage[].pageUrl | string | The webpage URL that was scanned |
| brokenImagesByPage[].imageCount | integer | Number of images found on this page |
| brokenImagesByPage[].brokenImages | array | List of broken images on this page |
| brokenImagesByPage[].brokenImages[].src | string | URL of the broken image |
| brokenImagesByPage[].brokenImages[].status | integer/null | HTTP status code (404, 500, etc.) or null if request failed |
| brokenImagesByPage[].brokenImages[].error | string/null | Error message if the request failed |
| brokenImagesByPage[].checkedAt | string | ISO 8601 timestamp when this page was checked |
| brokenImagesByPage[].error | string/null | Error message if the page failed to load |
| summary | object | Aggregated statistics across all pages |
| summary.totalBrokenImages | integer | Total number of broken images found |
| summary.pagesWithBrokenImages | integer | Number of pages that have at least one broken image |
| checkedAt | string | ISO 8601 timestamp when the run completed |
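To show how these fields relate, the sketch below flattens a report into one row per broken image and recomputes the summary from the per-page data. The helper names are hypothetical, and `REPORT` is a trimmed copy of the example output above rather than real run data.

```python
from typing import Any, Dict, List, Tuple

# Trimmed copy of the example report shown above.
REPORT: Dict[str, Any] = {
    "brokenImagesByPage": [
        {"pageUrl": "https://example.com", "brokenImages": [], "error": None},
        {
            "pageUrl": "https://example.com/products",
            "brokenImages": [
                {"src": "https://example.com/missing.jpg", "status": 404, "error": None},
                {"src": "https://cdn.example.com/timeout.png", "status": None,
                 "error": "Request timeout"},
            ],
            "error": None,
        },
    ],
    "summary": {"totalBrokenImages": 2, "pagesWithBrokenImages": 1},
}


def list_broken_images(report: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Return (pageUrl, src, reason) for every broken image in the report."""
    rows = []
    for page in report["brokenImagesByPage"]:
        for image in page["brokenImages"]:
            # Prefer the error message; fall back to the HTTP status code.
            reason = image["error"] or f"HTTP {image['status']}"
            rows.append((page["pageUrl"], image["src"], reason))
    return rows


def recompute_summary(report: Dict[str, Any]) -> Dict[str, int]:
    """Derive the summary object from the per-page results."""
    pages = report["brokenImagesByPage"]
    return {
        "totalBrokenImages": sum(len(page["brokenImages"]) for page in pages),
        "pagesWithBrokenImages": sum(1 for page in pages if page["brokenImages"]),
    }


if __name__ == "__main__":
    for page_url, src, reason in list_broken_images(REPORT):
        print(f"{page_url}: {src} ({reason})")
```

For the example data, `recompute_summary(REPORT)` matches the `summary` object in the report, which is a quick consistency check when post-processing results.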
Use cases
This actor is perfect for:
- Site-wide SEO audits: Combine with Sitemap Fetcher to scan your entire website (hundreds or thousands of pages) to find broken images that hurt search rankings and waste crawl budget. Get a complete report showing which pages have issues.
- Pre-launch QA testing: Batch check all staging environment pages before deployment to catch missing images from CMS migrations, broken CDN links, or incorrect image paths across your entire site.
- Site health monitoring: Set up scheduled runs with your sitemap to check hundreds of pages on a regular cadence, detecting CDN failures, expired external image links, or accidental deletions soon after they occur.
- CRO/UX optimization at scale: Identify image loading errors across your entire site that frustrate users, increase bounce rates, and hurt conversion rates. Get summary statistics to prioritize fixes.
- Client reporting: Generate comprehensive reports showing broken images across multiple client websites or sections, with aggregated statistics perfect for client presentations.
Proxy configuration
This actor supports both Apify Proxy and custom HTTP/HTTPS/SOCKS proxies.
Using Apify Proxy (Recommended)
{
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "apifyProxyCountry": "US"
  }
}
Using custom proxies
{
  "proxyConfiguration": {
    "proxyUrls": ["http://proxy.example.com:8000"]
  }
}
Proxies are useful when checking geo-restricted content or avoiding rate limits on high-traffic sites.
Performance
- Typical runtime: 5–10 seconds per page for pages with ~50 images
- Batch processing: Checks pages sequentially to avoid memory issues
- Runs efficiently across hundreds or thousands of pages
- Actual performance varies based on:
  - Number of pages in your batch (controlled by `maxPages`)
  - Number of images per page
  - Image server/CDN response times
  - Network latency to target servers
  - Timeout settings
  - Proxy configuration used
Tip: Start with maxPages: 10 to test your URLs, then scale to 100, 1000, or more as needed.
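Because pages are processed sequentially, a rough runtime estimate is simply the page count multiplied by the per-page time. The sketch below uses the 5 to 10 seconds per page quoted above as an assumed range; `estimate_runtime_seconds` is a hypothetical helper, and real runs will differ with the factors listed.

```python
from typing import Tuple


def estimate_runtime_seconds(
    pages: int, seconds_per_page: Tuple[float, float] = (5.0, 10.0)
) -> Tuple[float, float]:
    """Return a (low, high) runtime estimate for a sequential run."""
    low, high = seconds_per_page
    return pages * low, pages * high


if __name__ == "__main__":
    low, high = estimate_runtime_seconds(100)
    # 100 pages -> 500 to 1000 seconds, roughly 8 to 17 minutes.
    print(f"{low / 60:.1f} to {high / 60:.1f} minutes")
```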
Error handling
This actor includes robust error handling:
- Page-level resilience: If one page fails to load, the actor continues processing remaining pages
- Automatic retries: Failed requests are retried with exponential backoff
- HEAD/GET fallback: If HEAD requests fail, the actor automatically tries GET requests
- Detailed logging: All errors are logged with context, including page progress ("Processing page 5 of 100")
- Error reporting: Failed pages are included in output with error details for troubleshooting
- Graceful failure: Successfully processed pages are reported even if some pages fail
- Timeout handling: Configurable timeouts prevent hanging on slow servers
- URL validation: Invalid URLs are logged and skipped rather than crashing the run
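The retry behaviour described above can be illustrated with a generic exponential-backoff helper. The attempt count, base delay, and doubling factor here are assumptions, since the listing does not state the actor's actual values, and `retry_with_backoff` is a hypothetical name.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying on any exception with doubling delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

A caller would wrap a single request, for example `retry_with_backoff(lambda: fetch_page(url))`, where `fetch_page` stands in for whatever function downloads the page. The injectable `sleep` parameter exists only so the delays can be observed in tests without waiting.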
Limitations
- Only checks publicly accessible webpages (no authentication support)
- Maximum 10,000 pages per run (controlled by `maxPages`)
- Maximum timeout of 30 seconds per image request
- Sequential page processing (not parallel) to avoid memory issues
- Images injected by JavaScript are detected only if they already appear in the downloaded HTML, because pages are not rendered in a browser (consider using Playwright for dynamic sites)
- Does not follow pagination or crawl dynamically - provide all URLs via `startUrls` or connect a dataset
Tips for best results
- Start small, then scale: Test with `maxPages: 10` first to verify your URLs work, then increase to 100, 1000, etc.
- Combine with Sitemap Fetcher: Use the Sitemap Fetcher actor first to get all your site URLs, then connect its dataset to this actor for comprehensive coverage
- Use appropriate timeouts: Increase `timeoutMs` to 15000+ for slow CDNs or international servers
- Enable debug logging: Set `debugLog: true` when troubleshooting to see detailed per-page progress
- Use proxies for geo-content: Some CDNs serve different images based on location - use residential proxies to test from specific countries
- Monitor summary statistics: Check `summary.totalBrokenImages` and `summary.pagesWithBrokenImages` for quick insights before diving into per-page details
- Schedule regular runs: Set up scheduled runs to monitor site health continuously across all your important pages
Related actors
Check out these related actors for comprehensive site auditing:
- URL Canonicalizer + Redirect Resolver: Check for redirect chains and canonical URL issues
- Sitemap Fetcher + Page Title Extractor: Analyze your sitemap and page metadata
- URL Metadata Extractor: Extract Open Graph images and metadata from multiple pages
Support & feedback
Need help or have suggestions?
- Issues: Create an issue in the GitHub repository
- Email: Contact through Apify platform messaging
Changelog
Version 1.0.9 (2025-12-13)
- Batch processing support - Check multiple pages in a single run (up to 10,000 pages)
- Sitemap integration - requestListSources editor supports dataset connections
- Aggregated reporting - Per-page results with summary statistics
- Graceful error handling - Continues processing even if individual pages fail
- Progress logging - Real-time page processing status
- Added `startUrls` array input (replaces single `url`)
- Added `maxPages` limit for cost control
Version 1.0.0 (2025-12-12)
- Initial release
- Fast parallel image checking with HEAD/GET fallback
- Support for relative and absolute URLs
- Proxy configuration support
- Comprehensive error reporting
Made with care for the web development community
Part of the Apify Actor Portfolio collection