AI Image Intelligence
Pricing
from $10.00 / 1,000 results
Make every image work harder for your business. Auto-generate SEO-optimized metadata, accessibility-compliant alt text, and rich descriptions using AI. Perfect for e-commerce, content sites, and stock agencies processing hundreds of images daily. $0.01/image.
Developer: Marielise
Extract comprehensive SEO metadata, accessibility-compliant alt text, visual analysis, and custom fields from any image using state-of-the-art vision AI models like GPT-4o, Claude, and Gemini.
Why Use This Actor?
Save hours of manual work - Automatically generate SEO-optimized alt text, titles, descriptions, and keywords for thousands of images. One API call extracts 50+ data points including colors, objects, faces, text (OCR), EXIF metadata, and custom fields you define.
Boost your SEO rankings - Search engines can't "see" images. This Actor generates the metadata they need to understand and rank your visual content. Get professionally written alt text, Open Graph descriptions, and keyword-rich titles in any language.
Support accessibility compliance - Generate alt text that describes image content for screen readers, which helps meet the WCAG 2.1 text-alternative requirement (Success Criterion 1.1.1, Non-text Content). A useful foundation for ADA compliance and inclusive web design.
Flexible model selection - Choose the AI model that fits your needs and budget. Use GPT-4o for highest accuracy, Gemini Flash for speed, or Claude for nuanced descriptions.
Features
- SEO Content Generation - Alt text, titles, descriptions, keywords, Open Graph meta, and SEO-friendly filenames
- Visual Analysis - Dominant colors (hex), brightness, contrast, saturation, sharpness, noise levels
- Object Detection - Subjects, objects, landmarks, logos, and scene classification
- Text Extraction (OCR) - Extract text visible in images
- Face Detection - Detect faces with position, estimated age, gender, and emotion
- EXIF Metadata - Camera, lens, ISO, aperture, shutter speed, GPS coordinates (when available)
- Custom Fields - Define your own extraction schema for domain-specific data
- Multi-language Support - Generate content in any language (en, es, fr, de, ja, zh, etc.)
- Multiple AI Providers - OpenAI GPT-4o, Anthropic Claude, Google Gemini, Groq Llama
What Data You Get
| Category | Fields | Description |
|---|---|---|
| Identification | subjects, category, tags, objects, landmarks, logos | What's in the image |
| SEO | alt, title, description, keywords, ogDescription, suggestedFilename | Search engine optimized content |
| Visual | colors (hex), brightness, contrast, saturation, sharpness, noise | Technical visual metrics |
| Content | text (OCR), faces, watermarks, scene | Content detection |
| File | width, height, aspectRatio, format, sizeBytes, colorSpace | File metadata |
| EXIF | camera, lens, ISO, aperture, shutterSpeed, focalLength, flash, dateTaken, GPS | Camera metadata |
| Custom | user-defined | Your custom extraction fields |
Getting Started
Step 1: Set Up API Keys
This Actor requires an API key from at least one AI provider. Set the environment variable in Actor Settings > Environment Variables:
| Provider | Environment Variable | Get API Key |
|---|---|---|
| OpenAI (default) | OPENAI_API_KEY | platform.openai.com/api-keys |
| Anthropic | ANTHROPIC_API_KEY | console.anthropic.com |
| Google | GOOGLE_API_KEY | aistudio.google.com/apikey |
| Groq | GROQ_API_KEY | console.groq.com |
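Before starting a run, it can help to confirm which provider keys are actually set. The following is a minimal sketch, not part of the Actor: the variable names come from the table above, while the lowercase provider identifiers and the `configured_providers` helper are assumptions made for illustration.

```python
import os

# Provider -> environment variable, copied from the table above.
# The lowercase provider identifiers are an assumption for this sketch.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "groq": "GROQ_API_KEY",
}


def configured_providers(env=None):
    """Return the providers whose API key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [
        provider
        for provider, var in PROVIDER_ENV_VARS.items()
        if env.get(var, "").strip()
    ]


if __name__ == "__main__":
    # Lists whichever providers are configured in the current environment.
    print(configured_providers())
```

Passing a plain dictionary instead of the real environment makes the check easy to try locally.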
Step 2: Run the Actor
Minimum input - just provide an image URL:
{"imageUrl": "https://example.com/photo.jpg"}
Or use base64-encoded image data:
{"imageBase64": "/9j/4AAQSkZJRgABAQAAAQABAAD..."}
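If the image is a local file rather than a public URL, the base64 string can be produced with the standard library. This is a sketch under the assumption that the Actor expects plain base64 without a `data:` URI prefix, as the example above suggests; `base64_input` is a hypothetical helper name.

```python
import base64
import json


def base64_input(image_bytes: bytes) -> dict:
    """Wrap raw image bytes in the imageBase64 input field."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"imageBase64": encoded}


# Only the first three bytes of a JPEG file, to keep the output short.
# A real call would pass the whole file, e.g. Path("photo.jpg").read_bytes().
payload = base64_input(b"\xff\xd8\xff")
print(json.dumps(payload))  # {"imageBase64": "/9j/"}
```

The `/9j/` prefix in the output is the same one that opens the example string above, which is what base64-encoded JPEG data normally starts with.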
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| imageUrl | string | One of | - | Public URL of the image to analyze |
| imageBase64 | string | One of | - | Base64-encoded image data |
| model | string | No | gpt-4o | AI model to use (see Model Selection) |
| language | string | No | en | Language for generated text content |
| imageContext | string | No | - | Context to guide the analysis |
| customSchema | object | No | - | Custom fields to extract |
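The parameter table can be turned into a small input builder. The sketch below is illustrative only: `build_input` is a hypothetical helper, and it assumes that "One of" means exactly one of `imageUrl` and `imageBase64` and that custom fields are declared as empty-string placeholders, as in the examples in this README.

```python
def build_input(image_url=None, image_base64=None, model="gpt-4o",
                language="en", image_context=None, custom_fields=None):
    """Assemble an input object using the parameter names from the table."""
    # Assumption: exactly one image source must be given.
    if (image_url is None) == (image_base64 is None):
        raise ValueError("provide exactly one of image_url or image_base64")

    run_input = {"model": model, "language": language}
    if image_url is not None:
        run_input["imageUrl"] = image_url
    else:
        run_input["imageBase64"] = image_base64
    if image_context:
        run_input["imageContext"] = image_context
    if custom_fields:
        # Each custom field maps to an empty string placeholder.
        run_input["customSchema"] = {name: "" for name in custom_fields}
    return run_input


print(build_input(image_url="https://example.com/product.jpg",
                  language="es",
                  custom_fields=["material", "style"]))
```

Catching a missing or duplicated image source locally avoids paying for a run that would fail on input validation.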
Model Selection
Built-in Model Aliases
Use these shorthand names for common models:
| Alias | Full Model | Provider | Best For |
|---|---|---|---|
| gpt-4o | openai:gpt-4o | OpenAI | Highest accuracy, best overall |
| gpt-4o-mini | openai:gpt-4o-mini | OpenAI | Cost-effective, fast |
| claude-sonnet | anthropic:claude-sonnet-4-20250514 | Anthropic | Nuanced descriptions |
| gemini-flash | google:gemini-2.0-flash | Google | Fastest, budget-friendly |
| gemini-pro | google:gemini-1.5-pro | Google | High quality, longer context |
Using Other Models
Specify any vision-capable model with provider:model format:
{"imageUrl": "https://example.com/photo.jpg","model": "groq:llama-3.2-90b-vision-preview"}
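To make the two naming styles concrete, here is one way alias lookup and `provider:model` parsing could work. This is a sketch of the idea, not the Actor's actual resolution logic; the alias table is copied from above and `resolve_model` is a hypothetical name.

```python
# Aliases copied from the "Built-in Model Aliases" table.
MODEL_ALIASES = {
    "gpt-4o": "openai:gpt-4o",
    "gpt-4o-mini": "openai:gpt-4o-mini",
    "claude-sonnet": "anthropic:claude-sonnet-4-20250514",
    "gemini-flash": "google:gemini-2.0-flash",
    "gemini-pro": "google:gemini-1.5-pro",
}


def resolve_model(name: str) -> tuple[str, str]:
    """Expand an alias if there is one, then split provider from model."""
    full = MODEL_ALIASES.get(name, name)
    provider, sep, model = full.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"expected an alias or provider:model, got {name!r}")
    return provider, model


print(resolve_model("gemini-flash"))
# ('google', 'gemini-2.0-flash')
print(resolve_model("groq:llama-3.2-90b-vision-preview"))
# ('groq', 'llama-3.2-90b-vision-preview')
```

The provider part is what determines which API key from Step 1 is needed.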
Examples
Basic Analysis (Default Settings)
{"imageUrl": "https://images.unsplash.com/photo-1501854140801-50d01698950b"}
Multi-language SEO (Spanish)
{"imageUrl": "https://example.com/product.jpg","language": "es"}
E-commerce Product Photography
{"imageUrl": "https://example.com/furniture.jpg","language": "en","imageContext": "E-commerce product photography for a modern furniture store. Focus on material quality and design style.","customSchema": {"productType": "","material": "","style": "","color": "","suitableRooms": ""}}
Wildlife Photography with Species Identification
{"imageUrl": "https://example.com/bird.jpg","imageContext": "Nature and wildlife photography. Identify bird species with scientific name if possible.","customSchema": {"species": "","scientificName": "","habitat": "","behavior": "","conservationStatus": ""}}
Real Estate Listing
{"imageUrl": "https://example.com/room.jpg","imageContext": "Real estate listing photography for luxury homes.","customSchema": {"roomType": "","squareFootage": "","features": "","lightingQuality": "","viewDescription": ""}}
Food Photography
{"imageUrl": "https://example.com/dish.jpg","imageContext": "Restaurant menu photography.","customSchema": {"dishName": "","cuisine": "","mainIngredients": "","dietaryInfo": "","presentationStyle": ""}}
Fast Analysis with Gemini
{"imageUrl": "https://example.com/photo.jpg","model": "gemini-flash"}
Output Example
{
  "identification": {
    "subjects": ["mountain landscape", "lake"],
    "category": "nature",
    "tags": ["mountain", "lake", "nature", "landscape", "scenic", "outdoors", "wilderness"],
    "objects": ["mountain", "lake", "trees", "sky", "clouds"],
    "landmarks": ["Rocky Mountains"],
    "logos": []
  },
  "seo": {
    "language": "en",
    "alt": "Majestic mountain landscape with crystal-clear lake reflecting snow-capped peaks",
    "title": "Rocky Mountain Lake Reflection",
    "description": "A stunning natural landscape featuring a pristine mountain lake perfectly reflecting the surrounding snow-capped peaks and evergreen forests.",
    "keywords": ["mountain landscape", "lake reflection", "Rocky Mountains", "nature photography", "scenic view"],
    "ogDescription": "Discover the breathtaking beauty of Rocky Mountain wilderness with this stunning lake reflection photograph.",
    "suggestedFilename": "rocky-mountain-lake-reflection"
  },
  "visual": {
    "colors": ["#1E90FF", "#228B22", "#FFFFFF", "#87CEEB", "#2F4F4F"],
    "brightness": 0.65,
    "contrast": 0.7,
    "saturation": 0.6,
    "sharpness": 0.8,
    "noise": 0.1
  },
  "file": {
    "width": 1920,
    "height": 1080,
    "aspectRatio": 1.78,
    "format": "JPEG",
    "sizeBytes": 0,
    "colorSpace": "sRGB",
    "orientation": 1
  },
  "content": {
    "text": "",
    "faces": [],
    "watermark": "",
    "scene": "outdoor landscape"
  },
  "exif": {
    "camera": "",
    "lens": "",
    "iso": 0,
    "aperture": 0,
    "shutterSpeed": "",
    "focalLength": 0,
    "flash": false,
    "dateTaken": "",
    "gps": { "lat": 0, "lng": 0 }
  },
  "customFields": {},
  "configApplied": {
    "model": "gpt-4o",
    "language": "en",
    "context": ""
  },
  "usage": {
    "tokens": {
      "input": 1250,
      "output": 450,
      "total": 1700
    },
    "cost": 0.008925
  },
  "processedAt": "2025-01-15T10:30:00.000Z"
}
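The `seo` block of a result maps fairly directly onto HTML. The sketch below shows one possible way to render it, assuming the field names from the output example; `render_image_html` is a hypothetical helper, and the exact markup a site needs will vary.

```python
from html import escape


def render_image_html(src: str, result: dict) -> str:
    """Build an <img> tag and an Open Graph meta tag from the seo fields."""
    seo = result["seo"]
    # escape() also escapes quotes, so values are safe inside attributes.
    img = '<img src="{}" alt="{}" title="{}">'.format(
        escape(src), escape(seo["alt"]), escape(seo["title"])
    )
    og = '<meta property="og:description" content="{}">'.format(
        escape(seo["ogDescription"])
    )
    return img + "\n" + og


sample = {
    "seo": {
        "alt": "Mountain lake reflecting snow-capped peaks",
        "title": "Rocky Mountain Lake Reflection",
        "ogDescription": "Peaks & evergreen forests mirrored in a lake.",
    }
}
print(render_image_html("https://example.com/lake.jpg", sample))
```

Escaping matters here because generated text can contain quotes or ampersands that would otherwise break the attribute values.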
Use Cases
E-commerce & Product Catalogs
- Generate SEO-optimized product image alt text and descriptions
- Extract product attributes (color, material, style) for filtering
- Create consistent metadata across thousands of product images
- Improve product discovery in image search
Content Management & Publishing
- Automate alt text generation for blog posts and articles
- Generate Open Graph descriptions for social media sharing
- Create keyword-rich image titles for better SEO
- Bulk process image libraries for CMS migration
Accessibility Compliance
- Meet WCAG 2.1 Level AA requirements with accurate alt text
- Ensure ADA compliance for commercial websites
- Create inclusive content for screen reader users
- Audit existing images for accessibility gaps
Stock Photography & Media Libraries
- Auto-tag and categorize large image collections
- Generate searchable metadata for asset management
- Extract technical details for quality filtering
- Create multilingual descriptions for global markets
Real Estate & Property
- Describe room features and amenities automatically
- Extract property characteristics from listing photos
- Generate virtual tour descriptions
- Create multilingual listings for international buyers
Wildlife & Nature Photography
- Identify species with scientific names
- Extract habitat and behavior information
- Generate educational content for nature databases
- Support conservation research with automated tagging
Pricing
This Actor uses Pay Per Event (PPE) pricing:
- $0.01 per image analyzed
- No monthly fees or minimums
- Pay only for what you use
- AI API costs are included in the price
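For budgeting, the per-image price scales linearly. A small sketch of the arithmetic, using `Decimal` to avoid floating-point rounding; `estimate_cost` is a hypothetical helper and only models the $0.01 per-image event listed above.

```python
from decimal import Decimal

PRICE_PER_IMAGE = Decimal("0.01")  # USD, from the pricing list above


def estimate_cost(image_count: int) -> Decimal:
    """Return the estimated charge in USD for a number of analyzed images."""
    if image_count < 0:
        raise ValueError("image_count must not be negative")
    return PRICE_PER_IMAGE * image_count


print(estimate_cost(1000))  # 10.00
```

The result for 1,000 images matches the "from $10.00 / 1,000 results" figure at the top of this page.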
Technical Specifications
- Supported formats: JPEG, PNG, GIF, BMP, WEBP
- Maximum image size: 20MB
- Processing time: Typically 2-10 seconds per image
- Memory requirement: 256MB - 4096MB (auto-scaled)
- API timeout: 5 minutes maximum
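Format and size limits can be checked locally before an image is submitted. The following sketch identifies the supported formats by their well-known file signatures. It assumes "20MB" means 20 × 1024 × 1024 bytes, and `detect_format` and `check_image` are hypothetical helpers, not part of the Actor.

```python
MAX_BYTES = 20 * 1024 * 1024  # assumption: "20MB" read as 20 MiB


def detect_format(data: bytes):
    """Return the format name based on the file's leading bytes, or None."""
    if data.startswith(b"\xff\xd8\xff"):
        return "JPEG"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "GIF"
    if data.startswith(b"BM"):
        return "BMP"
    # WEBP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "WEBP"
    return None


def check_image(data: bytes) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if detect_format(data) is None:
        problems.append("unsupported or unrecognized format")
    if len(data) > MAX_BYTES:
        problems.append("larger than 20 MB")
    return problems


print(check_image(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16))  # []
```

Checking signatures rather than file extensions catches files that were renamed without being converted.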
Troubleshooting
"Missing API key" Error
Set the required environment variable in Actor Settings > Environment Variables. For the default model (gpt-4o), you need OPENAI_API_KEY.
"Could not fetch image" Error
Ensure the image URL is:
- Publicly accessible (no authentication required)
- Using HTTPS protocol
- Not blocked by a firewall, hotlink protection, or bot filtering
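Some of these conditions can be caught before a run by inspecting the URL itself. This is a rough sketch with a hypothetical `url_problems` helper. It checks only the URL's shape and cannot tell whether the server will actually serve the image.

```python
from urllib.parse import urlparse


def url_problems(url: str) -> list[str]:
    """Return a list of likely problems with an image URL."""
    parts = urlparse(url)
    problems = []
    if parts.scheme != "https":
        problems.append("URL does not use HTTPS")
    if not parts.hostname:
        problems.append("URL has no host")
    if parts.username or parts.password:
        # Embedded credentials suggest the image is not publicly accessible.
        problems.append("URL embeds credentials")
    return problems


print(url_problems("http://example.com/photo.jpg"))
# ['URL does not use HTTPS']
```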
Slow Processing
- Use the gemini-flash model for faster results
- Ensure images are reasonably sized (< 5MB for optimal speed)
- Check your AI provider's rate limits
Unexpected Results
- Add imageContext to guide the analysis
- Use domain-specific context for better accuracy
- Try a different model for varied perspectives
FAQ
Q: Which model should I use?
A: Start with gpt-4o (default) for best accuracy. Use gemini-flash for speed and cost savings. Use claude-sonnet for nuanced, creative descriptions.
Q: Can I process multiple images?
A: Yes! Create multiple Actor runs in parallel, or use the Apify API to batch process images programmatically.
Q: What languages are supported?
A: Any language supported by the underlying AI model. Common languages: English (en), Spanish (es), French (fr), German (de), Japanese (ja), Chinese (zh), Portuguese (pt), Italian (it), Korean (ko), Arabic (ar).
Q: Are my images stored?
A: No. Images are processed in memory and not stored. Only the analysis results are saved to your dataset.
Q: Can I use my own AI API keys?
A: Yes! Set your API keys in the Actor's environment variables to use your own accounts and quotas.
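The parallel-runs answer above can be sketched as a small fan-out helper. This is an illustration only: `analyze_many` and `fake_run` are hypothetical names, and the function that starts a run is injected so the example works without network access. In practice `run_one` would wrap an Apify API call that starts one Actor run and returns its result.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_many(image_urls, run_one, max_workers=4):
    """Start one run per image in parallel and return results in input order."""
    inputs = [{"imageUrl": url} for url in image_urls]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the inputs.
        return list(pool.map(run_one, inputs))


def fake_run(run_input):
    """Stand-in for a real Actor run; returns a placeholder result."""
    return {"source": run_input["imageUrl"], "seo": {"alt": "placeholder"}}


results = analyze_many(
    ["https://example.com/a.jpg", "https://example.com/b.jpg"], fake_run
)
print([r["source"] for r in results])
# ['https://example.com/a.jpg', 'https://example.com/b.jpg']
```

Keeping `max_workers` modest is a reasonable default, since the troubleshooting notes above point out that AI provider rate limits can slow processing.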
Support
- Documentation: Apify Actor Documentation
- Issues: Report bugs or request features via Apify Console
- Updates: Follow the Actor for version updates and improvements
Changelog
v1.0.0
- Initial release
- Support for OpenAI, Anthropic, Google, and Groq providers
- Comprehensive image analysis with 50+ output fields
- Custom field extraction support
- Multi-language output
- PPE pricing model

