AI Image Intelligence

Make every image work harder for your business. Auto-generate SEO-optimized metadata, accessibility-compliant alt text, and rich descriptions using AI. Perfect for e-commerce, content sites, and stock agencies processing hundreds of images daily. $0.01/image.

Pricing

from $10.00 / 1,000 results

Developer

Marielise

Maintained by Community

Extract comprehensive SEO metadata, accessibility-compliant alt text, visual analysis, and custom fields from any image using state-of-the-art vision AI models like GPT-4o, Claude, and Gemini.

Why Use This Actor?

Save hours of manual work - Automatically generate SEO-optimized alt text, titles, descriptions, and keywords for thousands of images. One API call extracts 50+ data points including colors, objects, faces, text (OCR), EXIF metadata, and custom fields you define.

Boost your SEO rankings - Search engines can't "see" images. This Actor generates the metadata they need to understand and rank your visual content. Get ready-to-publish alt text, Open Graph descriptions, and keyword-rich titles in any language the selected model supports.

Ensure accessibility compliance - Work toward WCAG 2.1 conformance (Success Criterion 1.1.1, Non-text Content) with AI-generated alt text that describes image content for screen readers. Useful for ADA compliance and inclusive web design.

Flexible model selection - Choose the AI model that fits your needs and budget. Use GPT-4o for highest accuracy, Gemini Flash for speed, or Claude for nuanced descriptions.

Features

  • SEO Content Generation - Alt text, titles, descriptions, keywords, Open Graph meta, and SEO-friendly filenames
  • Visual Analysis - Dominant colors (hex), brightness, contrast, saturation, sharpness, noise levels
  • Object Detection - Subjects, objects, landmarks, logos, and scene classification
  • Text Extraction (OCR) - Extract text visible in images
  • Face Detection - Detect faces with position, estimated age, gender, and emotion
  • EXIF Metadata - Camera, lens, ISO, aperture, shutter speed, GPS coordinates (when available)
  • Custom Fields - Define your own extraction schema for domain-specific data
  • Multi-language Support - Generate content in any language (en, es, fr, de, ja, zh, etc.)
  • Multiple AI Providers - OpenAI GPT-4o, Anthropic Claude, Google Gemini, Groq Llama

What Data You Get

| Category | Fields | Description |
| --- | --- | --- |
| Identification | subjects, category, tags, objects, landmarks, logos | What's in the image |
| SEO | alt, title, description, keywords, ogDescription, suggestedFilename | Search engine optimized content |
| Visual | colors (hex), brightness, contrast, saturation, sharpness, noise | Technical visual metrics |
| Content | text (OCR), faces, watermarks, scene | Content detection |
| File | width, height, aspectRatio, format, sizeBytes, colorSpace | File metadata |
| EXIF | camera, lens, ISO, aperture, shutterSpeed, focalLength, flash, dateTaken, GPS | Camera metadata |
| Custom | user-defined | Your custom extraction fields |

Getting Started

Step 1: Set Up API Keys

This Actor requires an API key from at least one AI provider. Set the environment variable in Actor Settings > Environment Variables:

| Provider | Environment Variable | Get API Key |
| --- | --- | --- |
| OpenAI (default) | OPENAI_API_KEY | platform.openai.com/api-keys |
| Anthropic | ANTHROPIC_API_KEY | console.anthropic.com |
| Google | GOOGLE_API_KEY | aistudio.google.com/apikey |
| Groq | GROQ_API_KEY | console.groq.com |

Step 2: Run the Actor

Minimum input - just provide an image URL:

{
  "imageUrl": "https://example.com/photo.jpg"
}

Or use base64-encoded image data:

{
  "imageBase64": "/9j/4AAQSkZJRgABAQAAAQABAAD..."
}
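Producing the `imageBase64` value from a local file can be sketched in Python as follows. This is a minimal sketch, not the Actor's own code: the helper names `encode_image_bytes` and `build_base64_input` are hypothetical, and it assumes the Actor expects plain base64 with no `data:` URI prefix, as the sample value above suggests (`/9j/` is how JPEG data begins once encoded).

```python
import base64
from pathlib import Path


def encode_image_bytes(data: bytes) -> str:
    """Return the standard base64 text form of raw image bytes."""
    return base64.b64encode(data).decode("ascii")


def build_base64_input(path: str) -> dict:
    """Read a local image file and wrap it in the Actor's input shape.

    Assumption: plain base64 without a "data:image/...;base64," prefix.
    """
    return {"imageBase64": encode_image_bytes(Path(path).read_bytes())}
```

Calling `build_base64_input("photo.jpg")` returns a dict that can be serialized with `json.dumps` and used as the run input.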

Input Parameters

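| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| imageUrl | string | Yes, unless imageBase64 is set | - | Public URL of the image to analyze |
| imageBase64 | string | Yes, unless imageUrl is set | - | Base64-encoded image data |
| model | string | No | gpt-4o | AI model to use (see Model Selection) |
| language | string | No | en | Language for generated text content |
| imageContext | string | No | - | Context to guide the analysis |
| customSchema | object | No | - | Custom fields to extract |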
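The parameter rules above can be checked before a run is started. The sketch below is a hypothetical client-side helper (`normalize_input` is not part of the Actor) that applies the documented defaults and rejects input with no image source. How the Actor behaves when both `imageUrl` and `imageBase64` are supplied is not documented here, so the sketch allows it.

```python
DEFAULTS = {"model": "gpt-4o", "language": "en"}
KNOWN_KEYS = {
    "imageUrl", "imageBase64", "model",
    "language", "imageContext", "customSchema",
}
STRING_KEYS = ("imageUrl", "imageBase64", "model", "language", "imageContext")


def normalize_input(raw: dict) -> dict:
    """Validate a raw input dict and fill in the documented defaults."""
    unknown = set(raw) - KNOWN_KEYS
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    # At least one image source is required.
    if not raw.get("imageUrl") and not raw.get("imageBase64"):
        raise ValueError("Provide imageUrl or imageBase64")
    for key in STRING_KEYS:
        if key in raw and not isinstance(raw[key], str):
            raise ValueError(f"{key} must be a string")
    if "customSchema" in raw and not isinstance(raw["customSchema"], dict):
        raise ValueError("customSchema must be an object")
    # Explicit values override the defaults.
    return {**DEFAULTS, **raw}
```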

Model Selection

Built-in Model Aliases

Use these shorthand names for common models:

| Alias | Full Model | Provider | Best For |
| --- | --- | --- | --- |
| gpt-4o | openai:gpt-4o | OpenAI | Highest accuracy, best overall |
| gpt-4o-mini | openai:gpt-4o-mini | OpenAI | Cost-effective, fast |
| claude-sonnet | anthropic:claude-sonnet-4-20250514 | Anthropic | Nuanced descriptions |
| gemini-flash | google:gemini-2.0-flash | Google | Fastest, budget-friendly |
| gemini-pro | google:gemini-1.5-pro | Google | High quality, longer context |

Using Other Models

Specify any vision-capable model with provider:model format:

{
  "imageUrl": "https://example.com/photo.jpg",
  "model": "groq:llama-3.2-90b-vision-preview"
}
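To see which API key a given `model` value needs, the alias table and the provider:model format can be combined. The following is an illustrative sketch only: the two dictionaries copy the tables in this README, while `resolve_model` and `required_env_var` are hypothetical helpers, and the Actor's real resolution logic may differ.

```python
MODEL_ALIASES = {
    "gpt-4o": "openai:gpt-4o",
    "gpt-4o-mini": "openai:gpt-4o-mini",
    "claude-sonnet": "anthropic:claude-sonnet-4-20250514",
    "gemini-flash": "google:gemini-2.0-flash",
    "gemini-pro": "google:gemini-1.5-pro",
}

PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "groq": "GROQ_API_KEY",
}


def resolve_model(name: str) -> tuple[str, str]:
    """Expand an alias, then split "provider:model" into its two parts."""
    full = MODEL_ALIASES.get(name, name)
    provider, sep, model = full.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"Expected an alias or provider:model, got {name!r}")
    return provider, model


def required_env_var(name: str) -> str:
    """Return the environment variable the chosen model's provider needs."""
    provider, _ = resolve_model(name)
    if provider not in PROVIDER_ENV_VARS:
        raise ValueError(f"Unknown provider: {provider!r}")
    return PROVIDER_ENV_VARS[provider]
```

For example, `required_env_var("groq:llama-3.2-90b-vision-preview")` yields `GROQ_API_KEY`, so that variable must be set before the run.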

Examples

Basic Analysis (Default Settings)

{
  "imageUrl": "https://images.unsplash.com/photo-1501854140801-50d01698950b"
}

Multi-language SEO (Spanish)

{
  "imageUrl": "https://example.com/product.jpg",
  "language": "es"
}

E-commerce Product Photography

{
  "imageUrl": "https://example.com/furniture.jpg",
  "language": "en",
  "imageContext": "E-commerce product photography for a modern furniture store. Focus on material quality and design style.",
  "customSchema": {
    "productType": "",
    "material": "",
    "style": "",
    "color": "",
    "suitableRooms": ""
  }
}

Wildlife Photography with Species Identification

{
  "imageUrl": "https://example.com/bird.jpg",
  "imageContext": "Nature and wildlife photography. Identify bird species with scientific name if possible.",
  "customSchema": {
    "species": "",
    "scientificName": "",
    "habitat": "",
    "behavior": "",
    "conservationStatus": ""
  }
}

Real Estate Listing

{
  "imageUrl": "https://example.com/room.jpg",
  "imageContext": "Real estate listing photography for luxury homes.",
  "customSchema": {
    "roomType": "",
    "squareFootage": "",
    "features": "",
    "lightingQuality": "",
    "viewDescription": ""
  }
}

Food Photography

{
  "imageUrl": "https://example.com/dish.jpg",
  "imageContext": "Restaurant menu photography.",
  "customSchema": {
    "dishName": "",
    "cuisine": "",
    "mainIngredients": "",
    "dietaryInfo": "",
    "presentationStyle": ""
  }
}
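The domain examples above share one shape: an image URL, a context sentence, and a `customSchema` whose keys are the fields to extract and whose values are empty strings. A small helper can build that shape from a list of field names. This is a sketch, and `build_custom_input` is a hypothetical name, not something the Actor provides.

```python
def build_custom_input(image_url: str, context: str, fields: list[str]) -> dict:
    """Build an Actor input with an empty-string placeholder per custom field."""
    if not fields:
        raise ValueError("At least one custom field is required")
    if len(set(fields)) != len(fields):
        raise ValueError("Custom field names must be unique")
    return {
        "imageUrl": image_url,
        "imageContext": context,
        # Each key names a field to extract; the value is left empty.
        "customSchema": {name: "" for name in fields},
    }
```

Calling it with the five food-photography field names reproduces the Food Photography input shown above.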

Fast Analysis with Gemini

{
"imageUrl": "https://example.com/photo.jpg",
"model": "gemini-flash"
}

Output Example

{
  "identification": {
    "subjects": ["mountain landscape", "lake"],
    "category": "nature",
    "tags": ["mountain", "lake", "nature", "landscape", "scenic", "outdoors", "wilderness"],
    "objects": ["mountain", "lake", "trees", "sky", "clouds"],
    "landmarks": ["Rocky Mountains"],
    "logos": []
  },
  "seo": {
    "language": "en",
    "alt": "Majestic mountain landscape with crystal-clear lake reflecting snow-capped peaks",
    "title": "Rocky Mountain Lake Reflection",
    "description": "A stunning natural landscape featuring a pristine mountain lake perfectly reflecting the surrounding snow-capped peaks and evergreen forests.",
    "keywords": ["mountain landscape", "lake reflection", "Rocky Mountains", "nature photography", "scenic view"],
    "ogDescription": "Discover the breathtaking beauty of Rocky Mountain wilderness with this stunning lake reflection photograph.",
    "suggestedFilename": "rocky-mountain-lake-reflection"
  },
  "visual": {
    "colors": ["#1E90FF", "#228B22", "#FFFFFF", "#87CEEB", "#2F4F4F"],
    "brightness": 0.65,
    "contrast": 0.7,
    "saturation": 0.6,
    "sharpness": 0.8,
    "noise": 0.1
  },
  "file": {
    "width": 1920,
    "height": 1080,
    "aspectRatio": 1.78,
    "format": "JPEG",
    "sizeBytes": 0,
    "colorSpace": "sRGB",
    "orientation": 1
  },
  "content": {
    "text": "",
    "faces": [],
    "watermark": "",
    "scene": "outdoor landscape"
  },
  "exif": {
    "camera": "",
    "lens": "",
    "iso": 0,
    "aperture": 0,
    "shutterSpeed": "",
    "focalLength": 0,
    "flash": false,
    "dateTaken": "",
    "gps": { "lat": 0, "lng": 0 }
  },
  "customFields": {},
  "configApplied": {
    "model": "gpt-4o",
    "language": "en",
    "context": ""
  },
  "usage": {
    "tokens": {
      "input": 1250,
      "output": 450,
      "total": 1700
    },
    "cost": 0.008925
  },
  "processedAt": "2025-01-15T10:30:00.000Z"
}
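One way to put this output to work is to turn the `seo` and `file` sections into HTML. The sketch below assumes a result dict shaped like the example above. The function names `render_img_tag` and `render_og_meta` are hypothetical, and all values are HTML-escaped because generated text may contain quotes or angle brackets.

```python
from html import escape


def render_img_tag(src: str, result: dict) -> str:
    """Build an <img> tag from the seo and file sections of a result."""
    seo = result["seo"]
    file_info = result.get("file", {})
    attrs = [("src", src), ("alt", seo["alt"])]
    if seo.get("title"):
        attrs.append(("title", seo["title"]))
    # Only emit dimensions that are present and non-zero.
    for key in ("width", "height"):
        if file_info.get(key):
            attrs.append((key, str(file_info[key])))
    rendered = " ".join(
        f'{name}="{escape(value, quote=True)}"' for name, value in attrs
    )
    return f"<img {rendered}>"


def render_og_meta(result: dict) -> str:
    """Build the Open Graph description tag from seo.ogDescription."""
    content = escape(result["seo"]["ogDescription"], quote=True)
    return f'<meta property="og:description" content="{content}">'
```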

Use Cases

E-commerce & Product Catalogs

  • Generate SEO-optimized product image alt text and descriptions
  • Extract product attributes (color, material, style) for filtering
  • Create consistent metadata across thousands of product images
  • Improve product discovery in image search

Content Management & Publishing

  • Automate alt text generation for blog posts and articles
  • Generate Open Graph descriptions for social media sharing
  • Create keyword-rich image titles for better SEO
  • Bulk process image libraries for CMS migration

Accessibility Compliance

  • Cover the WCAG 2.1 text-alternative requirement (Success Criterion 1.1.1) with accurate alt text
  • Ensure ADA compliance for commercial websites
  • Create inclusive content for screen reader users
  • Audit existing images for accessibility gaps

Stock Photography & Media Libraries

  • Auto-tag and categorize large image collections
  • Generate searchable metadata for asset management
  • Extract technical details for quality filtering
  • Create multilingual descriptions for global markets

Real Estate & Property

  • Describe room features and amenities automatically
  • Extract property characteristics from listing photos
  • Generate virtual tour descriptions
  • Create multilingual listings for international buyers

Wildlife & Nature Photography

  • Identify species with scientific names
  • Extract habitat and behavior information
  • Generate educational content for nature databases
  • Support conservation research with automated tagging

Pricing

This Actor uses Pay Per Event (PPE) pricing:

  • $0.01 per image analyzed
  • No monthly fees or minimums
  • Pay only for what you use
  • AI API costs are included in the price
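The per-image price makes budgeting a single multiplication. A minimal sketch, using `Decimal` to avoid floating-point rounding in currency (`estimate_cost` is a hypothetical helper, and the rate is the $0.01 figure quoted above):

```python
from decimal import Decimal

PRICE_PER_IMAGE = Decimal("0.01")  # USD, from the pricing list above


def estimate_cost(image_count: int) -> Decimal:
    """Return the expected charge in USD for a number of analyzed images."""
    if image_count < 0:
        raise ValueError("image_count must not be negative")
    return PRICE_PER_IMAGE * image_count
```

For 1,000 images this gives 10.00, matching the "from $10.00 / 1,000 results" rate shown for this Actor.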

Technical Specifications

  • Supported formats: JPEG, PNG, GIF, BMP, WEBP
  • Maximum image size: 20MB
  • Processing time: Typically 2-10 seconds per image
  • Memory requirement: 256MB - 4096MB (auto-scaled)
  • API timeout: 5 minutes maximum
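These limits can be checked locally before paying for a run. The sketch below is a hypothetical pre-flight helper, not part of the Actor. It judges the format by file extension only, which is a heuristic, and it assumes "20MB" means 20,000,000 bytes since the exact definition is not stated. The 5MB threshold is the "optimal speed" figure from the Troubleshooting section.

```python
from pathlib import PurePosixPath

SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}
MAX_IMAGE_BYTES = 20 * 1000 * 1000   # assumption: decimal megabytes
RECOMMENDED_BYTES = 5 * 1000 * 1000  # above this, processing may be slower


def check_image_file(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems found; an empty list means no issues."""
    problems = []
    ext = PurePosixPath(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append("larger than the 20MB limit")
    elif size_bytes > RECOMMENDED_BYTES:
        problems.append("over 5MB, may process slowly")
    return problems
```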

Troubleshooting

"Missing API key" Error

Set the required environment variable in Actor Settings > Environment Variables. For the default model (gpt-4o), you need OPENAI_API_KEY.

"Could not fetch image" Error

Ensure the image URL is:

  • Publicly accessible (no authentication required)
  • Using HTTPS protocol
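  • Not blocked by a firewall, hotlink protection, or bot filtering on the image host (the image is fetched server-side, so browser CORS rules do not apply)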
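Part of this checklist can be verified offline before submitting a URL. The following sketch (with a hypothetical helper name, `check_image_url`) flags non-HTTPS URLs and hosts that are clearly not public, such as `localhost` or private IP ranges. It cannot tell whether a public host requires authentication or blocks automated requests, which only a real fetch would reveal.

```python
import ipaddress
from urllib.parse import urlparse


def check_image_url(url: str) -> list[str]:
    """Return a list of problems found; an empty list means no issues."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme != "https":
        problems.append("URL should use https")
    host = parsed.hostname
    if not host:
        problems.append("URL has no hostname")
        return problems
    if host == "localhost":
        problems.append("host is not publicly reachable")
        return problems
    try:
        # Literal IPs in private or loopback ranges are unreachable from outside.
        if ipaddress.ip_address(host).is_private:
            problems.append("host is not publicly reachable")
    except ValueError:
        pass  # a DNS name; reachability cannot be judged offline
    return problems
```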

Slow Processing

  • Use gemini-flash model for faster results
  • Ensure images are reasonably sized (< 5MB for optimal speed)
  • Check your AI provider's rate limits

Unexpected Results

  • Add imageContext to guide the analysis
  • Use domain-specific context for better accuracy
  • Try a different model for varied perspectives

FAQ

Q: Which model should I use?
A: Start with gpt-4o (default) for best accuracy. Use gemini-flash for speed and cost savings. Use claude-sonnet for nuanced, creative descriptions.

Q: Can I process multiple images?
A: Yes! Create multiple Actor runs in parallel, or use the Apify API to batch process images programmatically.

Q: What languages are supported?
A: Any language supported by the underlying AI model. Common languages: English (en), Spanish (es), French (fr), German (de), Japanese (ja), Chinese (zh), Portuguese (pt), Italian (it), Korean (ko), Arabic (ar).

Q: Are my images stored?
A: No. Images are processed in memory and not stored. Only the analysis results are saved to your dataset.

Q: Can I use my own AI API keys?
A: Yes! Set your API keys in the Actor's environment variables to use your own accounts and quotas.
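For the batch-processing answer above, here is a minimal sketch that starts one run per image through the Apify API's synchronous "run Actor and get dataset items" endpoint, using only the Python standard library. The `actor_id` value is a placeholder you must supply yourself (Apify accepts the `username~actor-name` form); this Actor's real ID is not stated here. The function names and the worker count of 4 are illustrative assumptions, and the 300-second timeout mirrors the 5-minute limit listed under Technical Specifications.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, actor_input: dict) -> urllib.request.Request:
    """Prepare a POST that runs the Actor and returns its dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def analyze(actor_id: str, token: str, actor_input: dict) -> list:
    """Send one request and return the parsed dataset items."""
    request = build_run_request(actor_id, token, actor_input)
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)


def analyze_many(actor_id: str, token: str, image_urls: list[str], max_workers: int = 4) -> list:
    """Run one Actor call per image URL in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(
            pool.map(lambda u: analyze(actor_id, token, {"imageUrl": u}), image_urls)
        )
```

Each call is billed as one analyzed image, so keep `max_workers` within your AI provider's rate limits.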

Support

  • Documentation: Apify Actor Documentation
  • Issues: Report bugs or request features via Apify Console
  • Updates: Follow the Actor for version updates and improvements

Changelog

v1.0.0

  • Initial release
  • Support for OpenAI, Anthropic, Google, and Groq providers
  • Comprehensive image analysis with 50+ output fields
  • Custom field extraction support
  • Multi-language output
  • PPE pricing model