Image To Image Localization Actor

Pricing

from $0.01 / 1,000 results

Developer

Agung Sidharta So

Maintained by Community

Image to Image Text Translation Actor

Translate text within images while preserving the original layout, styling, and visual appearance. This Actor offers two translation providers: Gemini AI (the default), which translates the image directly, and Lingo.dev, which pairs Google Cloud Vision API text detection with high-quality translation that preserves brand voice.

Features

  • 🔍 Smart Text Detection: Uses Google Cloud Vision API to detect text with precise bounding boxes
  • 🌍 70+ Languages: Supports all major languages with regional variants
  • 🎨 Visual Preservation: Maintains original colors, fonts, and layout
  • 📏 Adaptive Sizing: Automatically adjusts font size and background for different text lengths
  • 🔤 Alphanumeric Filtering: Only translates readable text (letters, numbers, spaces)
  • 📁 Batch Processing: Process up to 100 images in a single run
  • 🔄 Dual Translation Options: Choose between Gemini AI (default, direct image translation) or Lingo.dev (brand voice preservation)

How It Works

  1. Text Detection: Scans the image with the Google Cloud Vision API (Lingo.dev mode only)
  2. Smart Filtering: Processes only alphanumeric text and skips symbols and decorations (Lingo.dev mode only)
  3. Translation: Translates with Gemini AI (default, direct image translation) or Lingo.dev (brand voice preservation with text detection)
  4. Color Analysis: Detects the original text colors and background patterns (Lingo.dev mode only)
  5. Adaptive Rendering: Adjusts font size and background to fit the translation length (Lingo.dev mode only)
  6. AI Quality Check: Gemini analyzes the result for text overlaps and applies fixes (Lingo.dev mode only, requires GEMINI_API_KEY)
  7. Image Composition: Overlays the translated text while preserving visual integrity (Lingo.dev mode only)
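
Steps 2 and 5 can be illustrated with a small sketch. This is not the Actor's actual code: the function names, the Unicode-aware pattern, and the proportional scaling rule are assumptions made for illustration. Only the 8-72px font range comes from the input options.

```javascript
// Hypothetical helpers illustrating steps 2 and 5; not the Actor's real code.

// Step 2: keep only text made of letters, numbers, and spaces.
// Assumption: "letters" covers non-Latin scripts, hence \p{L}, \p{M}, \p{N}.
function isAlphanumeric(text) {
    return /^[\p{L}\p{M}\p{N}\s]+$/u.test(text) && text.trim().length > 0;
}

// Step 5: shrink the font when the translation is longer than the original.
// Assumption: size scales with the character-count ratio and is clamped
// to the 8-72px range accepted by the fontSize input.
function adaptFontSize(originalSize, originalText, translatedText) {
    const ratio = originalText.length / Math.max(translatedText.length, 1);
    const scaled = Math.round(originalSize * Math.min(ratio, 1));
    return Math.min(72, Math.max(8, scaled));
}

console.log(isAlphanumeric('Summer Sale 2024'));   // true
console.log(isAlphanumeric('★★★'));                // false
console.log(adaptFontSize(24, 'Sale', 'Rebajas')); // 14
```

A real renderer would more likely measure rendered text width than count characters; the ratio here only shows the idea of fitting a longer translation into the original bounding box.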

Input

Field | Type | Required | Description
imageUrls | Array | Yes | Array of image URLs to translate (up to 100 images per run)
targetLanguage | String | Yes | Target language (dropdown selection)
translationProvider | String | No | Translation service: Gemini AI (default) or Lingo.dev (brand voice)
fontSize | Integer | No | Override font size (8-72px, Lingo.dev only, default: auto-detect)
fontFamily | String | No | Font family (Lingo.dev only: Arial, Helvetica, Times New Roman, Courier New)
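
As a rough illustration of the constraints in the table above, the following sketch checks an input object before a run is started. The validateInput function is hypothetical and not part of the Actor; the provider values 'gemini' and 'lingo' are taken from the API example in this README.

```javascript
// Hypothetical validator mirroring the input table; not part of the Actor.
const PROVIDERS = ['gemini', 'lingo']; // values used in the API example
const FONTS = ['Arial', 'Helvetica', 'Times New Roman', 'Courier New'];

// Returns a list of human-readable problems; an empty list means valid.
function validateInput(input) {
    const errors = [];
    const { imageUrls, targetLanguage, translationProvider, fontSize, fontFamily } = input;

    if (!Array.isArray(imageUrls) || imageUrls.length < 1 || imageUrls.length > 100) {
        errors.push('imageUrls must be an array of 1 to 100 URLs');
    }
    if (typeof targetLanguage !== 'string' || targetLanguage === '') {
        errors.push('targetLanguage is required');
    }
    if (translationProvider !== undefined && !PROVIDERS.includes(translationProvider)) {
        errors.push('translationProvider must be "gemini" or "lingo"');
    }
    if (fontSize !== undefined && (!Number.isInteger(fontSize) || fontSize < 8 || fontSize > 72)) {
        errors.push('fontSize must be an integer from 8 to 72');
    }
    if (fontFamily !== undefined && !FONTS.includes(fontFamily)) {
        errors.push('fontFamily is not one of the listed fonts');
    }
    return errors;
}

console.log(validateInput({ imageUrls: ['https://example.com/image.jpg'], targetLanguage: 'es-ES' })); // []
console.log(validateInput({ imageUrls: [], targetLanguage: 'es-ES', fontSize: 100 }).length); // 2
```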

Supported Languages

The Actor supports 70+ languages including:

  • Major Languages: English, Spanish, French, German, Chinese, Japanese, Arabic, Russian
  • Regional Variants: en-US, en-GB, es-ES, es-MX, zh-CN, zh-TW, etc.
  • Specialized: Bavarian, Neapolitan, Tamazight, and more

Environment Variables

Set these as secret environment variables in the Apify Console:

Variable | Required | Description
GEMINI_API_KEY | Conditional* | Gemini API key (required if using Gemini provider)
LINGO_API_KEY | Conditional* | Your Lingo.dev API key (required if using Lingo.dev provider)
GOOGLE_CLOUD_CREDENTIALS_JSON | Conditional** | Google Cloud credentials as JSON string (required for Lingo.dev provider)

* Required based on the selected translation provider (Gemini is the default).
** Only required when using the Lingo.dev provider.
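
The conditional requirements above can be expressed as a small pre-flight check. This is a minimal sketch: missingEnvVars and the REQUIRED_VARS map are hypothetical names, and the Actor's own validation may differ.

```javascript
// Hypothetical pre-flight check; the Actor's own validation may differ.
const REQUIRED_VARS = {
    gemini: ['GEMINI_API_KEY'],
    lingo: ['LINGO_API_KEY', 'GOOGLE_CLOUD_CREDENTIALS_JSON'],
};

// Returns the names of required variables that are missing or empty.
function missingEnvVars(provider = 'gemini', env = process.env) {
    const required = REQUIRED_VARS[provider];
    if (!required) throw new Error(`Unknown provider: ${provider}`);
    return required.filter((name) => !env[name]);
}

console.log(missingEnvVars('lingo', { LINGO_API_KEY: 'abc' }));
// [ 'GOOGLE_CLOUD_CREDENTIALS_JSON' ]
```

Passing the environment as a parameter keeps the check easy to test without touching real secrets.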

Setup Instructions

1. Get API Keys

Gemini API:

  1. Get API key from Google AI Studio
  2. Set as GEMINI_API_KEY environment variable

Lingo.dev API Key (Optional - for Lingo.dev provider):

  1. Sign up at Lingo.dev
  2. Create a new project
  3. Generate an API key
  4. Set as LINGO_API_KEY environment variable

Google Cloud Vision API (Optional - for Lingo.dev provider):

  1. Create a project in Google Cloud Console
  2. Enable the Vision API
  3. Create a service account with Vision API permissions
  4. Download the JSON key file
  5. Set the contents of the key file, as a JSON string, as the GOOGLE_CLOUD_CREDENTIALS_JSON environment variable
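
Since the credentials are supplied as a JSON string, one convenient way to prepare the value is to compact the downloaded key file onto a single line. The sketch below assumes a hypothetical helper named toSingleLineJson; whether the Actor requires a single-line string is not stated here, so treat this as a convenience rather than a requirement.

```javascript
// Hypothetical helper: turn the text of a downloaded service-account key
// file into a compact one-line JSON string for a secret variable.
function toSingleLineJson(fileContents) {
    // JSON.parse throws on malformed input, which catches a bad copy early.
    return JSON.stringify(JSON.parse(fileContents));
}

// Example with a placeholder document (not a real key).
const sample = '{\n  "type": "service_account",\n  "project_id": "my-project"\n}';
console.log(toSingleLineJson(sample));
// {"type":"service_account","project_id":"my-project"}
```

In practice the input would come from reading the key file, for example with fs.readFileSync(path, 'utf8').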

Gemini API (Optional - for Lingo.dev AI enhancement):

  1. Get API key from Google AI Studio
  2. Set as GEMINI_API_KEY environment variable to enable the AI quality check (text overlap fixes) in Lingo.dev mode

2. Configure Environment Variables

In the Apify Console:

  1. Go to your Actor settings
  2. Add environment variables based on your chosen provider:
    • For Gemini (default): GEMINI_API_KEY
    • For Lingo.dev: LINGO_API_KEY and GOOGLE_CLOUD_CREDENTIALS_JSON
    • Optional: Add GEMINI_API_KEY when using Lingo.dev for AI-powered text overlap fixes

3. Run the Actor

  1. Select target language from dropdown
  2. Choose translation provider (Gemini AI is default, or Lingo.dev for brand voice)
  3. Provide one or more image URLs (up to 100)
  4. Optionally adjust font settings (Lingo.dev only)
  5. Run the Actor

Output

The Actor produces:

Dataset Items

{
    "originalUrl": "https://example.com/image.jpg",
    "originalType": "url",
    "originalFileName": null,
    "translatedImageUrl": "https://api.apify.com/v2/key-value-stores/.../translated-image.png",
    "targetLanguage": "es-ES",
    "processingTime": 3.2
}
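
Items of this shape are easy to post-process. The following sketch summarizes a batch; the summarize function is hypothetical, and it assumes that processingTime is in seconds and that an item without a translatedImageUrl represents a failed image, neither of which is confirmed by the output description.

```javascript
// Hypothetical post-processing of dataset items shaped like the example above.
// Assumptions: processingTime is in seconds; a missing translatedImageUrl
// means the image was not translated.
function summarize(items) {
    const translated = items.filter((item) => item.translatedImageUrl);
    const totalTime = items.reduce((sum, item) => sum + (item.processingTime || 0), 0);
    return {
        total: items.length,
        translated: translated.length,
        averageSeconds: items.length ? totalTime / items.length : 0,
    };
}

const sampleItems = [
    { originalUrl: 'https://example.com/a.jpg', translatedImageUrl: 'https://example.com/a-es.png', targetLanguage: 'es-ES', processingTime: 3 },
    { originalUrl: 'https://example.com/b.jpg', translatedImageUrl: null, targetLanguage: 'es-ES', processingTime: 5 },
];
console.log(summarize(sampleItems)); // { total: 2, translated: 1, averageSeconds: 4 }
```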

Key-Value Store

  • Translated Image: PNG file with translated text overlaid

Use Cases

  • Marketing Localization: Translate ads, banners, and promotional materials
  • E-commerce: Localize product images for international markets
  • Documentation: Translate screenshots and diagrams
  • Social Media: Adapt visual content for different regions
  • Website Localization: Translate UI elements and graphics

Limitations

  • Only processes alphanumeric text (letters, numbers, spaces)
  • Requires clear, readable text in images
  • Works best with horizontal text layouts
  • Very complex backgrounds may affect color detection

Performance

  • Processing Time: 5-8 seconds per image with Gemini, 8-12 seconds with Lingo.dev (including AI analysis)
  • Batch Processing: Up to 100 images per run
  • Supported Formats: JPEG, PNG, WebP, GIF
  • Max Image Size: 10 MB
  • Concurrent Runs: Up to 100 (depending on plan)
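
For collections larger than the 100-image limit, one option is to split the URL list into batches and start one run per batch. The chunk helper below is a hypothetical sketch of that splitting step only; it does not call the Apify API.

```javascript
// Hypothetical helper: split a long URL list into batches of at most 100,
// the per-run limit stated above, so each batch can go to its own run.
function chunk(list, size = 100) {
    if (!Number.isInteger(size) || size < 1) {
        throw new RangeError('size must be a positive integer');
    }
    const batches = [];
    for (let i = 0; i < list.length; i += size) {
        batches.push(list.slice(i, i + size));
    }
    return batches;
}

const urls = Array.from({ length: 250 }, (_, i) => `https://example.com/image${i + 1}.jpg`);
console.log(chunk(urls).map((batch) => batch.length)); // [ 100, 100, 50 ]
```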

Error Handling

Common errors and solutions:

Error | Cause | Solution
"No text detected" | Image has no readable text | Use images with clear text
"LINGO_API_KEY required" | Missing API key | Set the LINGO_API_KEY environment variable
"Google Cloud credentials not found" | Missing credentials | Set the GOOGLE_CLOUD_CREDENTIALS_JSON environment variable
"No alphanumeric text found" | Only symbols/decorations detected | Use images with letters/numbers
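
When processing run logs or failed items programmatically, the table above can be turned into a simple lookup. This is a sketch under assumptions: hintFor and ERROR_HINTS are hypothetical names, and matching on substrings presumes the Actor emits messages containing the quoted text, which may not hold exactly.

```javascript
// Hypothetical lookup built from the error table. Substring matching is an
// assumption; the Actor's exact error strings may differ.
const ERROR_HINTS = [
    ['No text detected', 'Use images with clear text'],
    ['LINGO_API_KEY required', 'Set the LINGO_API_KEY environment variable'],
    ['Google Cloud credentials not found', 'Set the GOOGLE_CLOUD_CREDENTIALS_JSON environment variable'],
    ['No alphanumeric text found', 'Use images with letters or numbers'],
];

// Returns the suggested fix for a message, or null when nothing matches.
function hintFor(message) {
    const match = ERROR_HINTS.find(([needle]) => message.includes(needle));
    return match ? match[1] : null;
}

console.log(hintFor('Error: No text detected in image 3')); // Use images with clear text
console.log(hintFor('Timeout')); // null
```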

Example Usage

Via Apify Console

  1. Open the Actor in Apify Console
  2. Add one or more image URLs
  3. Select "Spanish (Spain)" as target language
  4. Choose translation provider (Gemini AI is default)
  5. Click "Start"

Via API

const { ApifyClient } = require('apify-client');

// Authenticate with your Apify API token.
const client = new ApifyClient({
    token: 'your-apify-token',
});

async function main() {
    // Start the Actor and wait for the run to finish.
    const run = await client.actor('your-actor-id').call({
        imageUrls: [
            'https://example.com/image1.jpg',
            'https://example.com/image2.jpg',
        ],
        targetLanguage: 'es-ES',
        translationProvider: 'gemini', // default, or 'lingo'
    });

    // Each dataset item describes one translated image.
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    items.forEach((item, i) => {
        console.log(`Image ${i + 1}:`, item.translatedImageUrl);
    });
}

main().catch(console.error);

License

This Actor is licensed under the Apache 2.0 License.