Hugging Face Image AI

Pricing

from $0.01 / 1,000 results


Image processing with Hugging Face models:

  • Text-to-Image: Stable Diffusion, SDXL, DALL-E generation
  • Image-to-Image: Transform images
  • Inpainting: Edit parts of images
  • Classification: Identify objects
  • Object Detection: Locate and label objects
  • Segmentation: Pixel analysis
  • Captioning: Generate descriptions



Developer: John Rippy

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 6 days ago


Hugging Face Image - AI Image Processing with Stable Diffusion & Vision Models

Generate, transform, analyze, and understand images with state-of-the-art AI models: Stable Diffusion XL for generation, BLIP for captioning, ViT for classification, DETR for object detection, and more. No GPU is required; the models are accessed through an API. Bring your own key (BYOK): supply your Hugging Face API token.

Features

  • Text-to-Image - Generate images with Stable Diffusion XL
  • Image-to-Image - Transform images with prompts
  • Inpainting - Edit specific parts of images
  • Image Captioning - Generate descriptions with BLIP
  • Image Classification - Identify objects with Vision Transformer
  • Object Detection - Locate objects with DETR
  • Image Segmentation - Pixel-level scene analysis
  • Zero-Shot Classification - Classify images without training
  • Depth Estimation - Generate depth maps
  • Image Upscaling - 4x super-resolution
  • Demo Mode - Test with sample data before going live

Who Should Use This Actor?

Marketing Teams

Generate product images. Create social media visuals. Caption images for SEO. A/B test creative variants.

E-commerce Businesses

Generate product photos. Remove backgrounds. Upscale images. Auto-caption for accessibility.

Content Creators

Generate blog illustrations. Transform stock photos. Create custom visuals without designers.

Developers

Add AI image features to apps. Build image analysis pipelines. Integrate vision AI without infrastructure.

Social Media Managers

Generate post images. Analyze image content. Auto-tag visual content.

Research Teams

Analyze image datasets. Generate synthetic data. Extract visual features.

Quick Start

Demo Mode (Free Test)

{
"task": "text_to_image",
"prompt": "Professional product photo of sleek wireless earbuds on white background",
"demoMode": true
}
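
One way to submit the demo input above is through the Apify API client for Python. The sketch below is a minimal illustration, not the author's documented workflow. It assumes the `apify-client` package is installed and that the Actor writes its results to the run's default dataset. `ACTOR_ID` is a hypothetical placeholder, since this page does not state the Actor's real ID.

```python
# Hypothetical placeholder: replace with the Actor ID shown in the Apify Console.
ACTOR_ID = "username/actor-name"


def build_demo_input(prompt: str) -> dict:
    """Return a demo-mode input; no Hugging Face token is needed."""
    return {"task": "text_to_image", "prompt": prompt, "demoMode": True}


def run_actor(run_input: dict, apify_token: str, actor_id: str = ACTOR_ID) -> list:
    """Start the Actor, wait for it to finish, and return its dataset items.

    Assumption: results are pushed to the run's default dataset.
    """
    # Imported lazily so build_demo_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(apify_token)
    run = client.actor(actor_id).call(run_input=run_input)
    return client.dataset(run["defaultDatasetId"]).list_items().items
```

Calling `run_actor(build_demo_input("..."), apify_token)` with your own Apify token would then return the sample records produced in demo mode.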

Text-to-Image (Stable Diffusion)

{
"task": "text_to_image",
"apiToken": "hf_your_token_here",
"model": "stabilityai/stable-diffusion-xl-base-1.0",
"prompt": "Modern minimalist office interior, natural lighting, 4k, professional photography",
"negativePrompt": "blurry, low quality, distorted",
"width": 1024,
"height": 1024,
"guidanceScale": 7.5,
"numInferenceSteps": 50,
"demoMode": false
}
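
When inputs like the one above are built programmatically, a small helper can catch obvious mistakes before a paid run starts. The following is a sketch only. The function name is hypothetical, the defaults mirror the Input Parameters table, and the 1024x1024 limit comes from the FAQ. The Actor may apply its own validation that differs from this.

```python
from typing import Optional


def build_text_to_image_input(
    api_token: str,
    prompt: str,
    negative_prompt: str = "",
    width: int = 1024,
    height: int = 1024,
    guidance_scale: float = 7.5,
    num_inference_steps: int = 50,
    seed: Optional[int] = None,
) -> dict:
    """Assemble a text_to_image input from the documented parameters."""
    if not api_token:
        raise ValueError("apiToken is needed when demoMode is false")
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    # The FAQ gives 1024x1024 as the upper bound for SDXL generation.
    if not (0 < width <= 1024 and 0 < height <= 1024):
        raise ValueError("width and height must be in the range 1..1024")

    run_input = {
        "task": "text_to_image",
        "apiToken": api_token,
        "prompt": prompt,
        "width": width,
        "height": height,
        "guidanceScale": guidance_scale,
        "numInferenceSteps": num_inference_steps,
        "demoMode": False,
    }
    # Optional fields are only included when set, so the Actor's defaults apply.
    if negative_prompt:
        run_input["negativePrompt"] = negative_prompt
    if seed is not None:
        run_input["seed"] = seed
    return run_input
```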

Image-to-Image Transformation

{
"task": "image_to_image",
"apiToken": "hf_your_token_here",
"imageUrl": "https://example.com/photo.jpg",
"prompt": "Transform into watercolor painting style",
"strength": 0.8,
"guidanceScale": 7.5,
"demoMode": false
}

Inpainting (Edit Parts of Images)

{
"task": "inpainting",
"apiToken": "hf_your_token_here",
"imageUrl": "https://example.com/original.jpg",
"maskUrl": "https://example.com/mask.png",
"prompt": "A red sports car",
"guidanceScale": 7.5,
"demoMode": false
}

Image Captioning

{
"task": "image_to_text",
"apiToken": "hf_your_token_here",
"imageUrl": "https://example.com/product-photo.jpg",
"demoMode": false
}

Image Classification

{
"task": "image_classification",
"apiToken": "hf_your_token_here",
"imageUrl": "https://example.com/photo.jpg",
"demoMode": false
}

Object Detection

{
"task": "object_detection",
"apiToken": "hf_your_token_here",
"imageUrl": "https://example.com/street-scene.jpg",
"demoMode": false
}

Zero-Shot Image Classification

{
"task": "zero_shot_image_classification",
"apiToken": "hf_your_token_here",
"imageUrl": "https://example.com/photo.jpg",
"candidateLabels": "product photo,lifestyle image,infographic,screenshot",
"demoMode": false
}
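
Because `candidateLabels` is a single comma-separated string rather than a list, code that holds labels in a list has to join them first. A minimal sketch follows. The helper name is hypothetical, and rejecting labels that contain commas is an assumption about how the string is split on the Actor's side.

```python
from typing import Iterable


def to_candidate_labels(labels: Iterable[str]) -> str:
    """Join labels into the comma-separated form used by candidateLabels."""
    cleaned = [label.strip() for label in labels if label.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty label is needed")
    for label in cleaned:
        # A comma inside a label would be indistinguishable from a separator.
        if "," in label:
            raise ValueError(f"label must not contain a comma: {label!r}")
    return ",".join(cleaned)
```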

Depth Estimation

{
"task": "depth_estimation",
"apiToken": "hf_your_token_here",
"imageUrl": "https://example.com/scene.jpg",
"demoMode": false
}

Image Upscaling (4x)

{
"task": "image_upscaling",
"apiToken": "hf_your_token_here",
"imageUrl": "https://example.com/low-res.jpg",
"prompt": "high quality, detailed, sharp",
"demoMode": false
}

Input Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| task | string | required | Task to perform (see task list) |
| apiToken | string | - | Your Hugging Face API token |
| model | string | task default | Specific model to use |
| prompt | string | - | Text prompt for generation |
| negativePrompt | string | - | What to avoid in generation |
| imageUrl | string | - | Input image URL |
| maskUrl | string | - | Mask image URL (for inpainting) |
| candidateLabels | string | - | Comma-separated labels (zero-shot) |
| width | number | 1024 | Output image width |
| height | number | 1024 | Output image height |
| guidanceScale | number | 7.5 | CFG scale (prompt adherence) |
| numInferenceSteps | number | 50 | Diffusion steps |
| strength | number | 0.8 | Transformation strength (0-1) |
| seed | number | - | Random seed for reproducibility |
| scheduler | string | "default" | Diffusion scheduler |
| waitForModel | boolean | true | Wait for model to load |
| webhookUrl | string | - | Webhook URL for results |
| demoMode | boolean | true | Return sample data |

Available Tasks

| Task | Description | Default Model |
|---|---|---|
| text_to_image | Generate images from text | Stable Diffusion XL |
| image_to_image | Transform images | SDXL Refiner |
| inpainting | Edit image regions | Stable Diffusion XL |
| image_to_text | Generate captions | BLIP-large |
| image_classification | Classify images | ViT-base-patch16 |
| object_detection | Detect objects | DETR-ResNet-50 |
| image_segmentation | Segment scenes | DETR-panoptic |
| zero_shot_image_classification | Classify with custom labels | CLIP-ViT-large |
| depth_estimation | Generate depth maps | DPT-large |
| image_upscaling | 4x super-resolution | SD-x4-upscaler |

Output Format

Text-to-Image

{
"success": true,
"model": "stabilityai/stable-diffusion-xl-base-1.0",
"imageBase64": "iVBORw0KGgoAAAANSUhEUgAAA...",
"mimeType": "image/png",
"prompt": "Modern minimalist office interior..."
}
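
Since the generated image arrives as a base64 string in `imageBase64`, it has to be decoded before it can be viewed or uploaded. The sketch below shows one way to do that with the standard library. The function names and the output file name are illustrative choices, and the extension mapping covers only the formats named in the FAQ.

```python
import base64
from pathlib import Path

# Map the documented MIME types to file extensions.
EXTENSIONS = {"image/png": ".png", "image/jpeg": ".jpg", "image/webp": ".webp"}


def decode_image(result: dict, key: str = "imageBase64") -> tuple:
    """Return (raw bytes, file extension) for a successful result record."""
    if not result.get("success"):
        raise ValueError("result does not indicate success")
    data = base64.b64decode(result[key])
    extension = EXTENSIONS.get(result.get("mimeType"), ".bin")
    return data, extension


def save_image(result: dict, out_dir: str = ".", key: str = "imageBase64") -> Path:
    """Decode the image and write it to out_dir, returning the file path."""
    data, extension = decode_image(result, key)
    path = Path(out_dir) / f"generated{extension}"
    path.write_bytes(data)
    return path
```

Passing `key="depthMapBase64"` should work the same way for depth estimation results, which use the same encoding according to the Output Format section.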

Image Captioning

{
"success": true,
"model": "Salesforce/blip-image-captioning-large",
"caption": "A sleek pair of wireless earbuds displayed on a white background with soft shadows",
"imageUrl": "https://example.com/product-photo.jpg"
}

Image Classification

{
"success": true,
"model": "google/vit-base-patch16-224",
"classifications": [
{"label": "wireless headphones", "score": 0.92},
{"label": "earbuds", "score": 0.06},
{"label": "hearing aid", "score": 0.01}
],
"imageUrl": "https://example.com/photo.jpg"
}
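
A classification record like the one above is often reduced to a single label. Here is a minimal sketch of that step, assuming the `classifications` list has the shape shown. The helper name and the 0.5 threshold are arbitrary illustrative choices.

```python
from typing import Optional


def top_label(result: dict, min_score: float = 0.5) -> Optional[str]:
    """Return the highest-scoring label, or None if nothing clears min_score."""
    classifications = result.get("classifications", [])
    if not classifications:
        return None
    # Do not rely on the list being sorted; pick the maximum explicitly.
    best = max(classifications, key=lambda item: item["score"])
    return best["label"] if best["score"] >= min_score else None
```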

Object Detection

{
"success": true,
"model": "facebook/detr-resnet-50",
"objects": [
{"label": "person", "score": 0.98, "box": {"xmin": 100, "ymin": 50, "xmax": 300, "ymax": 400}},
{"label": "car", "score": 0.95, "box": {"xmin": 400, "ymin": 200, "xmax": 600, "ymax": 350}},
{"label": "dog", "score": 0.87, "box": {"xmin": 150, "ymin": 300, "xmax": 250, "ymax": 420}}
],
"imageUrl": "https://example.com/street-scene.jpg"
}
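
Detection results usually need filtering by confidence before they are used for tagging. The sketch below assumes the `objects` structure shown above, with `box` coordinates in pixels. The function name and the 0.9 threshold are illustrative.

```python
def confident_objects(result: dict, min_score: float = 0.9) -> list:
    """Keep detections at or above min_score and add box width and height."""
    kept = []
    for obj in result.get("objects", []):
        if obj["score"] < min_score:
            continue
        box = obj["box"]
        kept.append(
            {
                "label": obj["label"],
                "score": obj["score"],
                "width": box["xmax"] - box["xmin"],
                "height": box["ymax"] - box["ymin"],
            }
        )
    return kept
```

Applied to the sample output above, this keeps the person and the car and drops the dog, whose score of 0.87 falls below the threshold.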

Image Segmentation

{
"success": true,
"model": "facebook/detr-resnet-50-panoptic",
"segments": [
{"label": "person", "score": 0.95, "mask": "base64_encoded_mask..."},
{"label": "sky", "score": 0.98, "mask": "base64_encoded_mask..."},
{"label": "grass", "score": 0.92, "mask": "base64_encoded_mask..."}
]
}

Depth Estimation

{
"success": true,
"model": "Intel/dpt-large",
"depthMapBase64": "iVBORw0KGgoAAAANSUhEUgAAA...",
"mimeType": "image/png"
}

Pricing (Pay-Per-Event)

| Event | Description | Price |
|---|---|---|
| image_processed | Per image task completed | $0.01 |

Example costs:

  • 50 image generations: 50 × $0.01 = $0.50
  • 100 image classifications: 100 × $0.01 = $1.00
  • 200 captions generated: 200 × $0.01 = $2.00
  • Demo mode: $0.00
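
The example costs above follow from multiplying the number of completed tasks by the per-event price. A small sketch of that arithmetic, using `Decimal` to avoid floating-point rounding. It covers only the Actor's `image_processed` event and not any separate Hugging Face charges.

```python
from decimal import Decimal

# Price of one image_processed event, taken from the pricing table.
PRICE_PER_IMAGE = Decimal("0.01")


def estimate_cost(tasks: int, demo_mode: bool = False) -> Decimal:
    """Estimate the Actor charge in USD for a number of completed tasks."""
    if tasks < 0:
        raise ValueError("tasks must not be negative")
    if demo_mode:
        return Decimal("0.00")
    return PRICE_PER_IMAGE * tasks


print(estimate_cost(50))   # 0.50
print(estimate_cost(200))  # 2.00
```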

Note: Some models may require a Hugging Face Pro subscription.

Cost Comparison

| Tool | Per Image | This Actor |
|---|---|---|
| Midjourney | ~$0.10 | ~$0.01 |
| DALL-E 3 | ~$0.04 | ~$0.01 |
| Leonardo.ai | ~$0.05 | ~$0.01 |

Common Scenarios

Scenario 1: Product Image Generation

{
"task": "text_to_image",
"apiToken": "hf_your_token",
"prompt": "Professional product photography of premium leather wallet, studio lighting, white background, 4k quality",
"negativePrompt": "blurry, low quality, watermark, text",
"width": 1024,
"height": 1024,
"guidanceScale": 8.0,
"seed": 42,
"demoMode": false
}

Scenario 2: Auto-Generate Alt Text

{
"task": "image_to_text",
"apiToken": "hf_your_token",
"imageUrl": "https://example.com/blog-header.jpg",
"webhookUrl": "https://hooks.zapier.com/...",
"demoMode": false
}

Scenario 3: Content Moderation

{
"task": "zero_shot_image_classification",
"apiToken": "hf_your_token",
"imageUrl": "https://example.com/user-upload.jpg",
"candidateLabels": "safe content,adult content,violence,spam",
"demoMode": false
}

Scenario 4: Style Transfer

{
"task": "image_to_image",
"apiToken": "hf_your_token",
"imageUrl": "https://example.com/photo.jpg",
"prompt": "Oil painting in the style of Van Gogh, impressionist, vibrant colors",
"strength": 0.75,
"demoMode": false
}

Webhook & Automation Integration

Zapier / Make.com / n8n

  1. Create a webhook trigger
  2. Copy the URL to webhookUrl
  3. Process image results in your workflow
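
For setups that do not use an automation platform, the same steps can be served by a small HTTP endpoint of your own. The sketch below uses only the Python standard library. It assumes the Actor sends a JSON POST whose body matches the records in the Output Format section. This page does not document the webhook payload, so treat the field handling as a guess to verify against a real delivery.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize(payload: dict) -> str:
    """Produce a one-line summary of an assumed result payload."""
    if not payload.get("success"):
        return "task failed or payload not recognized"
    if "caption" in payload:
        return f"caption: {payload['caption']}"
    if "objects" in payload:
        labels = ", ".join(obj["label"] for obj in payload["objects"])
        return f"objects: {labels}"
    return f"result from {payload.get('model', 'unknown model')}"


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_response(400)
            self.end_headers()
            return
        if not isinstance(payload, dict):
            payload = {}
        print(summarize(payload))
        # Acknowledge receipt without a response body.
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Listen for webhook deliveries until interrupted."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Running `serve()` and exposing the port publicly would give a URL that could be supplied as `webhookUrl`.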

Popular automations:

  • Generated images -> Cloud storage upload
  • Captions -> CMS alt text updates
  • Object detection -> Inventory tagging
  • Classification -> Content moderation queue

Hugging Face AI Suite

| Actor | Best For |
|---|---|
| Hugging Face Master | All-in-one (text + image + audio) |
| Hugging Face Text | Text processing |
| Hugging Face Image | Image processing (lightweight) |
| Hugging Face Audio | Audio processing |
| Hugging Face Hub | Model discovery |

FAQ

Q: What image formats are supported?

A: JPEG, PNG, WebP. Output is always PNG for quality.

Q: What's the max image size?

A: Input images can be up to 10 MB. Generated images can be up to 1024x1024 (SDXL).

Q: Can I generate multiple images?

A: Run the Actor multiple times with different seeds. Use the seed parameter for reproducibility.
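
The multi-run approach can be prepared by deriving one input per seed from a shared base input. A minimal sketch follows. The helper name is hypothetical, and each resulting input would be submitted as its own run, so each counts as a separate billed task.

```python
from typing import Iterable


def inputs_for_seeds(base_input: dict, seeds: Iterable[int]) -> list:
    """Return one copy of base_input per seed, leaving base_input unchanged."""
    return [{**base_input, "seed": seed} for seed in seeds]


base = {
    "task": "text_to_image",
    "apiToken": "hf_your_token_here",
    "prompt": "Modern minimalist office interior, natural lighting",
    "demoMode": False,
}
variants = inputs_for_seeds(base, [1, 2, 3])
print([v["seed"] for v in variants])  # [1, 2, 3]
```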

Q: How do I improve generation quality?

A: Increase numInferenceSteps (50-100), tune guidanceScale (7-12), and use detailed prompts.

Q: What's the difference from DALL-E/Midjourney?

A: Output quality is similar. This Actor adds pay-per-use pricing, a wider choice of models, and API-first access.

Common Problems & Solutions

"Model is loading"

  • SDXL is a large model and needs time to warm up
  • Set waitForModel: true (default)
  • Consider smaller models for testing

"Image too large"

  • Resize input images before sending
  • Use max 1024x1024 for generation

"Low quality output"

  • Keep numInferenceSteps at 50 or higher (the default is 50; the FAQ suggests 50-100)
  • Use negative prompts to avoid artifacts
  • Try different guidanceScale values

"Demo data showing"

  • Set demoMode: false
  • Provide your Hugging Face API token

Built by John Rippy | Actor Arsenal