Pricing

from $0.01 / 1,000 results

Hugging Face Image AI

Image processing with Hugging Face models:

Text-to-Image: Stable Diffusion, SDXL, DALL-E generation
Image-to-Image: Transform images
Inpainting: Edit parts of images
Classification: Identify objects
Object Detection: Locate and label objects
Segmentation: Pixel-level analysis
Captioning: Generate descriptions

Developer

The Howlers

Last modified: 15 days ago

Hugging Face Image - AI Image Processing with Stable Diffusion & Vision Models

Generate, transform, analyze, and understand images with state-of-the-art AI models: Stable Diffusion XL for generation, BLIP for captioning, ViT for classification, DETR for object detection, and more. No GPU is required; the models are accessed through the API. Bring your own key (BYOK): supply your Hugging Face API token.

Features

Text-to-Image - Generate images with Stable Diffusion XL
Image-to-Image - Transform images with prompts
Inpainting - Edit specific parts of images
Image Captioning - Generate descriptions with BLIP
Image Classification - Identify objects with Vision Transformer
Object Detection - Locate objects with DETR
Image Segmentation - Pixel-level scene analysis
Zero-Shot Classification - Classify images without training
Depth Estimation - Generate depth maps
Image Upscaling - 4x super-resolution
Demo Mode - Test with sample data before going live

Who Should Use This Actor?

Marketing Teams

Generate product images. Create social media visuals. Caption images for SEO. A/B test creative variants.

E-commerce Businesses

Generate product photos. Remove backgrounds. Upscale images. Auto-caption for accessibility.

Content Creators

Generate blog illustrations. Transform stock photos. Create custom visuals without designers.

Developers

Add AI image features to apps. Build image analysis pipelines. Integrate vision AI without infrastructure.

Social Media Managers

Generate post images. Analyze image content. Auto-tag visual content.

Research Teams

Analyze image datasets. Generate synthetic data. Extract visual features.

Quick Start

Demo Mode (Free Test)

{
  "task": "text_to_image",
  "prompt": "Professional product photo of sleek wireless earbuds on white background",
  "demoMode": true
}
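The JSON inputs in this section can also be sent from code. The following is a minimal sketch, assuming the `apify-client` Python package is installed and that the Actor pushes its results to the run's default dataset (an assumption, not stated on this page). `ACTOR_ID` and the Apify token are placeholders, and `run_image_task` is a hypothetical helper name.

```python
def run_image_task(run_input, actor_id, apify_token, client=None):
    """Start the Actor with run_input and return its dataset items as a list."""
    if client is None:
        # Imported lazily so the helper can be exercised with a stub client.
        from apify_client import ApifyClient
        client = ApifyClient(apify_token)
    # call() starts the run and waits for it to finish.
    run = client.actor(actor_id).call(run_input=run_input)
    # Assumption: results are stored in the run's default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


demo_input = {
    "task": "text_to_image",
    "prompt": "Professional product photo of sleek wireless earbuds on white background",
    "demoMode": True,
}

# ACTOR_ID is a placeholder; substitute the ID shown on the Actor's page.
# items = run_image_task(demo_input, "ACTOR_ID", "apify_api_token_here")
```

The same helper works for every task below; only the `run_input` dictionary changes.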

Text-to-Image (Stable Diffusion)

{
  "task": "text_to_image",
  "apiToken": "hf_your_token_here",
  "model": "stabilityai/stable-diffusion-xl-base-1.0",
  "prompt": "Modern minimalist office interior, natural lighting, 4k, professional photography",
  "negativePrompt": "blurry, low quality, distorted",
  "width": 1024,
  "height": 1024,
  "guidanceScale": 7.5,
  "numInferenceSteps": 50,
  "demoMode": false
}

Image-to-Image Transformation

{
  "task": "image_to_image",
  "apiToken": "hf_your_token_here",
  "imageUrl": "https://example.com/photo.jpg",
  "prompt": "Transform into watercolor painting style",
  "strength": 0.8,
  "guidanceScale": 7.5,
  "demoMode": false
}

Inpainting (Edit Parts of Images)

{
  "task": "inpainting",
  "apiToken": "hf_your_token_here",
  "imageUrl": "https://example.com/original.jpg",
  "maskUrl": "https://example.com/mask.png",
  "prompt": "A red sports car",
  "guidanceScale": 7.5,
  "demoMode": false
}

Image Captioning

{
  "task": "image_to_text",
  "apiToken": "hf_your_token_here",
  "imageUrl": "https://example.com/product-photo.jpg",
  "demoMode": false
}

Image Classification

{
  "task": "image_classification",
  "apiToken": "hf_your_token_here",
  "imageUrl": "https://example.com/photo.jpg",
  "demoMode": false
}

Object Detection

{
  "task": "object_detection",
  "apiToken": "hf_your_token_here",
  "imageUrl": "https://example.com/street-scene.jpg",
  "demoMode": false
}

Zero-Shot Image Classification

{
  "task": "zero_shot_image_classification",
  "apiToken": "hf_your_token_here",
  "imageUrl": "https://example.com/photo.jpg",
  "candidateLabels": "product photo,lifestyle image,infographic,screenshot",
  "demoMode": false
}
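Because `candidateLabels` is a single comma-separated string, a label that itself contains a comma would be split into two. A small sketch of building the string from a list; `to_candidate_labels` is a hypothetical helper, not part of the Actor.

```python
def to_candidate_labels(labels):
    """Join labels into the comma-separated string used by candidateLabels."""
    cleaned = [label.strip() for label in labels if label.strip()]
    for label in cleaned:
        if "," in label:
            # A comma inside a label would be read as a separator.
            raise ValueError(f"label contains a comma: {label!r}")
    return ",".join(cleaned)


labels = to_candidate_labels(
    ["product photo", "lifestyle image", "infographic", "screenshot"]
)
```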

Depth Estimation

{
  "task": "depth_estimation",
  "apiToken": "hf_your_token_here",
  "imageUrl": "https://example.com/scene.jpg",
  "demoMode": false
}

Image Upscaling (4x)

{
  "task": "image_upscaling",
  "apiToken": "hf_your_token_here",
  "imageUrl": "https://example.com/low-res.jpg",
  "prompt": "high quality, detailed, sharp",
  "demoMode": false
}

Input Parameters

Parameter	Type	Default	Description
`task`	string	required	Task to perform (see task list)
`apiToken`	string	-	Your Hugging Face API token
`model`	string	task default	Specific model to use
`prompt`	string	-	Text prompt for generation
`negativePrompt`	string	-	What to avoid in generation
`imageUrl`	string	-	Input image URL
`maskUrl`	string	-	Mask image URL (for inpainting)
`candidateLabels`	string	-	Comma-separated labels (zero-shot)
`width`	number	`1024`	Output image width
`height`	number	`1024`	Output image height
`guidanceScale`	number	`7.5`	CFG scale (prompt adherence)
`numInferenceSteps`	number	`50`	Diffusion steps
`strength`	number	`0.8`	Transformation strength (0-1)
`seed`	number	-	Random seed for reproducibility
`scheduler`	string	`"default"`	Diffusion scheduler
`waitForModel`	boolean	`true`	Wait for model to load
`webhookUrl`	string	-	Webhook URL for results
`demoMode`	boolean	`true`	Return sample data
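A few of these constraints can be checked before a run is started. The sketch below validates the task name, the 0-1 range of `strength`, and the presence of a token for live runs. `build_input` is a hypothetical helper, and treating `apiToken` as mandatory when `demoMode` is false is an assumption drawn from the troubleshooting notes, not a documented rule.

```python
TASKS = {
    "text_to_image", "image_to_image", "inpainting", "image_to_text",
    "image_classification", "object_detection", "image_segmentation",
    "zero_shot_image_classification", "depth_estimation", "image_upscaling",
}


def build_input(task, **params):
    """Return a run input dict after basic client-side checks."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task}")
    strength = params.get("strength")
    if strength is not None and not 0 <= strength <= 1:
        raise ValueError("strength must be between 0 and 1")
    # demoMode defaults to true, so only an explicit false makes a live run.
    if params.get("demoMode", True) is False and not params.get("apiToken"):
        raise ValueError("live runs need apiToken (assumed requirement)")
    return {"task": task, **params}


run_input = build_input(
    "image_to_image",
    apiToken="hf_your_token_here",
    imageUrl="https://example.com/photo.jpg",
    prompt="Transform into watercolor painting style",
    strength=0.8,
    demoMode=False,
)
```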

Available Tasks

Task	Description	Default Model
`text_to_image`	Generate images from text	Stable Diffusion XL
`image_to_image`	Transform images	SDXL Refiner
`inpainting`	Edit image regions	Stable Diffusion XL
`image_to_text`	Generate captions	BLIP-large
`image_classification`	Classify images	ViT-base-patch16
`object_detection`	Detect objects	DETR-ResNet-50
`image_segmentation`	Segment scenes	DETR-panoptic
`zero_shot_image_classification`	Classify with custom labels	CLIP-ViT-large
`depth_estimation`	Generate depth maps	DPT-large
`image_upscaling`	4x super-resolution	SD-x4-upscaler

Output Format

Text-to-Image

{
  "success": true,
  "model": "stabilityai/stable-diffusion-xl-base-1.0",
  "imageBase64": "iVBORw0KGgoAAAANSUhEUgAAA...",
  "mimeType": "image/png",
  "prompt": "Modern minimalist office interior..."
}
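The `imageBase64` field has to be decoded before the image can be saved or displayed. A minimal sketch using only the Python standard library; `decode_image` is a hypothetical helper, and the sample result carries just the 8-byte PNG signature rather than a real image.

```python
import base64


def decode_image(result):
    """Return the raw image bytes from a text_to_image result."""
    if not result.get("success"):
        raise ValueError("result does not report success")
    return base64.b64decode(result["imageBase64"])


# Illustrative result: the payload is only the PNG file signature.
sample = {"success": True, "imageBase64": "iVBORw0KGgo=", "mimeType": "image/png"}
image_bytes = decode_image(sample)

# To save a real result to disk:
# from pathlib import Path
# Path("output.png").write_bytes(image_bytes)
```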

Image Captioning

{
  "success": true,
  "model": "Salesforce/blip-image-captioning-large",
  "caption": "A sleek pair of wireless earbuds displayed on a white background with soft shadows",
  "imageUrl": "https://example.com/product-photo.jpg"
}

Image Classification

{
  "success": true,
  "model": "google/vit-base-patch16-224",
  "classifications": [
    {"label": "wireless headphones", "score": 0.92},
    {"label": "earbuds", "score": 0.06},
    {"label": "hearing aid", "score": 0.01}
  ],
  "imageUrl": "https://example.com/photo.jpg"
}
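Most callers only need the best label, and only when the model is reasonably sure. A sketch of that selection over the result shape shown above; `top_label` and its `min_score` threshold are illustrative choices, not Actor parameters.

```python
def top_label(result, min_score=0.5):
    """Return the highest-scoring label, or None if it is below min_score."""
    best = max(result["classifications"], key=lambda c: c["score"], default=None)
    if best is None or best["score"] < min_score:
        return None
    return best["label"]


result = {
    "success": True,
    "classifications": [
        {"label": "wireless headphones", "score": 0.92},
        {"label": "earbuds", "score": 0.06},
        {"label": "hearing aid", "score": 0.01},
    ],
}
print(top_label(result))  # wireless headphones
```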

Object Detection

{
  "success": true,
  "model": "facebook/detr-resnet-50",
  "objects": [
    {"label": "person", "score": 0.98, "box": {"xmin": 100, "ymin": 50, "xmax": 300, "ymax": 400}},
    {"label": "car", "score": 0.95, "box": {"xmin": 400, "ymin": 200, "xmax": 600, "ymax": 350}},
    {"label": "dog", "score": 0.87, "box": {"xmin": 150, "ymin": 300, "xmax": 250, "ymax": 420}}
  ],
  "imageUrl": "https://example.com/street-scene.jpg"
}
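Each detected object carries a confidence score and a bounding box, so results can be filtered and measured directly. The sketch below keeps detections above a threshold and derives box width and height. It assumes the box coordinates are pixel values, which this page does not state, and `confident_boxes` is a hypothetical helper.

```python
def confident_boxes(result, min_score=0.9):
    """Return label, width and height for detections scoring at least min_score."""
    kept = []
    for obj in result["objects"]:
        if obj["score"] < min_score:
            continue
        box = obj["box"]
        kept.append({
            "label": obj["label"],
            "width": box["xmax"] - box["xmin"],
            "height": box["ymax"] - box["ymin"],
        })
    return kept


detections = {
    "objects": [
        {"label": "person", "score": 0.98, "box": {"xmin": 100, "ymin": 50, "xmax": 300, "ymax": 400}},
        {"label": "car", "score": 0.95, "box": {"xmin": 400, "ymin": 200, "xmax": 600, "ymax": 350}},
        {"label": "dog", "score": 0.87, "box": {"xmin": 150, "ymin": 300, "xmax": 250, "ymax": 420}},
    ]
}
boxes = confident_boxes(detections)
```

With the sample data, the dog (score 0.87) falls below the 0.9 threshold and is dropped.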

Image Segmentation

{
  "success": true,
  "model": "facebook/detr-resnet-50-panoptic",
  "segments": [
    {"label": "person", "score": 0.95, "mask": "base64_encoded_mask..."},
    {"label": "sky", "score": 0.98, "mask": "base64_encoded_mask..."},
    {"label": "grass", "score": 0.92, "mask": "base64_encoded_mask..."}
  ]
}

Depth Estimation

{
  "success": true,
  "model": "Intel/dpt-large",
  "depthMapBase64": "iVBORw0KGgoAAAANSUhEUgAAA...",
  "mimeType": "image/png"
}

Pricing (Pay-Per-Event)

Event	Description	Price
`image_processed`	Per image task completed	$0.01

Example costs:

50 image generations: 50 × $0.01 = $0.50
100 image classifications: 100 × $0.01 = $1.00
200 captions generated: 200 × $0.01 = $2.00
Demo mode: $0.00
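The example costs above follow from one multiplication. A sketch of the arithmetic, using `Decimal` to avoid floating-point rounding; `estimate_cost` is a hypothetical helper, and the price is the `image_processed` rate from the table.

```python
from decimal import Decimal

PRICE_PER_IMAGE = Decimal("0.01")  # image_processed event price in USD


def estimate_cost(live_tasks, demo_tasks=0):
    """Return the estimated charge in USD; demo mode runs are free."""
    if live_tasks < 0 or demo_tasks < 0:
        raise ValueError("task counts must be non-negative")
    return PRICE_PER_IMAGE * live_tasks


print(estimate_cost(50))    # 0.50
print(estimate_cost(200))   # 2.00
```

Any charges from Hugging Face itself are separate and not included in this estimate.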

Note: Some models may require a Hugging Face Pro subscription.

Cost Comparison

Tool	Per Image	This Actor
Midjourney	~$0.10	~$0.01
DALL-E 3	~$0.04	~$0.01
Leonardo.ai	~$0.05	~$0.01

Common Scenarios

Scenario 1: Product Image Generation

{
  "task": "text_to_image",
  "apiToken": "hf_your_token",
  "prompt": "Professional product photography of premium leather wallet, studio lighting, white background, 4k quality",
  "negativePrompt": "blurry, low quality, watermark, text",
  "width": 1024,
  "height": 1024,
  "guidanceScale": 8.0,
  "seed": 42,
  "demoMode": false
}

Scenario 2: Auto-Generate Alt Text

{
  "task": "image_to_text",
  "apiToken": "hf_your_token",
  "imageUrl": "https://example.com/blog-header.jpg",
  "webhookUrl": "https://hooks.zapier.com/...",
  "demoMode": false
}

Scenario 3: Content Moderation

{
  "task": "zero_shot_image_classification",
  "apiToken": "hf_your_token",
  "imageUrl": "https://example.com/user-upload.jpg",
  "candidateLabels": "safe content,adult content,violence,spam",
  "demoMode": false
}

Scenario 4: Style Transfer

{
  "task": "image_to_image",
  "apiToken": "hf_your_token",
  "imageUrl": "https://example.com/photo.jpg",
  "prompt": "Oil painting in the style of Van Gogh, impressionist, vibrant colors",
  "strength": 0.75,
  "demoMode": false
}

Webhook & Automation Integration

Zapier / Make.com / n8n

Create a webhook trigger
Copy the URL to webhookUrl
Process image results in your workflow

Popular automations:

Generated images -> Cloud storage upload
Captions -> CMS alt text updates
Object detection -> Inventory tagging
Classification -> Content moderation queue
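The automations listed above can be told apart by the fields present in each result. The sketch below routes a webhook body to a destination name. It assumes the webhook delivers the same JSON object shown under Output Format, which this page does not confirm, and every destination name is a hypothetical label.

```python
import json


def route_result(body):
    """Map a webhook payload (JSON text) to a downstream automation name."""
    result = json.loads(body)
    if not result.get("success"):
        return "error_queue"
    if "imageBase64" in result:
        return "cloud_storage_upload"
    if "caption" in result:
        return "cms_alt_text_update"
    if "objects" in result:
        return "inventory_tagging"
    if "classifications" in result:
        return "content_moderation_queue"
    return "unhandled"


payload = '{"success": true, "caption": "A sleek pair of wireless earbuds"}'
print(route_result(payload))  # cms_alt_text_update
```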

Hugging Face AI Suite

Actor	Best For
Hugging Face Master	All-in-one (text + image + audio)
Hugging Face Text	Text processing
Hugging Face Image	Image processing (lightweight)
Hugging Face Audio	Audio processing
Hugging Face Hub	Model discovery

FAQ

Q: What image formats are supported?

A: JPEG, PNG, WebP. Output is always PNG for quality.

Q: What's the max image size?

A: Input images up to 10MB. Generation up to 1024x1024 (SDXL).

Q: Can I generate multiple images?

A: Run the Actor multiple times with a different seed value each time. Use the seed parameter for reproducibility.
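A sketch of preparing one run input per seed, so that each variant can be submitted as a separate run. `seed_variants` is a hypothetical helper; the base input reuses fields from the Quick Start examples.

```python
def seed_variants(base_input, seeds):
    """Return one copy of base_input per seed, leaving base_input unchanged."""
    return [{**base_input, "seed": seed} for seed in seeds]


base = {
    "task": "text_to_image",
    "apiToken": "hf_your_token_here",
    "prompt": "Modern minimalist office interior, natural lighting",
    "demoMode": False,
}
variants = seed_variants(base, [1, 2, 3])
```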

Q: How do I improve generation quality?

A: Increase numInferenceSteps (50-100), tune guidanceScale (7-12), use detailed prompts.

Q: What's the difference from DALL-E/Midjourney?

A: Similar quality, pay-per-use pricing, more model choices, API-first.

Common Problems & Solutions

"Model is loading"

SDXL is a large model and needs time to warm up
Set waitForModel: true (default)
Consider smaller models for testing

"Image too large"

Resize input images before sending
Use max 1024x1024 for generation

"Low quality output"

Increase numInferenceSteps above the default of 50 (up to 100)
Use negative prompts to avoid artifacts
Try different guidanceScale values

"Demo data showing"

Set demoMode: false
Provide your Hugging Face API token

📞 Support

Actor Arsenal: Full Actor Catalog
Developer: John Rippy

