Vision OCR MCP
Extract text from images instantly. Turn receipts, invoices, documents, and handwritten notes into structured data.
Pricing
from $0.99 / 1,000 results
Rating: 5.0 (1)
Developer: Acceleration
Actor stats
- Bookmarked: 0
- Total users: 2
- Monthly active users: 2
- Last modified: a day ago
Vision OCR MCP Server
A Model Context Protocol server for extracting text from images. This server enables LLMs to read invoices, receipts, and documents in 100+ languages while preserving the original script.
About this MCP Server: To understand how to connect to and utilize this MCP server, please refer to the official Model Context Protocol documentation at mcp.apify.com.
Connection URL
MCP clients can connect to this server at:
https://accelerationengg--vision-ocr-mcp.apify.actor/mcp
Client Configuration
To connect to this MCP server, use the following configuration in your MCP client:
{
  "mcpServers": {
    "vision-ocr": {
      "url": "https://accelerationengg--vision-ocr-mcp.apify.actor/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}
Note: Replace YOUR_APIFY_TOKEN with your actual Apify API token. You can find your token in the Apify Console.
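To avoid pasting the token into a file by hand, the same configuration can be generated from an environment variable. This is a minimal sketch, not part of the server itself: the function name build_client_config and the variable name APIFY_TOKEN are assumptions made for illustration.

```python
import json
import os

SERVER_URL = "https://accelerationengg--vision-ocr-mcp.apify.actor/mcp"


def build_client_config(token: str) -> dict:
    """Return the MCP client configuration shown above for the given token."""
    if not token:
        raise ValueError("an Apify API token is required")
    return {
        "mcpServers": {
            "vision-ocr": {
                "url": SERVER_URL,
                "headers": {"Authorization": f"Bearer {token}"},
            }
        }
    }


if __name__ == "__main__":
    # APIFY_TOKEN is an assumed variable name; use whatever your setup provides.
    config = build_client_config(os.environ.get("APIFY_TOKEN", "YOUR_APIFY_TOKEN"))
    print(json.dumps(config, indent=2))
```

Reading the token from the environment keeps it out of version control; the printed JSON can be pasted into the MCP client's configuration.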
Claude Desktop Configuration
To use this MCP server with Claude Desktop, add the following configuration to your Claude Desktop settings:
Location: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows)
{
  "mcpServers": {
    "apifyVisionOCR": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://accelerationengg--vision-ocr-mcp.apify.actor/mcp",
        "--header",
        "Authorization: Bearer YOUR_APIFY_TOKEN"
      ]
    }
  }
}
Steps:
- Open the Claude Desktop configuration file at the location above
- Add the configuration with your Apify API token (replace YOUR_APIFY_TOKEN)
- Save the file
- Restart Claude Desktop
- The vision_ocr tool will now be available in your conversations
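The edit in the first two steps can also be scripted. The sketch below merges the apifyVisionOCR entry into an existing configuration file without touching other servers already listed there. The function name add_vision_ocr_entry is hypothetical, and the script assumes the file, if present, contains valid JSON.

```python
import json
from pathlib import Path

SERVER_URL = "https://accelerationengg--vision-ocr-mcp.apify.actor/mcp"


def add_vision_ocr_entry(config_path: Path, token: str) -> dict:
    """Merge the apifyVisionOCR entry into a Claude Desktop config file."""
    # Start from the existing config, if any, so other servers are kept.
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8") or "{}")

    servers = config.setdefault("mcpServers", {})
    servers["apifyVisionOCR"] = {
        "command": "npx",
        "args": [
            "-y",
            "mcp-remote",
            SERVER_URL,
            "--header",
            f"Authorization: Bearer {token}",
        ],
    }
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

Pass the macOS or Windows path listed above as config_path, then restart Claude Desktop as described in the steps.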
Available Tools
vision_ocr - Extracts structured data from images with language detection and price extraction.
Parameters:
- images (array, required) - List of image URLs, file paths, or base64 strings (max 15)
- output_format (string, optional) - "json" (default) or "toon" for compact output
Returns:
{
  "language_detected": "ur",
  "description_text": "رسید | تاریخ: ۲۰۲۶-۰۱-۰۴ | چائے",
  "price_1": "₨۵۰",
  "price_2": "₨۱۲۵"
}
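Because prices arrive as separate numbered fields, a caller will often want them as a single list. The sketch below gathers every price_N key in numeric order. It assumes the number of price fields varies from image to image, which the example suggests but the source does not state; collect_prices is a hypothetical helper name.

```python
import re

_PRICE_KEY = re.compile(r"^price_(\d+)$")


def collect_prices(data: dict) -> list[str]:
    """Return the price_N values of one result, ordered by N."""
    numbered = []
    for key, value in data.items():
        match = _PRICE_KEY.match(key)
        if match:
            numbered.append((int(match.group(1)), value))
    # Sort on the numeric suffix so price_10 comes after price_2.
    return [value for _, value in sorted(numbered)]
```

The values are returned as the strings the tool produced, including the currency symbol and original-script digits; converting them to numbers is left to the caller.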
Features
✅ Multilingual OCR - Urdu (اردو), Arabic (العربية), English, Chinese (中文), and 100+ languages
✅ Price Detection - Automatically extracts prices from invoices/receipts
✅ Layout Preservation - Maintains tables and columns with "|" separators
✅ Batch Processing - Process up to 15 images in parallel
✅ Fast - 12-15 seconds per image
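The 15-image limit means a larger job has to be split across several calls. A minimal sketch of that splitting follows; the helper name batches is hypothetical, and it assumes that issuing one call per group is acceptable for your workload.

```python
from typing import Iterator, Sequence

MAX_IMAGES_PER_CALL = 15  # limit stated for the images parameter


def batches(images: Sequence[str], size: int = MAX_IMAGES_PER_CALL) -> Iterator[list[str]]:
    """Yield consecutive groups of at most `size` images."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(images), size):
        yield list(images[start:start + size])
```

Each yielded group can be passed as the images value of one request.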
Supported Formats
Images: PNG, JPG, JPEG, WEBP (GIF not supported)
Languages: 100+ including Urdu, Arabic, English, Chinese, Hindi, Spanish, French, German
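Since GIF files are rejected, it can save a round trip to check file extensions before submitting. The sketch below covers URLs and file paths only; base64 input has no extension and would need a different check. The helper name has_supported_extension is hypothetical, and the check looks at the name alone, not the file contents.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def has_supported_extension(image: str) -> bool:
    """Check the file extension of an image URL or path (not base64 input)."""
    # urlparse drops any query string, so "?size=large" does not interfere.
    path = urlparse(image).path
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS
```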
Output Formats
The server supports two output formats optimized for different use cases:
JSON Format (Default)
Standard structured output - easiest to parse and integrate with applications.
Example Output:
{
  "model": "Qwen/Qwen3-VL-30B",
  "image_count": 1,
  "total_time_seconds": 3.91,
  "results": [
    {
      "index": 0,
      "data": {
        "language_detected": "ar",
        "description_text": "TURKISH CORNER Date:6/10/2019 Time:6:56 PM Table:B12 Ticket No:243 -1Homus حمص 1-Mutabel متبل 1-Baba Ghanouj بابا غنوج 1-Fatoush فتوش 1-Olive Salad سلطة زيتون 1-Green Salad سلطة خضراء 1-Grapes Leaves ورق عنب 1-Tabouleh تبولة 1-Vegetable with Youghurt Salad سلطة خضار باللبن 1-Hot Salad سلطة حارة Total: 8.00 Cash 8.00 THANK YOU",
        "price_1": "8.00",
        "price_2": "8.00"
      },
      "processing_time": 3.91
    }
  ]
}
TOON Format (Token-Efficient)
Compact notation that saves ~30% tokens - ideal for LLM processing and cost optimization.
Example Output:
model: Qwen/Qwen3-VL-30B-A3B-Instruct
image_count: 1
total_time_seconds: 3.76
results:[1]{index,data,processing_time}:
  0,{'language_detected': 'ar', 'description_text': 'TURKISH CORNER Date:6/10/2019 Time:6:56 PM Table:B12 Ticket No:243 -1Homus حمص 1-Mutabel متبل 1-Baba Ghanouj بابا غنوج 1-Fatoush فتوش 1-Olive Salad سلطة زيتون 1-Green Salad سلطة خضراء 1-Grapes Leaves ورق عنب 1-Tabouleh تبولة 1-Vegetable with Youghurt Salad سلطة خضار باللبن 1-Hot Salad سلطة حارة Total: 8.00 Cash 8.00 THANK YOU', 'price_1': '8.00', 'price_2': '8.00'},3.76
When to use each format:
- JSON: Standard API integration, automated parsing, strict schema validation
- TOON: Sending to LLMs for analysis, reducing token costs, human-readable logs
Use Cases
- Financial documents: Invoices, receipts, bills
- Multi-column tables: Spreadsheets, reports
- Multilingual documents: Documents with Arabic, Urdu, Chinese, and other scripts
- Form extraction: Structured data from forms
Python API Usage
Installation
$ pip install apify-client
Basic Example
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# Extract text from image
run_input = {
    "images": ["https://example.com/receipt.jpg"],
    "output_format": "json"  # or "toon"
}
run = client.actor("accelerationengg/vision-ocr-mcp").call(run_input=run_input)

# Get results
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    text = item['results'][0]['data']['description_text']
    language = item['results'][0]['data']['language_detected']
    print(f"Language: {language}\nText: {text}")
Batch Processing
# Process multiple images
run_input = {
    "images": [
        "https://example.com/invoice1.jpg",
        "https://example.com/invoice2.jpg",
        "https://example.com/invoice3.jpg"
    ],
    "output_format": "json"
}
run = client.actor("accelerationengg/vision-ocr-mcp").call(run_input=run_input)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    for result in item['results']:
        print(f"Image {result['index']}: {result['data']['description_text'][:100]}...")
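For downstream use, such as writing to a CSV file or a database, it is often handier to turn the nested items into one flat record per image. The sketch below does that for items shaped like the JSON example output; flatten_results is a hypothetical helper, and missing fields fall back to None or an empty string instead of raising.

```python
def flatten_results(items):
    """Yield one flat dict per image from dataset items in the JSON output shape."""
    for item in items:
        for result in item.get("results", []):
            data = result.get("data", {})
            yield {
                "index": result.get("index"),
                "language": data.get("language_detected"),
                "text": data.get("description_text", ""),
                "processing_time": result.get("processing_time"),
            }
```

It accepts any iterable of items, so the iterator returned by iterate_items() can be passed in directly.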
TOON Format for LLM Processing
# Use TOON format to save ~30% tokens
run_input = {
    "images": ["https://example.com/receipt.jpg"],
    "output_format": "toon"
}
run = client.actor("accelerationengg/vision-ocr-mcp").call(run_input=run_input)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    toon_output = item['content']
    # Send directly to Claude for analysis
    # Uses ~30% fewer tokens than JSON
    # response = claude.messages.create(
    #     model="claude-3-5-sonnet-20241022",
    #     max_tokens=1024,  # required by the Messages API
    #     messages=[{
    #         "role": "user",
    #         "content": f"Analyze this receipt:\n{toon_output}"
    #     }]
    # )
Example Usage
Single Image
Extract text from this receipt: https://example.com/receipt.jpg
Multiple Images
Process these invoices:
- https://example.com/invoice1.jpg
- https://example.com/invoice2.jpg
- https://example.com/invoice3.jpg
Built with Qwen-VL, FastMCP, Apify