Pricing

from $0.01 / 1,000 results

Receipt OCR API

Receipt OCR API - Multi-Model Text Extraction : Extract structured data from receipt images using advanced OCR technology with support for multiple AI models including Google Vision, OpenAI, Azure, AWS Textract, Gemini, Hugging Face, DeepSeek, and Native OCR.

Rating

5.0

(4)

Developer

HappiTap

Last modified

4 months ago

🌟 Features

Multi-Model OCR Support

Choose from 8 different OCR engines based on your needs:

Google Vision API - High accuracy, excellent for printed receipts
DeepSeek OCR - Advanced AI-powered text extraction
Amazon Textract - Specialized for document and receipt analysis
Azure AI Vision - Microsoft's computer vision service
OpenAI GPT-4 Vision - State-of-the-art vision model
Hugging Face - Open-source OCR models
Google Gemini - Latest Google multimodal AI
Native (Tesseract.js) - Free, no API key required

Intelligent Data Extraction

Merchant Information: Name, address, contact details
Transaction Details: Date, time, receipt number
Financial Data: Total amount, subtotal, tax, currency
Line Items: Individual items with prices
Payment Method: Credit card, cash, etc.

Advanced Features

✅ Automatic Calculation Verification - Validates totals and tax amounts
📊 Batch Processing - Process multiple receipts simultaneously
🔄 Multi-Format Support - JPG, PNG, PDF files
📋 Structured JSON Output - Machine-readable data format
🎯 High Accuracy - Advanced parsing algorithms

🚀 Quick Start

Input Configuration

{
  "ocrModel": "native",
  "receiptUrls": [
    "https://example.com/receipt1.jpg",
    "https://example.com/receipt2.png"
  ],
  "extractLineItems": true,
  "verifyCalculations": true,
  "outputFormat": "detailed"
}

Required API Keys by Model

Model	Required Fields
Google Vision	`googleVisionApiKey`
DeepSeek OCR	`deepseekApiKey`
Amazon Textract	`awsAccessKeyId`, `awsSecretAccessKey`, `awsRegion`
Azure AI Vision	`azureEndpoint`, `azureApiKey`
OpenAI	`openaiApiKey`
Hugging Face	`huggingfaceApiKey`
Gemini	`geminiApiKey`
Native	None (uses Tesseract.js)
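The table above can be turned into a pre-flight check that runs before the actor is started. The sketch below is only an illustration: `missingCredentials` is a hypothetical helper, not part of the actor, and it assumes the model identifiers listed under Input Parameters correspond one-to-one to the display names in the table.

```javascript
// Required credential fields per model, copied from the table above.
// Keys are the ocrModel identifiers from the Input Parameters section.
const REQUIRED_FIELDS = new Map([
    ['google-vision', ['googleVisionApiKey']],
    ['deepseek-ocr', ['deepseekApiKey']],
    ['amazon-textract', ['awsAccessKeyId', 'awsSecretAccessKey', 'awsRegion']],
    ['azure-vision', ['azureEndpoint', 'azureApiKey']],
    ['openai', ['openaiApiKey']],
    ['huggingface', ['huggingfaceApiKey']],
    ['gemini', ['geminiApiKey']],
    ['native', []],
]);

// Hypothetical helper: returns the names of required fields that are
// missing or empty in the given actor input.
function missingCredentials(input) {
    const model = input.ocrModel ?? 'native'; // documented default
    const required = REQUIRED_FIELDS.get(model);
    if (required === undefined) {
        throw new Error(`Unknown ocrModel: ${model}`);
    }
    return required.filter((field) => !input[field]);
}
```

Calling `missingCredentials({ ocrModel: 'amazon-textract', awsAccessKeyId: 'x' })` reports the two AWS fields that still need to be supplied, which makes a misconfigured run fail fast on the client side.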

📋 Input Parameters

Required Parameters

ocrModel (string) - OCR model to use
- Options: google-vision, deepseek-ocr, amazon-textract, azure-vision, openai, huggingface, gemini, native
- Default: native
receiptUrls (array) - Array of receipt image URLs
- Supports: HTTP/HTTPS URLs, data URLs, Apify key-value store URLs
- Formats: JPG, PNG, PDF
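Since `receiptUrls` accepts data URLs, a local file can be passed without hosting it first. The following is a minimal sketch: `toDataUrl` is a hypothetical helper, and it assumes the 50MB limit quoted in the Performance section means 50 × 1024 × 1024 bytes of raw image data.

```javascript
// Documented maximum image size; the exact byte interpretation is an assumption.
const MAX_IMAGE_BYTES = 50 * 1024 * 1024;

// Hypothetical helper: encodes raw image bytes as a base64 data URL.
function toDataUrl(bytes, mimeType) {
    if (bytes.length > MAX_IMAGE_BYTES) {
        throw new RangeError(`Image is ${bytes.length} bytes, above the 50MB limit`);
    }
    return `data:${mimeType};base64,${Buffer.from(bytes).toString('base64')}`;
}

// Usage (not executed here):
// const fs = require('fs');
// const url = toDataUrl(fs.readFileSync('receipt.jpg'), 'image/jpeg');
// const input = { ocrModel: 'native', receiptUrls: [url] };
```

Base64 inflates the payload by roughly a third, so for large scans an HTTP/HTTPS or key-value store URL is likely the more practical choice.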

Optional Parameters

extractLineItems (boolean) - Extract individual line items
- Default: true
verifyCalculations (boolean) - Verify totals and tax calculations
- Default: true
outputFormat (string) - Output data format
- Options: json (compact), detailed (with metadata)
- Default: detailed
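To see the effective input for a run, the documented defaults can be merged with whatever the caller provides. This is a sketch of that merge on the client side; `withDefaults` is a hypothetical name, and the actor presumably applies the same defaults itself.

```javascript
// Defaults as documented in the parameter lists above.
const INPUT_DEFAULTS = {
    ocrModel: 'native',
    extractLineItems: true,
    verifyCalculations: true,
    outputFormat: 'detailed',
};

// Hypothetical helper: validates the one field without a default and
// fills in the rest. Caller-supplied values win over defaults.
function withDefaults(input) {
    if (!Array.isArray(input.receiptUrls) || input.receiptUrls.length === 0) {
        throw new Error('receiptUrls must be a non-empty array');
    }
    return { ...INPUT_DEFAULTS, ...input };
}
```

For example, `withDefaults({ receiptUrls: [url], outputFormat: 'json' })` keeps the compact output format and fills in `native`, `true`, and `true` for the remaining fields.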

API Keys (Model-Specific)

Configure the appropriate API keys based on your selected OCR model. See the table above for required fields.

📤 Output Format

Detailed Output Example

{
  "receiptUrl": "https://example.com/receipt.jpg",
  "ocrModel": "google-vision",
  "success": true,
  "extractedAt": "2024-01-15T10:30:00.000Z",
  "merchantName": "SuperMart Store",
  "merchantAddress": "123 Main Street, City, State 12345",
  "date": "01/15/2024",
  "time": "10:25 AM",
  "receiptNumber": "TXN-12345",
  "currency": "USD",
  "subtotal": 45.50,
  "tax": 3.64,
  "totalAmount": 49.14,
  "paymentMethod": "Credit Card",
  "lineItems": [
    {
      "name": "Product A",
      "price": 15.99
    },
    {
      "name": "Product B",
      "price": 29.51
    }
  ],
  "calculationVerification": {
    "isValid": true,
    "errors": []
  },
  "metadata": {
    "confidence": 0.95,
    "processingTime": 1234,
    "imageSize": 524288
  },
  "rawText": "SuperMart Store\n123 Main Street..."
}
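The `calculationVerification` object can be cross-checked on the consumer side. The actor's own verification rules are not documented here, so the sketch below assumes two plausible checks (subtotal plus tax equals the total, and line items sum to the subtotal) and mirrors the `isValid`/`errors` shape from the example. `checkReceiptTotals` is a hypothetical helper.

```javascript
// Convert a decimal amount to integer cents to avoid floating-point drift.
const toCents = (amount) => Math.round(amount * 100);

// Hypothetical helper: re-checks the arithmetic of one result item.
function checkReceiptTotals(receipt) {
    const errors = [];
    const { subtotal, tax, totalAmount, lineItems = [] } = receipt;

    // Assumed rule 1: subtotal + tax must equal totalAmount.
    if ([subtotal, tax, totalAmount].every((v) => typeof v === 'number')) {
        if (toCents(subtotal) + toCents(tax) !== toCents(totalAmount)) {
            errors.push('subtotal + tax does not equal totalAmount');
        }
    }

    // Assumed rule 2: line item prices must sum to the subtotal.
    if (lineItems.length > 0 && typeof subtotal === 'number') {
        const itemsCents = lineItems.reduce((sum, item) => sum + toCents(item.price), 0);
        if (itemsCents !== toCents(subtotal)) {
            errors.push('line items do not sum to subtotal');
        }
    }

    return { isValid: errors.length === 0, errors };
}
```

Applied to the example above, both checks pass: 45.50 + 3.64 = 49.14, and 15.99 + 29.51 = 45.50.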

🎯 Use Cases

Expense Management

Automate receipt data entry for expense reports
Track business expenses in real-time
Integrate with accounting software
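As one possible route into accounting software, results can be flattened to CSV, a format most such tools import. This sketch uses field names from the Detailed Output Example; the column selection and the `receiptsToCsv` helper are assumptions, not something the actor provides.

```javascript
// Columns taken from the Detailed Output Example; the selection is arbitrary.
const CSV_COLUMNS = [
    'merchantName', 'date', 'currency', 'subtotal', 'tax', 'totalAmount', 'paymentMethod',
];

// Quote a value only when it contains a comma, quote, or line break.
function csvEscape(value) {
    const text = value === undefined || value === null ? '' : String(value);
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Hypothetical helper: one header row plus one row per receipt.
function receiptsToCsv(receipts) {
    const rows = receipts.map((receipt) =>
        CSV_COLUMNS.map((column) => csvEscape(receipt[column])).join(','));
    return [CSV_COLUMNS.join(','), ...rows].join('\r\n');
}
```

Missing fields become empty cells, so partially extracted receipts still produce a well-formed row.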

Accounting & Bookkeeping

Digitize paper receipts for record keeping
Verify transaction details automatically
Generate financial reports from receipt data

E-commerce & Retail

Receipt verification systems
Customer purchase tracking
Warranty and return management

Fintech Applications

Personal finance tracking apps
Budget management tools
Tax preparation software

🔧 Model Comparison

Model	Speed	Accuracy	Cost	Best For
Native	⚡⚡⚡	⭐⭐⭐	Free	Testing, low volume
Google Vision	⚡⚡	⭐⭐⭐⭐⭐	$$	High accuracy needs
Amazon Textract	⚡⚡	⭐⭐⭐⭐⭐	$$	Receipt-specific
OpenAI	⚡	⭐⭐⭐⭐⭐	$$$	Complex receipts
Azure Vision	⚡⚡	⭐⭐⭐⭐	$$	Microsoft ecosystem
Gemini	⚡⚡	⭐⭐⭐⭐	$$	Latest AI tech
DeepSeek	⚡⚡	⭐⭐⭐⭐	$$	Alternative to OpenAI
Hugging Face	⚡⚡	⭐⭐⭐	$	Open-source models

🔐 Security & Privacy

All API keys are stored securely as secrets
Images are processed in memory and not permanently stored
Supports private/internal image URLs
GDPR and data privacy compliant

💡 Tips for Best Results

Image Quality: Use high-resolution, well-lit images
Format: Straight, unfolded receipts work best
Contrast: Ensure good contrast between text and background
Model Selection:
- Use Native for testing and low-volume processing
- Use Google Vision or Textract for production workloads
- Use OpenAI for complex or damaged receipts

🐛 Error Handling

The actor handles various error scenarios:

Invalid or unreachable image URLs
OCR processing failures
Missing or invalid API keys
Malformed receipt data

Each result includes a success field and, when processing fails, an error message.
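Because every result carries a `success` flag, a consumer can separate good results from failures and collect the URLs worth retrying. The sketch below relies only on `success` and `receiptUrl`, both shown in the output example; the name of the error message field is not documented here, so it is deliberately not read. `partitionResults` is a hypothetical helper.

```javascript
// Hypothetical helper: splits dataset items by their success flag.
function partitionResults(items) {
    const succeeded = [];
    const failed = [];
    for (const item of items) {
        (item.success === true ? succeeded : failed).push(item);
    }
    // URLs of failed receipts, e.g. for a retry with a different ocrModel.
    const retryUrls = failed.map((item) => item.receiptUrl);
    return { succeeded, failed, retryUrls };
}
```

A retry with a different model, for instance OpenAI for damaged receipts as suggested in the tips, can then be started with `retryUrls` as the new `receiptUrls`.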

📊 Batch Processing

Process multiple receipts in a single run:

{
  "ocrModel": "google-vision",
  "receiptUrls": [
    "https://example.com/receipt1.jpg",
    "https://example.com/receipt2.jpg",
    "https://example.com/receipt3.jpg"
  ]
}

The actor will process all receipts and provide individual results for each.
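For long URL lists, one option is to split the work into several runs. Whether this is necessary is not stated in this document; the sketch below simply assumes a batch size of 10, matching the concurrency figure in the Performance section. `chunkUrls` and `buildBatchInputs` are hypothetical helpers.

```javascript
// Split an array of URLs into consecutive batches of at most `size` entries.
function chunkUrls(urls, size = 10) {
    if (!Number.isInteger(size) || size < 1) {
        throw new RangeError('size must be a positive integer');
    }
    const batches = [];
    for (let i = 0; i < urls.length; i += size) {
        batches.push(urls.slice(i, i + size));
    }
    return batches;
}

// Build one actor input per batch, reusing shared settings such as
// ocrModel and API keys from `baseInput`.
function buildBatchInputs(urls, baseInput, size = 10) {
    return chunkUrls(urls, size).map((receiptUrls) => ({ ...baseInput, receiptUrls }));
}
```

Each returned object can be passed as the input of a separate run, and the per-run datasets merged afterwards.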

🔗 Integration

API Integration

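const { ApifyClient } = require('apify-client');

// Authenticate with your Apify API token.
const client = new ApifyClient({
    token: 'YOUR_API_TOKEN',
});

async function main() {
    // Start the actor and wait for the run to finish.
    const run = await client.actor('YOUR_ACTOR_ID').call({
        ocrModel: 'google-vision',
        googleVisionApiKey: 'YOUR_GOOGLE_API_KEY',
        receiptUrls: ['https://example.com/receipt.jpg'],
    });

    // Read the per-receipt results from the run's default dataset.
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items);
}

main().catch(console.error);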

Webhook Integration

Configure webhooks to receive results automatically when processing completes.
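One way to attach a webhook is per run, through the `webhooks` option that apify-client accepts when starting an actor. The sketch below only builds the webhook definition; the endpoint URL is a placeholder, `buildWebhooks` is a hypothetical helper, and the choice of events is an assumption about what "processing completes" should cover.

```javascript
// Hypothetical helper: builds an ad hoc webhook list for a single run.
function buildWebhooks(requestUrl) {
    // Throws a TypeError if requestUrl is not a valid absolute URL.
    const url = new URL(requestUrl);
    return [
        {
            // Notify on both outcomes so failed runs are not silently lost.
            eventTypes: ['ACTOR.RUN.SUCCEEDED', 'ACTOR.RUN.FAILED'],
            requestUrl: url.toString(),
        },
    ];
}

// Usage with apify-client (not executed here):
// const run = await client.actor('YOUR_ACTOR_ID').start(input, {
//     webhooks: buildWebhooks('https://example.com/apify-webhook'),
// });
```

With `start` the call returns as soon as the run is created, and the endpoint receives a request once the run finishes, so the client does not need to poll.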

📈 Performance

Processing Speed: 2-10 seconds per receipt (varies by model)
Concurrent Processing: Up to 10 receipts simultaneously
Maximum Image Size: 50MB per image
Supported Formats: JPG, PNG, PDF

🆘 Support

For issues, questions, or feature requests:

Check the Apify documentation
Review the input schema for parameter details
Ensure API keys are valid and have sufficient quota

🔄 Version History

v1.0.0

Initial release
Support for 8 OCR models
Intelligent receipt parsing
Batch processing
Calculation verification
Multi-format image support

Built with ❤️ using Apify Platform
