Ocr Pdf Extractor

Pricing

from $2.00 / 1,000 results

Extract text from images and PDFs using OCR. Supports multiple languages including English, Portuguese, Spanish, French, German. Uses Tesseract OCR engine with high accuracy text extraction and word-level confidence scores.

Developer

Fabio Suizu

Maintained by Community

OCR & PDF Text Extractor

Extract text from images and PDFs with OCR. Support for 12+ languages, form extraction, and table detection. Powered by Azure AI.

Features

  • Fast Processing: OCR and PDF text extraction powered by Azure
  • Reliable: 99.9% uptime with automatic failover
  • Scalable: Handle single requests or bulk operations
  • Secure: Enterprise-grade security with API key authentication
  • Well Documented: Comprehensive API documentation and examples

Use Cases

  • E-commerce: Process product images at scale
  • Media: Automate image processing pipelines
  • Apps: Add image processing to your applications

Input Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| fileUrl | string | No | URL to download image or PDF |
| fileUrls | array | No | Array of URLs for bulk extraction |
| language | string | No | OCR language code |
| backend | string | No | OCR engine to use |
| extractForms | boolean | No | Extract form fields (key-value pairs) |
| mode | string | No | Extraction mode |
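
The parameters above can be assembled into an input object with a small helper. This is only a sketch: the function name `build_ocr_input` is hypothetical, the default values are copied from the code examples in this document rather than from confirmed Actor defaults, and the check that at least one URL is present is an assumption, since the table marks both `fileUrl` and `fileUrls` as optional.

```python
from typing import Any, Optional


def build_ocr_input(
    file_url: Optional[str] = None,
    file_urls: Optional[list[str]] = None,
    language: str = "eng",
    backend: str = "auto",
    extract_forms: bool = False,
    mode: str = "single",
) -> dict[str, Any]:
    """Assemble an input object using the parameter names from the table."""
    # Assumption: the Actor needs at least one source URL, even though the
    # table marks both fileUrl and fileUrls as optional.
    if not file_url and not file_urls:
        raise ValueError("Provide file_url or file_urls")

    run_input: dict[str, Any] = {
        "language": language,
        "backend": backend,
        "extractForms": extract_forms,
        "mode": mode,
    }
    # Only include the URL fields that were actually supplied.
    if file_url:
        run_input["fileUrl"] = file_url
    if file_urls:
        run_input["fileUrls"] = list(file_urls)
    return run_input


# Placeholder URL for illustration only.
example = build_ocr_input(file_url="https://example.com/invoice.pdf")
```

The resulting dictionary can be passed as the run input in the Python example further down.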

Output Format

{
  "success": true,
  "result": { ... },
  "timestamp": "2026-01-07T00:00:00Z"
}
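
A consumer will typically want to check `success` before touching `result`. The sketch below shows one way to do that for a single output item. The helper name `unwrap_result` is hypothetical, and because the shape of `result` is elided above, the sample uses an empty object as a stand-in.

```python
from datetime import datetime
from typing import Any


def unwrap_result(item: dict[str, Any]) -> tuple[Any, datetime]:
    """Return (result, timestamp) from one output item, or raise on failure."""
    if not item.get("success"):
        raise RuntimeError("OCR item reported success=false")
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    ts = datetime.fromisoformat(item["timestamp"].replace("Z", "+00:00"))
    return item["result"], ts


# Stand-in item: the real contents of "result" are not documented here.
sample = {"success": True, "result": {}, "timestamp": "2026-01-07T00:00:00Z"}
result, ts = unwrap_result(sample)
print(ts.isoformat())  # 2026-01-07T00:00:00+00:00
```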

Code Examples

JavaScript (Node.js)

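import { ApifyClient } from 'apify-client';

// Replace with your own Apify API token.
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const input = {
  // Placeholder URL; point this at your own image or PDF.
  "fileUrl": "https://example.com/document.pdf",
  "fileUrls": [],
  "language": "eng",
  "backend": "auto",
  "extractForms": false,
  "mode": "single"
};

// Start the Actor and wait for the run to finish.
const run = await client.actor("vivid_astronaut/ocr-api").call(input);

// Read the results from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);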

Python

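from apify_client import ApifyClient

# Replace with your own Apify API token.
client = ApifyClient("YOUR_API_TOKEN")

run_input = {
    # Placeholder URL; point this at your own image or PDF.
    "fileUrl": "https://example.com/document.pdf",
    "fileUrls": [],
    "language": "eng",
    "backend": "auto",
    "extractForms": False,
    "mode": "single",
}

# Start the Actor and wait for the run to finish.
run = client.actor("vivid_astronaut/ocr-api").call(run_input=run_input)

# Iterate over the results in the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)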

cURL

curl -X POST "https://api.apify.com/v2/acts/vivid_astronaut~ocr-api/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "fileUrl": "https://example.com/document.pdf",
    "fileUrls": [],
    "language": "eng",
    "backend": "auto",
    "extractForms": false,
    "mode": "single"
  }'
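
For environments where installing `apify-client` is not an option, the same POST request as the cURL call can be built with the Python standard library. This is a minimal sketch: `build_run_request` is a hypothetical helper, the input URL is a placeholder, and the snippet only constructs the request. Sending it and handling the response are left to the caller.

```python
import json
import urllib.request
from urllib.parse import quote

# Actor identifier as it appears in the cURL example.
ACTOR_ID = "vivid_astronaut~ocr-api"


def build_run_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build the same POST request that the cURL example sends."""
    url = f"https://api.apify.com/v2/acts/{ACTOR_ID}/runs?token={quote(token)}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request(
    "YOUR_API_TOKEN",
    {"fileUrl": "https://example.com/document.pdf", "language": "eng"},
)
# To start a run, pass the request to urllib.request.urlopen(request).
```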

Pricing

Model: Pay per result
Price: $0.020 per result

You only pay for successful results. Platform usage costs are included.
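
As a rough budgeting aid, the per-result price can be multiplied by the number of successful results. The sketch below assumes the $0.020 rate quoted in this section. The rate is a parameter because the listing header quotes a different "from" price, and `estimate_cost` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal

# Rate taken from "Price: $0.020 per result" in this section.
PRICE_PER_RESULT = Decimal("0.020")


def estimate_cost(successful_results: int, price: Decimal = PRICE_PER_RESULT) -> Decimal:
    """Estimate the charge in USD for a number of successful results."""
    if successful_results < 0:
        raise ValueError("successful_results must be non-negative")
    # Round to whole cents for display.
    return (price * successful_results).quantize(Decimal("0.01"))


print(estimate_cost(250))  # 5.00
```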

API Documentation

Full API documentation is available at:

Support

Version History

See ./CHANGELOG.md for version history.


Powered by Azure Cloud Infrastructure