Ocr Pdf Extractor
Pricing
from $2.00 / 1,000 results
Extract text from images and PDFs using OCR. Supports multiple languages, including English, Portuguese, Spanish, French, and German. Uses the Tesseract OCR engine for high-accuracy text extraction with word-level confidence scores.
Developer

Fabio Suizu
Actor stats
- Bookmarked: 0
- Total users: 3
- Monthly active users: 1
- Last modified: 7 hours ago
OCR & PDF Text Extractor
Extract text from images and PDFs with OCR. Support for 12+ languages, form extraction, and table detection. Powered by Azure AI.
Features
- Fast Processing: Fast OCR and PDF text extraction powered by Azure
- Reliable: 99.9% uptime with automatic failover
- Scalable: Handle single requests or bulk operations
- Secure: Enterprise-grade security with API key authentication
- Well Documented: Comprehensive API documentation and examples
Use Cases
- E-commerce: Process product images at scale
- Media: Automate image processing pipelines
- Apps: Add image processing to your applications
Input Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| fileUrl | string | No | URL to download image or PDF |
| fileUrls | array | No | Array of URLs for bulk extraction |
| language | string | No | OCR language code |
| backend | string | No | OCR engine to use |
| extractForms | boolean | No | Extract form fields (key-value pairs) |
| mode | string | No | Extraction mode |
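The parameters in the table can be assembled into an input object before starting a run. The following is a minimal sketch, not the Actor's own validation logic. `build_input` is a hypothetical helper. Its defaults (`"eng"`, `"auto"`, `False`, `"single"`) are copied from the code examples on this page, and the requirement that at least one source URL is present is an assumption, since the table marks both `fileUrl` and `fileUrls` as optional.

```python
from typing import Any, Dict, List, Optional


def build_input(
    file_url: Optional[str] = None,
    file_urls: Optional[List[str]] = None,
    language: str = "eng",
    backend: str = "auto",
    extract_forms: bool = False,
    mode: str = "single",
) -> Dict[str, Any]:
    """Assemble an input object using the parameter names from the table."""
    # Assumption: a run needs at least one source URL, even though the
    # table lists both fileUrl and fileUrls as not required.
    if not file_url and not file_urls:
        raise ValueError("provide file_url or file_urls")

    payload: Dict[str, Any] = {
        "language": language,
        "backend": backend,
        "extractForms": extract_forms,
        "mode": mode,
    }
    # Only include the URL fields that were actually supplied.
    if file_url:
        payload["fileUrl"] = file_url
    if file_urls:
        payload["fileUrls"] = list(file_urls)
    return payload


if __name__ == "__main__":
    print(build_input(file_url="https://example.com/invoice.pdf"))
```

The accepted values for `language`, `backend`, and `mode` are not listed on this page, so the sketch passes them through unchanged rather than checking them.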
Output Format
{
  "success": true,
  "result": { ... },
  "timestamp": "2026-01-07T00:00:00Z"
}
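A dataset item with this envelope can be unpacked as sketched below. `parse_item` is a hypothetical helper, and treating `success: false` as an error is an assumption, because the page does not document what a failed item looks like. The contents of `result` are elided in the source, so the sketch returns them untouched.

```python
from datetime import datetime
from typing import Any, Dict, Tuple


def parse_item(item: Dict[str, Any]) -> Tuple[Any, datetime]:
    """Return (result, timestamp) from one output item."""
    # Assumption: a falsy "success" flag marks a failed extraction.
    if not item.get("success"):
        raise RuntimeError(f"extraction failed: {item!r}")

    # Older Python versions do not accept a trailing "Z" in
    # datetime.fromisoformat, so rewrite it as an explicit UTC offset.
    timestamp = datetime.fromisoformat(item["timestamp"].replace("Z", "+00:00"))
    return item.get("result"), timestamp


if __name__ == "__main__":
    sample = {"success": True, "result": {}, "timestamp": "2026-01-07T00:00:00Z"}
    result, ts = parse_item(sample)
    print(result, ts.isoformat())
```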
Code Examples
JavaScript (Node.js)
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const input = {
    "fileUrl": "example_fileUrl",
    "fileUrls": [],
    "language": "eng",
    "backend": "auto",
    "extractForms": false,
    "mode": "single"
};

const run = await client.actor("vivid_astronaut/ocr-api").call(input);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);
Python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Python booleans are capitalized: False, not false.
run_input = {
    "fileUrl": "example_fileUrl",
    "fileUrls": [],
    "language": "eng",
    "backend": "auto",
    "extractForms": False,
    "mode": "single",
}

# Start the Actor and wait for the run to finish.
run = client.actor("vivid_astronaut/ocr-api").call(run_input=run_input)

# Print each item from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
cURL
curl -X POST "https://api.apify.com/v2/acts/vivid_astronaut~ocr-api/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"fileUrl": "example_fileUrl","fileUrls": [],"language": "eng","backend": "auto","extractForms": false,"mode": "single"}'
Pricing
Model: Pay per result
Price: $0.020 per result
You only pay for successful results. Platform usage costs are included.
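Under pay-per-result pricing, the cost of a job is the number of successful results times the per-result price. The sketch below works that out with a hypothetical `estimate_cost` helper. The price is a parameter rather than a constant because this section quotes $0.020 per result while the listing header quotes "from $2.00 / 1,000 results", and the page does not say which applies to a given plan.

```python
from decimal import Decimal, ROUND_HALF_UP


def estimate_cost(results: int, price_per_result: Decimal) -> Decimal:
    """Estimate the charge in USD for a number of successful results."""
    if results < 0:
        raise ValueError("results must be non-negative")
    total = price_per_result * results
    # Round to whole cents for display.
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # Price quoted in this section: $0.020 per result.
    print(estimate_cost(1000, Decimal("0.020")))  # 20.00
    # Price implied by the listing header: $2.00 per 1,000 results.
    print(estimate_cost(1000, Decimal("0.002")))  # 2.00
```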
API Documentation
Full API documentation is available at:
Support
- Issues: Report bugs via Apify Console
- Documentation: Apify Docs
- Community: Apify Discord
Version History
See ./CHANGELOG.md for version history.
Powered by Azure Cloud Infrastructure