OCR Structured Extractor (AI) — Image/PDF → OCR Text + JSON
Pricing
Pay per usage
Extract OCR text and structured JSON from an image or PDF URL. Great for invoices, receipts, forms, IDs, and tables. Powered by Gemini 3 Pro.
Developer
Anass Seb
Maintained by Community
Extract OCR text and structured fields from an image URL or PDF URL using Gemini 3 Pro (through the same proxy Worker used by the other AI actors in this repo).
Keywords (SEO)
ocr api, pdf ocr, image ocr, pdf to json, image to json, invoice ocr, receipt ocr, form extraction, document understanding, gemini ocr, structured extraction, data extraction, ai document parser, id card ocr, table extraction
How it works
- Downloads your file from `fileUrl`
- Sends the bytes as `inlineData` to `models/{model}:generateContent` (JSON mode)
- Parses the model response and outputs:
  - `text`: full OCR transcription
  - `data`: structured fields (either a default structure or your `extractionSchema`)
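The steps above can be sketched roughly as follows. This is an illustration, not the Actor's actual source: the function names are hypothetical, the request field names (`inlineData`, `generationConfig.responseMimeType`) follow the public Gemini REST API, and the sketch assumes the model is prompted to reply with a JSON object that has top-level `text` and `data` keys. The proxy Worker's URL and authentication are not described here, so the network call itself is left out.

```python
import base64
import json


def build_generate_content_body(file_bytes: bytes, mime_type: str, instructions: str) -> dict:
    """Build a JSON-mode generateContent request body with the file inlined."""
    return {
        "contents": [
            {
                "parts": [
                    # The downloaded file travels inside the request as base64 text.
                    {
                        "inlineData": {
                            "mimeType": mime_type,
                            "data": base64.b64encode(file_bytes).decode("ascii"),
                        }
                    },
                    {"text": instructions},
                ]
            }
        ],
        # JSON mode: ask the model to answer with a JSON document.
        "generationConfig": {"responseMimeType": "application/json"},
    }


def parse_model_reply(response: dict) -> dict:
    """Extract `text` and `data` from a generateContent-style response.

    Assumption: the first candidate's first part holds a JSON string
    with top-level `text` and `data` keys.
    """
    raw = response["candidates"][0]["content"]["parts"][0]["text"]
    parsed = json.loads(raw)
    return {"text": parsed.get("text", ""), "data": parsed.get("data", {})}
```

Keeping payload construction and reply parsing as pure functions makes both easy to test without calling the model.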
Best for
- Invoices, receipts, and utility bills (key-value extraction)
- Forms and screenshots (clean OCR + structured fields)
- PDFs that mix text, tables, and images (document understanding)
- Identity documents (IDs, passports) and card-style layouts
Input
- `fileUrl` (string, required): Public URL to an image (png/jpg/webp) or a PDF
- `instructions` (string, optional): Extraction instructions for the model
- `extractionSchema` (object, optional): JSON object describing the structure you want in `data`
- `model` (string, default `gemini-3-pro-preview`)
- `maxBytes` (int, default `52428800`): Max size to download (PDF inline is commonly limited to 50MB)
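To make the defaults concrete, here is a minimal sketch of how such an input could be validated and filled in. The helper `normalize_input` is hypothetical and the Actor's real validation may differ. Only the field names and default values come from the list above.

```python
DEFAULT_MODEL = "gemini-3-pro-preview"
DEFAULT_MAX_BYTES = 52_428_800  # 50 * 1024 * 1024 bytes


def normalize_input(raw: dict) -> dict:
    """Check required fields and apply the documented defaults."""
    file_url = raw.get("fileUrl")
    if not isinstance(file_url, str) or not file_url.startswith(("http://", "https://")):
        raise ValueError("fileUrl is required and must be an http(s) URL")

    schema = raw.get("extractionSchema")
    if schema is not None and not isinstance(schema, dict):
        raise ValueError("extractionSchema must be a JSON object")

    return {
        "fileUrl": file_url,
        "instructions": raw.get("instructions") or "",
        "extractionSchema": schema,
        "model": raw.get("model") or DEFAULT_MODEL,
        "maxBytes": int(raw.get("maxBytes") or DEFAULT_MAX_BYTES),
    }
```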
Supported file types
- Images: `image/png`, `image/jpeg`, `image/webp` (and other `image/*` types if the server reports a correct MIME type)
- Documents: `application/pdf`
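If you want to pre-check a URL before starting a run, the rule above can be approximated with a small helper. `is_supported_mime` is a hypothetical name, and the check only mirrors the list in this section. How the Actor itself inspects the `Content-Type` header is not documented here.

```python
def is_supported_mime(content_type: str) -> bool:
    """Return True for application/pdf and any image/* MIME type."""
    # Drop parameters such as "; charset=binary" and normalize case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime == "application/pdf":
        return True
    # Require a non-empty subtype after "image/".
    return mime.startswith("image/") and len(mime) > len("image/")
```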
Output
The Actor stores:
- Dataset: one item with `fileUrl`, `mimeType`, `text`, and `data`
- Key-value store exports:
  - `ocr.json` (full JSON output)
  - `ocr.txt` (OCR text only, if available)
Dataset item example:
{
  "fileUrl": "https://example.com/invoice.pdf",
  "mimeType": "application/pdf",
  "model": "gemini-3-pro-preview",
  "text": "Invoice #INV-1002 ...",
  "data": {
    "summary": "Invoice from ACME Corp for January services.",
    "key_value_pairs": [
      { "key": "Invoice Number", "value": "INV-1002" },
      { "key": "Total", "value": "$1,249.00" }
    ]
  }
}
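When no `extractionSchema` is supplied, the example above suggests that `data` carries a `key_value_pairs` list. A small, hypothetical helper like the one below can flatten that list into a dictionary for easier lookups. It assumes the default structure shown in the example and returns an empty dictionary if the list is absent.

```python
def key_values_to_dict(item: dict) -> dict:
    """Flatten data.key_value_pairs from a dataset item into {key: value}."""
    pairs = (item.get("data") or {}).get("key_value_pairs") or []
    # Skip malformed entries that lack either field.
    return {p["key"]: p["value"] for p in pairs if "key" in p and "value" in p}


item = {
    "fileUrl": "https://example.com/invoice.pdf",
    "data": {
        "summary": "Invoice from ACME Corp for January services.",
        "key_value_pairs": [
            {"key": "Invoice Number", "value": "INV-1002"},
            {"key": "Total", "value": "$1,249.00"},
        ],
    },
}
print(key_values_to_dict(item))
# {'Invoice Number': 'INV-1002', 'Total': '$1,249.00'}
```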
Example input (custom schema)
{
  "fileUrl": "https://example.com/receipt.jpg",
  "instructions": "Extract receipt line items and totals. Return ONLY JSON.",
  "extractionSchema": {
    "merchant": "string",
    "date": "string",
    "currency": "string",
    "total": "string",
    "items": [
      { "name": "string", "qty": "string", "price": "string" }
    ]
  }
}
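Model output does not always follow a requested structure exactly, so it can help to check the returned `data` against the schema you sent. The sketch below is one possible client-side check, not part of the Actor. `missing_fields` is a hypothetical helper that assumes the schema style used in the example above: nested objects, plus one-element lists describing the shape of each list entry.

```python
def missing_fields(schema, data, path=""):
    """Return the paths of schema keys that are absent from data."""
    missing = []
    if isinstance(schema, dict):
        if not isinstance(data, dict):
            return [path or "<root>"]
        for key, sub_schema in schema.items():
            child = f"{path}.{key}" if path else key
            if key not in data:
                missing.append(child)
            else:
                missing.extend(missing_fields(sub_schema, data[key], child))
    elif isinstance(schema, list) and schema:
        if not isinstance(data, list):
            return [path or "<root>"]
        # The first schema element describes every entry of the list.
        for index, element in enumerate(data):
            missing.extend(missing_fields(schema[0], element, f"{path}[{index}]"))
    # Leaf descriptors such as "string" are not type-checked here.
    return missing
```

An empty result means every requested key is present. Value types are deliberately not validated, since the schema only names them informally.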
Prompt tips
- For invoices/receipts: ask for `merchant`, `invoice_number`, `date`, `currency`, `subtotal`, `tax`, `total`, `items[]`
- For IDs: ask for `full_name`, `document_number`, `dob`, `expiry_date`
- If the document has tables, ask for `rows` with normalized columns