OCR Structured Extractor (AI) — Image/PDF → OCR Text + JSON avatar
OCR Structured Extractor (AI) — Image/PDF → OCR Text + JSON

Pricing

Pay per usage

Go to Apify Store
OCR Structured Extractor (AI) — Image/PDF → OCR Text + JSON

OCR Structured Extractor (AI) — Image/PDF → OCR Text + JSON

Extract OCR text and structured JSON from an image or PDF URL. Great for invoices, receipts, forms, IDs, and tables. Powered by Gemini 3 Pro.

Pricing

Pay per usage

Rating

0.0

(0)

Developer

Anass Seb

Anass Seb

Maintained by Community

Actor stats

0

Bookmarked

1

Total users

0

Monthly active users

6 hours ago

Last modified

Share

OCR Structured Extractor (AI) — Image/PDF → OCR Text + Structured JSON

OCR Structured Extractor icon

OCR Structured Extractor banner

Extract OCR text and structured fields from an image URL or PDF URL using Gemini 3 Pro (through the same proxy Worker used by the other AI actors in this repo).

Keywords (SEO)

ocr api, pdf ocr, image ocr, pdf to json, image to json, invoice ocr, receipt ocr, form extraction, document understanding, gemini ocr, structured extraction, data extraction, ai document parser, id card ocr, table extraction

How it works

  1. Downloads your file from fileUrl
  2. Sends the bytes as inlineData to models/{model}:generateContent (JSON mode)
  3. Parses the model response and outputs:
    • text: full OCR transcription
    • data: structured fields (either a default structure or your extractionSchema)

Best for

  • Invoices, receipts, and utility bills (key-value extraction)
  • Forms and screenshots (clean OCR + structured fields)
  • PDFs that mix text, tables, and images (document understanding)
  • Identity documents (IDs, passports) and card-style layouts

Input

  • fileUrl (string, required): Public URL to an image (png/jpg/webp) or a PDF
  • instructions (string, optional): Extraction instructions for the model
  • extractionSchema (object, optional): JSON object describing the structure you want in data
  • model (string, default gemini-3-pro-preview)
  • maxBytes (int, default 52428800): Max size to download (PDF inline is commonly limited to 50MB)

Supported file types

  • Images: image/png, image/jpeg, image/webp (and other image/* types if the server reports a correct MIME type)
  • Documents: application/pdf

Output

The Actor stores:

  • Dataset: one item with fileUrl, mimeType, text, and data
  • Key-value store exports:
    • ocr.json (full JSON output)
    • ocr.txt (OCR text only, if available)

Dataset item example:

{
"fileUrl": "https://example.com/invoice.pdf",
"mimeType": "application/pdf",
"model": "gemini-3-pro-preview",
"text": "Invoice #INV-1002 ...",
"data": {
"summary": "Invoice from ACME Corp for January services.",
"key_value_pairs": [
{ "key": "Invoice Number", "value": "INV-1002" },
{ "key": "Total", "value": "$1,249.00" }
]
}
}

Example input (custom schema)

{
"fileUrl": "https://example.com/receipt.jpg",
"instructions": "Extract receipt line items and totals. Return ONLY JSON.",
"extractionSchema": {
"merchant": "string",
"date": "string",
"currency": "string",
"total": "string",
"items": [
{ "name": "string", "qty": "string", "price": "string" }
]
}
}

Prompt tips

  • For invoices/receipts: ask for merchant, invoice_number, date, currency, subtotal, tax, total, items[]
  • For IDs: ask for full_name, document_number, dob, expiry_date
  • If the document has tables, ask for rows with normalized columns