Pdf OCR API
Pricing
Pay per event
Extract text from PDF documents and convert it to JSON, plain text, or Markdown using optical character recognition (OCR), with a choice of several OCR and AI models.
ocrModel (Enum, Required)
Select the OCR model to use for text extraction
Allowed values: "google-vision", "deepseek-ocr", "amazon-textract", "azure-vision", "openai", "huggingface", "gemini", "native"
Default value of this property is "native"
pdfUrls (array, Required)
Array of PDF document URLs to process
Default value of this property is ["https://pdfobject.com/pdf/sample.pdf"]
language (Enum, Optional)
Primary language of the documents
Allowed values: "eng", "spa", "fra", "deu", "ita", "por", "rus", "chi_sim", "jpn", "kor", "ara", "dan"
Default value of this property is "eng"
preserveFormatting (boolean, Optional)
Maintain document layout and structure
Default value of this property is true
extractImages (boolean, Optional)
Extract and process images from PDF
Default value of this property is false
outputFormat (Enum, Optional)
Format for extracted text
Allowed values: "json", "text", "markdown"
Default value of this property is "json"
pageRange (string, Optional)
Specific pages to process (e.g., '1-5' or '1,3,5' or 'all')
Default value of this property is "all"
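
The parameters above can be assembled into a single input object. The following is a minimal sketch, assuming the service accepts a JSON object keyed by the property names listed here; the helper name build_input is hypothetical and not part of the API, and how the object is actually submitted (endpoint, authentication) is not covered by this page.

```python
import json

# Allowed values and defaults copied from the parameter list above.
OCR_MODELS = {"google-vision", "deepseek-ocr", "amazon-textract", "azure-vision",
              "openai", "huggingface", "gemini", "native"}
LANGUAGES = {"eng", "spa", "fra", "deu", "ita", "por", "rus",
             "chi_sim", "jpn", "kor", "ara", "dan"}
OUTPUT_FORMATS = {"json", "text", "markdown"}


def build_input(pdf_urls, ocr_model="native", language="eng",
                preserve_formatting=True, extract_images=False,
                output_format="json", page_range="all"):
    """Hypothetical helper: check values locally and return the input as a dict."""
    if not isinstance(pdf_urls, (list, tuple)) or not pdf_urls:
        raise ValueError("pdfUrls must be a non-empty list of URL strings")
    if not all(isinstance(u, str) and u for u in pdf_urls):
        raise ValueError("every entry in pdfUrls must be a non-empty string")
    if ocr_model not in OCR_MODELS:
        raise ValueError(f"unsupported ocrModel: {ocr_model!r}")
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported outputFormat: {output_format!r}")
    if not isinstance(page_range, str) or not page_range:
        # The page only gives examples ('1-5', '1,3,5', 'all'), so the
        # string is passed through without further parsing.
        raise ValueError("pageRange must be a non-empty string")
    return {
        "ocrModel": ocr_model,
        "pdfUrls": list(pdf_urls),
        "language": language,
        "preserveFormatting": bool(preserve_formatting),
        "extractImages": bool(extract_images),
        "outputFormat": output_format,
        "pageRange": page_range,
    }


if __name__ == "__main__":
    # Uses the documented default PDF URL and all default options.
    payload = build_input(["https://pdfobject.com/pdf/sample.pdf"])
    print(json.dumps(payload, indent=2))
```

Validating enum values before sending a request catches typos such as an unsupported model name locally rather than after a billed event.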