Image to Text OCR
Pricing
Pay per usage
Deprecated
Extract machine readable textual data from image documents
Developer: Valeh Farzaliyev
Maintained by Community
Last modified: 4 years ago
Actor - Image to Text
The actor takes an input image in a specified format (base64 or URL) and uses the requested Optical Character Recognition (OCR) model (PaddleOCR or Tesseract) to extract text in the required language (see each OCR model's documentation for available languages). The result is saved to the Key-Value store in one of the output formats (pdf, txt, or bbox).
INPUT
The input of this actor should be a JSON object with the following fields:
| Field | Type | Description | Allowed values |
|---|---|---|---|
| input_type | String | Input image format | base64 or url |
| input_image | String | Image, as a base64-encoded string or a URL (matching input_type) | Any valid string value |
| language | String | Text language | See the OCR model's documentation (e.g. en for PaddleOCR, eng for Tesseract) |
| ocr | String | Specific OCR model | paddle or tesseract |
| output_format | String | Desired output format | bbox/pdf for PaddleOCR or txt/pdf for Tesseract |
Sample Input
{"input_type": "url", "input_image": "https://images4.programmersought.com/934/e8/e89758ae0ed991f1c8aba947addec9e6.png", "lang": "eng", "ocr": "tesseract", "output_format": "txt"}
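The field table can also be checked in code before a run is started. The following is a minimal sketch, not the actor's own validation: `build_input`, `ALLOWED_FORMATS`, and `LANGUAGE_KEY` are hypothetical names introduced here. It assumes the base64 variant is plain standard base64 with no data-URI prefix, and it follows the table in naming the language key `language` even though the sample input uses `lang`.

```python
import base64

# Allowed output formats per OCR model, taken from the field table.
ALLOWED_FORMATS = {
    "paddle": {"bbox", "pdf"},
    "tesseract": {"txt", "pdf"},
}

# Assumption: the field table names this key "language", while the sample
# input uses "lang"; change this constant if the actor expects the latter.
LANGUAGE_KEY = "language"


def build_input(image, input_type, language, ocr, output_format):
    """Return an input dict after checking it against the field table.

    `image` is a URL string when input_type is "url", or raw image bytes
    when input_type is "base64" (the bytes are encoded here).
    """
    if input_type not in ("base64", "url"):
        raise ValueError(f"input_type must be 'base64' or 'url', got {input_type!r}")
    if ocr not in ALLOWED_FORMATS:
        raise ValueError(f"ocr must be one of {sorted(ALLOWED_FORMATS)}, got {ocr!r}")
    if output_format not in ALLOWED_FORMATS[ocr]:
        raise ValueError(
            f"output_format {output_format!r} is not available for {ocr!r}; "
            f"use one of {sorted(ALLOWED_FORMATS[ocr])}"
        )
    if input_type == "base64":
        if not isinstance(image, (bytes, bytearray)):
            raise TypeError("for input_type 'base64', pass the raw image bytes")
        # Assumption: plain standard base64, no "data:image/...;base64," prefix.
        image = base64.b64encode(bytes(image)).decode("ascii")
    return {
        "input_type": input_type,
        "input_image": image,
        LANGUAGE_KEY: language,
        "ocr": ocr,
        "output_format": output_format,
    }
```

Calling `build_input("https://example.com/page.png", "url", "eng", "tesseract", "txt")` returns a dict that can be serialized with `json.dumps`, while asking for `txt` output from `paddle` raises a `ValueError` before any run is started.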
OUTPUT
Once the actor finishes, it outputs the extracted text in the specified format:
- bbox : list of bounding boxes and the text inside each
- pdf : Base64-encoded PDF file
- txt : Plain text string
Sample Output
{"response": "Sample PDF Document\n\nRobert Maron\nGrzegorz. Grudziriski\n\nFebruary 20, 1999\n\f", "error": null}
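A result record shaped like the sample output can be interpreted according to the requested format. This is a sketch under assumptions, not the actor's documented client code: `handle_output` is a hypothetical helper, the `response` and `error` keys are taken from the sample, a non-null `error` is assumed to signal failure, and the bbox structure is not documented here, so it is returned unchanged.

```python
import base64


def handle_output(record, output_format):
    """Interpret a result record with "response" and "error" keys."""
    # Assumption: a non-null "error" means the OCR run failed.
    if record.get("error") is not None:
        raise RuntimeError(f"OCR failed: {record['error']}")
    response = record["response"]
    if output_format == "txt":
        # Tesseract separates pages with a form feed (\x0c) by default;
        # drop it along with surrounding whitespace.
        return response.replace("\x0c", "").strip()
    if output_format == "pdf":
        # The PDF arrives base64-encoded; decode it to raw bytes.
        return base64.b64decode(response)
    if output_format == "bbox":
        # Structure is not documented, so pass it through untouched.
        return response
    raise ValueError(f"unknown output_format: {output_format!r}")
```

For the `pdf` format the returned bytes can be written straight to disk, for example `open("result.pdf", "wb").write(handle_output(record, "pdf"))`.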