Image To Json Extractor
3 days trial then $5.00/month - No credit card required now

AI-Powered Image to JSON Data Extractor. Utilize cutting-edge AI to transform image content into structured JSON data effortlessly. Perfect for automating data extraction from visual content and streamlining workflows.


The "Image To Json Extractor" is an AI-powered Apify actor designed to automate the extraction of data from images and convert it into a structured JSON format. Leveraging advanced AI algorithms, this actor can intelligently analyze images, recognize text and text structures (e.g. tables), and transform this content into customizable JSON output. Developed to streamline data processing tasks, it eliminates manual data entry and enhances data accuracy and efficiency.

Use Cases

This actor is incredibly versatile and can be used across various scenarios, including but not limited to:

  • Document Automation: Automatically extract text from scanned documents, invoices, or receipts for easy data management and analysis.
  • Content Management: Extract and structure data from images for content management systems and media platforms, improving SEO and content discoverability.
  • E-commerce & Retail: Convert product page images into detailed JSON data for inventory management, product descriptions, and online catalogues.
  • Research and Development: Facilitate data collection and analysis from scientific images, charts, and graphs for research purposes.
  • Making Content Accessible: Help people who use screen readers by turning text in images into a format they can listen to.
  • Web Content Extraction: Efficiently extract text from images across web apps, websites, social media, ads, and banners. Ideal for content analysis, monitoring, and archiving from various online sources.
  • Standardized Data Gathering: Streamline data extraction from documents of similar types but different designs and formats. Ensures consistent data output for forms, reports, and more, facilitating easier integration and analysis.


The actor accepts the following inputs, allowing for flexible and tailored data extraction:

  • Image Source Type: Specify the type of source shown in the image (e.g., invoice, receipt, or website screenshot) to tailor the extraction process.
  • Source Text Language: The ISO 639-3 language code of the source text (e.g., ENG), used for accurate text recognition.
  • Extraction Data Schema: Defines the schema for the data you wish to extract. Use our web tool for schema creation: Schema Generator.
  • Image URL: The publicly accessible URL of the source image to be processed.
  • OpenAI Service API Key: Your API key for accessing OpenAI's services.
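A minimal sketch of a pre-flight check for these inputs, written in Python. The key names are taken from the example input in this README; treating every field as required, and the helper name `validate_input`, are assumptions made for illustration rather than documented behavior of the actor.

```python
import re
from urllib.parse import urlparse

# Key names as they appear in the README's example input.
# Assumption: all five are required; the actor may be more lenient.
REQUIRED_KEYS = (
    "SourceType",
    "SourceLanguage",
    "DataStructures",
    "SourceFileUrl",
    "OpenaiApiKey",
)


def validate_input(actor_input):
    """Return a list of problems; an empty list means the input looks plausible."""
    problems = []
    for key in REQUIRED_KEYS:
        if not actor_input.get(key):
            problems.append(f"missing or empty: {key}")

    # ISO 639-3 codes are exactly three letters (the README example uses "ENG").
    language = actor_input.get("SourceLanguage", "")
    if language and not re.fullmatch(r"[A-Za-z]{3}", language):
        problems.append("SourceLanguage should be a three-letter ISO 639-3 code")

    # The image must be reachable by URL, so expect an http(s) address with a host.
    image_url = actor_input.get("SourceFileUrl", "")
    if image_url:
        parsed = urlparse(image_url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("SourceFileUrl should be an http(s) URL")
    return problems
```

This only checks the shape of the values. It cannot tell whether the URL is actually publicly accessible or whether the API key is valid.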

Below is an example snapshot of the JSON input for the actor:

{
    "SourceType": "Invoice",
    "SourceLanguage": "ENG",
    "DataStructures": [
        {
            "Name": "customer",
            "Description": "Information about the customer",
            "Fields": [
                {
                    "Name": "customer_name",
                    "Description": "Name of the customer"
                },
                {
                    "Name": "customer_address",
                    "Description": "Address of the customer"
                }
            ]
        },
        {
            "Name": "invoice_item",
            "Description": "Details of each item in the invoice",
            "Fields": [
                {
                    "Name": "item_description",
                    "Description": "Description of the item"
                },
                {
                    "Name": "item_quantity",
                    "Description": "Quantity of the item"
                },
                {
                    "Name": "item_price",
                    "Description": "Price of the item"
                }
            ]
        },
        {
            "Name": "invoice_summary",
            "Description": "Summary of the invoice",
            "Fields": [
                {
                    "Name": "total_amount",
                    "Description": "Total amount of the invoice"
                },
                {
                    "Name": "due_date",
                    "Description": "Due date of the invoice"
                }
            ]
        }
    ],
    "SourceFileUrl": "https://*********/invoice-example.png",
    "OpenaiApiKey": "************"
}
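For larger schemas, writing this JSON by hand gets repetitive. The following Python sketch builds the same structure from compact field mappings. The helper names `make_structure` and `build_input`, the `example.com` URL, and the API key placeholder are hypothetical; only the key names come from the example above.

```python
import json


def make_structure(name, description, fields):
    """Build one DataStructures entry; `fields` maps field name -> description."""
    return {
        "Name": name,
        "Description": description,
        "Fields": [{"Name": n, "Description": d} for n, d in fields.items()],
    }


def build_input(source_type, language, structures, image_url, openai_api_key):
    """Assemble the actor input using the key names from the README example."""
    return {
        "SourceType": source_type,
        "SourceLanguage": language,
        "DataStructures": structures,
        "SourceFileUrl": image_url,
        "OpenaiApiKey": openai_api_key,
    }


actor_input = build_input(
    "Invoice",
    "ENG",
    [
        make_structure("customer", "Information about the customer", {
            "customer_name": "Name of the customer",
            "customer_address": "Address of the customer",
        }),
        make_structure("invoice_summary", "Summary of the invoice", {
            "total_amount": "Total amount of the invoice",
            "due_date": "Due date of the invoice",
        }),
    ],
    "https://example.com/invoice-example.png",  # placeholder URL
    "YOUR_OPENAI_API_KEY",                      # placeholder, not a real key
)

print(json.dumps(actor_input, indent=4))
```

The printed JSON can be pasted into the actor's input editor or sent through whichever client you use to start runs.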


Below is an example snapshot of the JSON output produced by the actor in response to the input example above:

{
  "customer": {
    "customer_name": "Bob Jones",
    "customer_address": "1901 W Madison Street Chicago, IL 60612"
  },
  "invoice_item": [
    {
      "item_description": "Standard lawn care and maintenance. Inspection, mow, and edge. Weekly service.",
      "item_quantity": "1",
      "item_price": "$70.00"
    },
    {
      "item_description": "Add trim, weed removal, fertilizer (as needed), and inspection.",
      "item_quantity": "1",
      "item_price": "$30.00"
    },
    {
      "item_description": "Trimming of hedges on front of property.",
      "item_quantity": "1",
      "item_price": "$25.00"
    }
  ],
  "invoice_summary": {
    "total_amount": "$131.25",
    "due_date": "Jan 27, 2022"
  }
}

* Note how the output structure is controlled by the DataStructures input property.
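One way to make that relationship concrete is to check an output object against the schema that produced it. The Python sketch below assumes, based only on the example above, that each structure name becomes a top-level key whose value is either a single object or a list of objects keyed by field name. The function name `check_output` is hypothetical, and the Description entries are omitted from the sample schema for brevity.

```python
def check_output(data_structures, output):
    """Compare an output object against the DataStructures schema that produced it."""
    expected = {s["Name"]: {f["Name"] for f in s["Fields"]} for s in data_structures}
    problems = []
    for name, fields in expected.items():
        if name not in output:
            problems.append(f"missing structure: {name}")
            continue
        value = output[name]
        # A structure may come back as one object or as a list of objects.
        records = value if isinstance(value, list) else [value]
        for index, record in enumerate(records):
            if not isinstance(record, dict):
                problems.append(f"{name}[{index}] is not an object")
                continue
            for field in sorted(fields - set(record)):
                problems.append(f"{name}[{index}] lacks field: {field}")
            for field in sorted(set(record) - fields):
                problems.append(f"{name}[{index}] has unexpected field: {field}")
    for name in sorted(set(output) - set(expected)):
        problems.append(f"unexpected structure: {name}")
    return problems


schema = [
    {"Name": "customer",
     "Fields": [{"Name": "customer_name"}, {"Name": "customer_address"}]},
    {"Name": "invoice_item",
     "Fields": [{"Name": "item_description"}, {"Name": "item_quantity"},
                {"Name": "item_price"}]},
]
output = {
    "customer": {
        "customer_name": "Bob Jones",
        "customer_address": "1901 W Madison Street Chicago, IL 60612",
    },
    "invoice_item": [
        {"item_description": "Trimming of hedges on front of property.",
         "item_quantity": "1", "item_price": "$25.00"},
    ],
}

print(check_output(schema, output))  # an empty list means the shapes agree
```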


While the model used by this actor can be used in many situations, it is important to understand its limitations. Here are some of the limitations we are aware of:

  • Non-English: The model may not perform optimally when handling images with text of non-Latin alphabets, such as Japanese or Korean.
  • Small text: Enlarge text within the image to improve readability, but avoid cropping important details.
  • Rotation: The model may misinterpret rotated / upside-down text or images.
  • Visual elements: The model may struggle to understand graphs or text where colors or styles like solid, dashed, or dotted lines vary.
  • Spatial reasoning: The model struggles with tasks requiring precise spatial localization, such as identifying chess positions.
  • Accuracy: The model may generate incorrect descriptions or captions in certain scenarios.
  • Image shape: The model struggles with panoramic and fisheye images.
  • Metadata and resizing: The model doesn't process original file names or metadata, and images are resized before analysis, affecting their original dimensions.
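Given the accuracy limitation above, it can help to cross-check extracted values before trusting them. The Python sketch below works on the example output from this README: it parses the amounts, sums the line items, and reports the gap to the stated total. Several things here are assumptions: that prices are US-style strings such as "$70.00", that item_price is a unit price to be multiplied by item_quantity, and that dates look like "Jan 27, 2022" (parsed under an English locale). The helper names are hypothetical.

```python
from datetime import datetime
from decimal import Decimal


def parse_money(text):
    """Convert a string such as '$1,234.50' to a Decimal (US-style format assumed)."""
    return Decimal(text.replace("$", "").replace(",", "").strip())


def line_item_sum(output):
    """Sum price x quantity over all invoice items (unit-price assumption)."""
    return sum(
        (parse_money(item["item_price"]) * Decimal(item["item_quantity"])
         for item in output["invoice_item"]),
        Decimal("0"),
    )


extracted = {
    "invoice_item": [
        {"item_quantity": "1", "item_price": "$70.00"},
        {"item_quantity": "1", "item_price": "$30.00"},
        {"item_quantity": "1", "item_price": "$25.00"},
    ],
    "invoice_summary": {"total_amount": "$131.25", "due_date": "Jan 27, 2022"},
}

subtotal = line_item_sum(extracted)
total = parse_money(extracted["invoice_summary"]["total_amount"])
due = datetime.strptime(extracted["invoice_summary"]["due_date"], "%b %d, %Y").date()

print(subtotal)          # 125.00
print(total - subtotal)  # 6.25
print(due.isoformat())   # 2022-01-27
```

A non-zero gap is not necessarily an extraction error. In this example the schema has no field for taxes or fees, so the 6.25 difference may simply be a charge the schema does not capture. Treat such gaps as a prompt for manual review rather than as a failure.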

For real-time examples and more detailed outputs, please refer to the Public run ID in the actor's Publication tab.


The "Image To Json Extractor" actor is built with precision and intelligence, ensuring high-quality data extraction. For further guidance on how to use this actor and to explore its full capabilities, check out the following resources:

For any questions or assistance, feel free to reach out to our support team.

Maintained by Community
Created in Feb 2024