GPT Scraper

Pricing

$9.00 / 1,000 pages

Developed by

Jakub Drobník

Maintained by Apify

Extract data from any website and feed it into GPT via the OpenAI API. Use ChatGPT to proofread content, analyze sentiment, summarize reviews, extract contact details, and much more.

4.4 (7)

103

Total users

5.9K

Monthly users

150

Runs succeeded

>99%

Last modified

5 months ago

JSON schema is ignored and invalid output is generated

Closed

maroon_herb opened this issue
2 years ago

I have a JSON schema with:

"arrayName": { "type": "array", "minItems": 1, "items": { "type": "object", "properties": {

But it still generates:

"arrayName": []

I have also instructed GPT: "If it isn't possible to generate at least one record in arrayName, answer with 'skip this page'."

drobnikj

Hey Markus,

The reason is that if you require formatted output, GPT cannot answer with plain text, so the "skip this page" instruction will not work here.

You need to post-process the output and filter out the items with empty arrays.

Have a nice day!
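
The post-processing suggested above could look roughly like the following. This is only a sketch: it assumes the parsed answers are available as a list of dictionaries, and the function name and sample data are made up for illustration. Only the key arrayName comes from the question.

```python
import json


def drop_empty_arrays(items, key="arrayName"):
    """Keep only items whose `key` holds a non-empty list."""
    return [
        item for item in items
        if isinstance(item.get(key), list) and len(item[key]) > 0
    ]


# Hypothetical sample of parsed answers; the real item layout may differ.
raw = '[{"arrayName": []}, {"arrayName": [{"name": "a"}]}, {"other": 1}]'
items = json.loads(raw)

kept = drop_empty_arrays(items)
print(kept)
```

Items where the key is missing or is not a list are dropped as well, which seemed the safer default for this case.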

maroon_herb

2 years ago

Hi Jakub,

That explains the "skip this page" part, but still not why I get an empty array when that isn't allowed in the schema. "minItems": 1 means that an empty array isn't valid.

drobnikj

You are right. Currently we do not validate the output against the schema; we only check that it is valid JSON. The schema is mainly meant for defining the format of the output.

We can add validation, but we will provide this option in the pay-as-you-go version of the scraper: https://apify.com/drobnikj/extended-gpt-scraper.
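
Since the scraper only checks that the answer is JSON, the minItems rule can be checked on the caller's side in the meantime. The sketch below is an assumption-laden illustration, not the scraper's behavior: the helper name is hypothetical, it looks only at top-level array properties, and it checks nothing but minItems. The schema is a completed stand-in for the truncated fragment in the question.

```python
def min_items_violations(schema, answer):
    """Return names of top-level array properties that break `minItems`."""
    violations = []
    for name, spec in schema.get("properties", {}).items():
        # Only array properties that are present in the answer are checked.
        if spec.get("type") != "array" or name not in answer:
            continue
        value = answer[name]
        if not isinstance(value, list) or len(value) < spec.get("minItems", 0):
            violations.append(name)
    return violations


# Stand-in schema; the original fragment in the question is truncated.
schema = {
    "type": "object",
    "properties": {
        "arrayName": {"type": "array", "minItems": 1, "items": {"type": "object"}},
    },
}

print(min_items_violations(schema, {"arrayName": []}))    # ['arrayName']
print(min_items_violations(schema, {"arrayName": [{}]}))  # []
```

For anything beyond this one rule, a full JSON Schema validator is likely the better choice.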

maroon_herb

2 years ago

That sounds like a good solution. In the meantime, please add a note that the schema isn't used for validation, since the description "If true, the answer will be transformed into a structured format based on the schema in the jsonAnswer attribute." suggests that the output will adhere to the schema. :)