Extended GPT Scraper

drobnikj/extended-gpt-scraper
Extract data from any website and feed it into GPT via the OpenAI API. Use ChatGPT to proofread content, analyze sentiment, summarize reviews, extract contact details, and much more.

chatGPT API not completing response

Closed

thegenie opened this issue 5 months ago

After a few weeks of trial and error, I think the problem I'm seeing is that the API is not completing its full response. I believe it's getting cut short or truncated, or something is stopping it from running to the end. I see this happening in the API playground.

  • I'm under the token limits, although I can't see a max token option.
  • My API account has sufficient balance.
  • GPT 4 Turbo has a large enough context window to ingest the markdown tokens.
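
One way to tell real truncation apart from the model simply leaving fields out is to look at `finish_reason` in the raw Chat Completions response: OpenAI sets it to `"length"` when the output was cut off by a token limit. The sketch below is only that, a sketch. It assumes you can get at the raw response as a plain dict (the Actor may not expose it), and the helper names `is_truncated` and `parse_json_output` are made up for illustration.

```python
import json


def is_truncated(response: dict) -> bool:
    """Return True if any choice stopped because it ran into a token limit."""
    return any(
        choice.get("finish_reason") == "length"
        for choice in response.get("choices", [])
    )


def parse_json_output(response: dict):
    """Parse the first choice's message content as JSON.

    Returns None when the response was truncated or the content is not
    valid JSON (hypothetical helper, assumes the Chat Completions shape).
    """
    if is_truncated(response):
        return None
    content = response["choices"][0]["message"]["content"]
    try:
        return json.loads(content)
    except (json.JSONDecodeError, TypeError):
        # TypeError covers content being None, e.g. for function-call replies.
        return None
```

If `finish_reason` is `"stop"` and fields are still missing, the output was not truncated and the model just chose to leave them out.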

The issue is inconsistent: I usually get partial responses in the JSON output. Also, "instructions" and "json schema description" simply don't work (I've spent days figuring this out). I now just put "follow schema description" there to get better results. The description needs to be embedded in the JSON object itself, like this:

"ecoFriendliness": { "type": "boolean", "description": "Set to true if the product is marketed as eco-friendly or made from sustainable/recycled materials. Omit the property if not applicable. Only include in JSON if \"true\"." }
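
A per-property description like this is easy to break with a stray quote when typed by hand. As a minimal sketch, the schema can be built as a Python dict and serialized with `json.dumps`, which escapes the inner quotes for you. The surrounding `"type": "object"` wrapper is an assumption about how the full schema is laid out; only the `ecoFriendliness` property comes from the run above.

```python
import json

# Per-property descriptions live inside the schema itself.
schema = {
    "type": "object",  # assumed wrapper, not taken from the original run
    "properties": {
        "ecoFriendliness": {
            "type": "boolean",
            "description": (
                "Set to true if the product is marketed as eco-friendly or "
                "made from sustainable/recycled materials. Omit the property "
                'if not applicable. Only include in JSON if "true".'
            ),
        },
    },
}

# json.dumps escapes the inner quotes, so the result is valid JSON
# that can be pasted into the schema input field.
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```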

Anyway, two runs here for the same URL output different fields. There should be a lot more fields filled out, based on what I'm asking it to do and what I know it's capable of doing.

Run IDs: CURqEqs4Y7P5Ns25Z, RkbutxJH4YfsKbQxE

lukas.prusa

Hi, thanks a lot for reporting this! Essentially, this is a problem on OpenAI's part.

LLMs are dumb and work quite unreliably. The schema function calling that OpenAI provides is just some fancy prompt engineering on their part; it's nothing smart or genius. See this thread, where people complain that even fields set as required in the schema get randomly ignored:

Here is the documentation for OpenAI's functions; it speaks for itself:

The latest models (gpt-4o, gpt-4-turbo, and gpt-3.5-turbo) have been trained to both detect when a function should to be called (depending on the input) and to respond with JSON that adheres to the function signature more closely than previous models. With this capability also comes potential risks. We strongly recommend building in user confirmation flows before taking actions that impact the world on behalf of users (sending an email, posting something online, making a purchase, etc).

TLDR: there is no easy fix for this, you will just have to write a better prompt…

Generally, I would recommend not using “optional” or omittable properties. If you are using a boolean, just stick to it and output both false and true; this makes it simpler for the stupid AI to understand. The same goes for the array of ages you are using: let it output all of them, and keep even the ones that got 0. You can always filter these values out in your own data processing step rather than relying on some faulty statistical model. I would also recommend setting the temperature to 0 for more consistent results.
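
The filtering step can be as small as the sketch below. It assumes the model always emits every boolean and every age entry, and then drops the `false` flags and zero counts afterwards. Apart from `ecoFriendliness`, the field names (`isVegan`, `ageGroups`, `range`, `count`) and the sample record are hypothetical, since the real schema is not shown in this thread.

```python
def drop_defaults(record: dict) -> dict:
    """Remove False flags and zero-count list entries from one scraped record."""
    cleaned = {}
    for key, value in record.items():
        if value is False:
            continue  # keep only the boolean flags that are true
        if isinstance(value, list):
            # Drop list entries whose (hypothetical) "count" field is 0.
            value = [
                item for item in value
                if not (isinstance(item, dict) and item.get("count") == 0)
            ]
        cleaned[key] = value
    return cleaned


# Hypothetical model output where nothing was omitted.
raw = {
    "name": "Bamboo toothbrush",
    "ecoFriendliness": True,
    "isVegan": False,
    "ageGroups": [
        {"range": "0-3", "count": 0},
        {"range": "4-8", "count": 2},
    ],
}

print(drop_defaults(raw))
```

This way the model only has to fill in a fixed shape, and the decision about what counts as "not applicable" stays in your own code.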

I hope this helps, good luck, thanks and happy scraping!

Developer
Maintained by Apify