Extended GPT Scraper
Extract data from any website and feed it into GPT via the OpenAI API. Use ChatGPT to proofread content, analyze sentiment, summarize reviews, extract contact details, and much more.
After a few weeks of trial and error, I think the problem I'm seeing is that the API is not completing its full response. I believe the response is being cut short or truncated, or something is stopping it from finishing. I see the same thing happening in the API playground.
- I'm under the token limits, although I can't see a max token option.
- My API account has sufficient balance.
- GPT-4 Turbo has a large enough context window to ingest the markdown tokens.
The issue is inconsistent, and usually partial, JSON output. Also, "instructions" and "json schema description" simply don't work (I've spent days figuring this out). I just put "follow schema description" to get better results. The description needs to be embedded in the JSON object itself, like this:
"ecoFriendliness": { "type": "boolean", "description": "Set to true if the product is marketed as eco-friendly or made from sustainable/recycled materials. Omit the property if not applicable. Only include in JSON if \"true\"." }
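As a rough illustration of the point above, the sketch below builds a schema in which the guidance lives in each property's own `description`, and checks that no property was left without one. The helper name `properties_missing_description` is hypothetical and not part of the Actor or of any OpenAI library; the schema itself only reuses the `ecoFriendliness` property from the post.

```python
import json

# Schema with the guidance attached directly to the property,
# mirroring the "ecoFriendliness" example from the post.
schema = {
    "type": "object",
    "properties": {
        "ecoFriendliness": {
            "type": "boolean",
            "description": (
                "Set to true if the product is marketed as eco-friendly or "
                "made from sustainable/recycled materials."
            ),
        },
    },
}


def properties_missing_description(schema: dict) -> list[str]:
    """Return the names of properties with no non-empty description."""
    return [
        name
        for name, spec in schema.get("properties", {}).items()
        if not str(spec.get("description", "")).strip()
    ]


if __name__ == "__main__":
    # Serialized form, ready to paste into a JSON schema input.
    print(json.dumps(schema, indent=2))
    print(properties_missing_description(schema))
```

Running a check like this before a scrape is only a convenience: it catches properties that would otherwise reach the model with no guidance at all.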
Anyway, two runs on the same URL produce output with different fields. There should be a lot more fields filled in based on what I'm asking it to do, which I know it's capable of doing.
Run IDs: CURqEqs4Y7P5Ns25Z, RkbutxJH4YfsKbQxE
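One way to tell a truncated response from a merely incomplete one is to look at the `finish_reason` that the OpenAI Chat Completions API returns with each choice: the value `"length"` means generation stopped at a token limit. The sketch below assumes you can get at that value and at the raw JSON text of the response; whether the Actor exposes either is not stated in this thread, and the function name `diagnose` is hypothetical.

```python
import json


def diagnose(finish_reason: str, payload: str) -> str:
    """Classify a model response as 'truncated', 'invalid_json' or 'ok'.

    finish_reason: the value reported by the API for the choice.
    payload: the raw JSON text the model produced.
    """
    # "length" means the model hit a token limit before finishing.
    if finish_reason == "length":
        return "truncated"
    try:
        json.loads(payload)
    except json.JSONDecodeError:
        # The model stopped on its own but the JSON is still malformed.
        return "invalid_json"
    return "ok"


if __name__ == "__main__":
    print(diagnose("length", '{"ecoFriendliness": tr'))
    print(diagnose("stop", '{"ecoFriendliness": true}'))
```

If the result is `ok` but fields are still missing, the response was not cut short; the model simply left those fields out.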
Hi, thanks a lot for reporting this! Essentially, this is a problem on OpenAI's part.
LLM AIs are dumb and work quite unreliably. The schema-based function calling that OpenAI provides is just some fancy prompt engineering on their part; it's nothing smart or genius. See this thread where people complain that even fields set as required in the schema get randomly ignored:
Here is the documentation for OpenAI's functions, it speaks for itself:
The latest models (gpt-4o, gpt-4-turbo, and gpt-3.5-turbo) have been trained to both detect when a function should to be called (depending on the input) and to respond with JSON that adheres to the function signature more closely than previous models. With this capability also comes potential risks. We strongly recommend building in user confirmation flows before taking actions that impact the world on behalf of users (sending an email, posting something online, making a purchase, etc).
TLDR: there is no easy fix for this, you will just have to write a better prompt…
Generally, I would recommend not using “optional” and omittable properties. If you are using a boolean, then just stick to it and use both false and true; this will make it simpler for the stupid AI to understand. Same for the array of ages you are using, just let it output all of them and keep eve... [trimmed]
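A minimal sketch of that recommendation, assuming a plain JSON Schema: the boolean is listed under `required` so the model is asked for an explicit `true` or `false`, and a small check reports any required field the output still lacks. The helper `missing_required` is hypothetical, and since required fields can still be ignored, as noted above, the check is a safety net rather than a guarantee.

```python
# The boolean is always expected; the model answers true or false
# instead of deciding whether to include the property at all.
schema = {
    "type": "object",
    "properties": {
        "ecoFriendliness": {
            "type": "boolean",
            "description": (
                "true if the product is marketed as eco-friendly or made "
                "from sustainable/recycled materials, otherwise false."
            ),
        },
    },
    "required": ["ecoFriendliness"],
}


def missing_required(schema: dict, output: dict) -> list[str]:
    """Return required property names that are absent from the output."""
    return [key for key in schema.get("required", []) if key not in output]


if __name__ == "__main__":
    print(missing_required(schema, {"ecoFriendliness": False}))
    print(missing_required(schema, {}))
```

A non-empty result can be used to flag a run for a retry or a manual review.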
Actor Metrics
79 monthly users
46 stars
>99% runs succeeded
5.8 days response time
Created in Jun 2023
Modified 6 days ago