GPT Scraper
Pay $9.00 for 1,000 pages
Extract data from any website and feed it into GPT via the OpenAI API. Use ChatGPT to proofread content, analyze sentiment, summarize reviews, extract contact details, and much more.
Incomplete results
This actor is exactly what I need, but the results I get seem to be incomplete. I have a free account, but I doubt this influences the results. My use case is very simple: I want a list of all casinos mentioned on a website, but it seems to list only a small number, a maximum of 3 in one case and a maximum of 10 in another.
paja
Hi,
thanks for reaching out, we'll look into it and let you know what can be done.
enchanting_wilderness
I hope to hear about it soon @pavlina.
lukas.prusa
Hi Jasper,
the website you are trying to scrape is very large and can't fully fit into the GPT context window, as indicated by the warning log messages "Content was truncated for...".
There are two options to resolve this:
- The easier option is to use the Extended GPT Scraper with your own API key. You can select a model with a larger context window and fit the whole web page into it (e.g. `gpt-4-turbo-preview`).
- Use our page processing filters to decrease the page content being sent to GPT. This is a more advanced way and requires knowledge of CSS and HTML selectors.
  - Selecting the newly added `removeLinkUrls` option and adding `img` to `removeElementsCssSelector` should help you get more results, though probably still not all of them...
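To make the effect of the second option more concrete, here is a rough Python sketch of why trimming the page helps. It is not the actor's actual implementation: the character budget, the sample HTML, and the helper names are all made up for illustration, and the real limit is counted in tokens rather than characters. It only mimics what removing `img` elements and link URLs would do to a page before it is cut to fit the context window.

```python
import re

# Hypothetical character budget standing in for the model's context window.
# The real limit is measured in tokens and depends on the chosen model.
CONTEXT_BUDGET = 200


def truncate(text: str, budget: int = CONTEXT_BUDGET) -> str:
    """Cut the text at the budget, as a stand-in for content truncation."""
    return text[:budget]


def strip_images_and_link_urls(html: str) -> str:
    """Drop <img> elements and href attributes, keeping the visible link text."""
    html = re.sub(r"<img\b[^>]*>", "", html, flags=re.IGNORECASE)
    html = re.sub(r'\s+href="[^"]*"', "", html, flags=re.IGNORECASE)
    return html


def count_complete_items(text: str) -> int:
    """Count list items that survived truncation in one piece."""
    return len(re.findall(r"<li>.*?</li>", text, flags=re.DOTALL))


# Made-up page: ten list entries, each with a logo image and a long link URL.
page = "".join(
    f'<li><img src="https://example.com/logos/{i}.png">'
    f'<a href="https://example.com/reviews/casino-{i}?ref=list">Casino {i}</a></li>'
    for i in range(10)
)

stripped = strip_images_and_link_urls(page)
before = count_complete_items(truncate(page))
after = count_complete_items(truncate(stripped))
print(f"complete items before filtering: {before}, after filtering: {after}")
```

With the same budget, more entries fit once the images and URLs are gone, which matches the "more results, though probably still not all of them" expectation above.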
I hope this helps, happy scraping!
enchanting_wilderness
Thanks Lukáš, this helps a lot!
Actor Metrics
148 monthly users
78 bookmarks
>99% runs succeeded
3.2 days response time
Created in Mar 2023
Modified a month ago