GPT Scraper

Pay $9.00 for 1,000 pages

drobnikj/gpt-scraper

Extract data from any website and feed it into GPT via the OpenAI API. Use ChatGPT to proofread content, analyze sentiment, summarize reviews, extract contact details, and much more.

Incomplete results

Closed

enchanting_wilderness opened this issue
8 months ago

This Actor is exactly what I need, but the results I get seem to be incomplete. I have a free account, but I doubt this influences the result. My use case is very simple: I want a list of all casinos mentioned on a website, but it seems to list only a small number, at most 3 in one case and at most 10 in another.

paja

Hi,

thanks for reaching out, we'll look into it and let you know what can be done.

enchanting_wilderness

8 months ago

I hope to hear about it soon @pavlina.

lukas.prusa

Hi Jasper,

the website you are trying to scrape is very large and can't fully fit into the GPT context window, as indicated by the warn log messages "Content was truncated for...".
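To illustrate why truncation leads to a short list, here is a toy sketch. It assumes a character budget as a stand-in for the model's token limit, and the page content and the `truncate_to_budget` helper are hypothetical, not the Actor's actual code.

```python
import re


def truncate_to_budget(text: str, max_chars: int) -> str:
    """Keep only the first max_chars characters (stand-in for a token limit)."""
    return text[:max_chars]


# Hypothetical page: 50 casino names, each followed by 90 characters of filler.
page = "\n".join(f"Casino {i:02d} " + "x" * 90 for i in range(50))

# Only the part of the page that fits the budget is ever seen by the model.
visible = truncate_to_budget(page, max_chars=1000)

total_names = re.findall(r"Casino \d\d", page)
visible_names = re.findall(r"Casino \d\d", visible)

print(f"{len(visible_names)} of {len(total_names)} names survive truncation")
# prints "10 of 50 names survive truncation"
```

Anything past the cut-off never reaches the model, so it cannot appear in the results no matter how the instructions are worded.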

There are two options to resolve this:

  1. the easier option is to use the Extended GPT Scraper with your own API key. You can select a model with a larger context window and fit the whole web page into it (e.g. gpt-4-turbo-preview).
  2. use our page processing filters to decrease the page content being sent to GPT. This is a more advanced way and requires knowledge of CSS and HTML selectors.
    • Selecting the newly added removeLinkUrls option and adding img to removeElementsCssSelector should help you get more results, though probably still not all of them...
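As a rough sketch of the second option, the Actor input could be assembled like this. Only `removeLinkUrls` and `removeElementsCssSelector` are named in this thread; the `startUrls` and `instructions` field names, the example URL, and the existing selector value are assumptions, so check the Actor's input schema before relying on them.

```python
import json


def add_selector(existing: str, extra: str) -> str:
    """Append a CSS selector to a comma-separated selector list, skipping duplicates."""
    parts = [p.strip() for p in existing.split(",") if p.strip()]
    if extra not in parts:
        parts.append(extra)
    return ", ".join(parts)


# Hypothetical current value of the selector field; the real default may differ.
current_selector = "script, style"

run_input = {
    "startUrls": [{"url": "https://example.com/casinos"}],     # assumed field name and URL
    "instructions": "List every casino named on this page.",   # assumed field name
    "removeLinkUrls": True,                                     # option named in the reply above
    "removeElementsCssSelector": add_selector(current_selector, "img"),
}

print(json.dumps(run_input, indent=2))
```

Dropping link URLs and image elements reduces the text sent to GPT, which leaves more room for the actual list before the content is truncated.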

I hope this helps, happy scraping!

enchanting_wilderness

8 months ago

Thanks Lukáš, this helps a lot!

Developer
Maintained by Apify

Actor Metrics

  • 155 monthly users
  • 64 stars
  • >99% runs succeeded
  • 2 days response time
  • Created in Mar 2023
  • Modified 4 days ago