GPT Scraper

Pay $9.00 for 1,000 pages

drobnikj/gpt-scraper
Extract data from any website and feed it into GPT via the OpenAI API. Use ChatGPT to proofread content, analyze sentiment, summarize reviews, extract contact details, and much more.

Feature Request: Include the ratio between generated and sent content in the dataset

Open

Pepa J (JoseJet) opened this issue
9 days ago

For websites with a lot of content, sentContent doesn't include all the information from the page.

Would it be possible to introduce some indicator that not all of the page's content was used?

Possibly a ratio between 0 and 1, e.g. 0.64 = 64% of the generated markdown was sent/used for the prompt.

This would help us investigate cases where the content was cut off in the middle and the correct results therefore didn't make it into the dataset.
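To make the request concrete, here is a minimal sketch of how such a ratio could be computed. It is only an illustration, not how the Actor works: the function name `sent_content_ratio` is hypothetical, and it measures length in characters, whereas the real truncation may well be token-based.

```python
def sent_content_ratio(generated_markdown: str, sent_content: str) -> float:
    """Return the fraction of the generated markdown that was sent, in [0, 1].

    Assumption: length is measured in characters. A token-based count
    may match the actual truncation more closely.
    """
    if not generated_markdown:
        # Nothing was generated, so nothing could have been cut off.
        return 1.0
    ratio = len(sent_content) / len(generated_markdown)
    # Clamp to 1.0 in case the sent content is longer than the markdown.
    return round(min(ratio, 1.0), 2)


page_markdown = "x" * 100   # stand-in for the full page converted to markdown
sent = page_markdown[:64]   # stand-in for a truncated sentContent
print(sent_content_ratio(page_markdown, sent))  # prints 0.64
```

A value below 1 on a dataset item would flag that the page was cut off before it reached the prompt.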

lukas.prusa

Hi, thanks for the suggestion!

This seems like a reasonable feature to add to the scraper :) We will add it.

I will keep you updated here, thanks!

Developer
Maintained by Apify

Actor Metrics

  • 148 monthly users

  • 69 stars

  • >99% runs succeeded

  • 2.6 days response time

  • Created in Mar 2023

  • Modified 8 days ago