Extended GPT Scraper

drobnikj/extended-gpt-scraper

Extract data from any website and feed it into GPT via the OpenAI API. Use ChatGPT to proofread content, analyze sentiment, summarize reviews, extract contact details, and much more.


4o mini model (for cost concern)

Closed

tak_lai opened this issue
a month ago

Hi, since website scraping involves large numbers of input and output tokens, are there any plans to add the 4o mini model soon? Many thanks.
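To make the cost concern concrete, here is a minimal sketch of how the token bill for one scraped page could be estimated. It is not part of the Actor; the function names, the 4-characters-per-token heuristic, and the prices in the usage lines are all assumptions for illustration, so substitute the current per-million-token prices of whichever model you compare.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token count. chars_per_token is an assumed heuristic,
    not a real tokenizer; actual counts vary by model and language."""
    return max(1, round(len(text) / chars_per_token))


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Cost of one request, given prices per one million tokens."""
    total = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    )
    return total / 1_000_000


# Hypothetical usage: a 40,000-character page and a 500-token answer.
page_text = "x" * 40_000
input_tokens = estimate_tokens(page_text)
# The prices below are placeholders, not real model prices.
cost = estimate_cost(input_tokens, 500, 1.0, 2.0)
print(f"{input_tokens} input tokens, estimated cost {cost}")
```

Because scraped pages are dominated by input tokens, the input price usually matters most when comparing a smaller model against a larger one.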

lukas.prusa

Hi Tak, thanks for opening this issue and for your suggestion!

Yes, we will 100% add the 4o-mini model shortly :) We will also most likely add it to our Pay Per Result version of this Actor.

I will keep you updated here, thanks!

lukas.prusa

Hi again, thanks for your patience!

We've had some other issues with the scraper, so we kept it in beta for a bit too long. The 4o-mini model is now accessible on the latest (default) version :) We've also set it as the default model for the Pay Per Result version of the Actor.

Try it out and let me know how it works, thanks!

tak_lai

24 days ago

Thanks, I will try it. I really appreciate your effort.

tak_lai

23 days ago

Hello, I just spotted another issue, which may be related to Playwright. I tried using both Playwright and Selenium to scrape https://www.newbalance.com/pd/made-in-usa-990v6/U990V6-45189.html, and I am not sure why it fails with Playwright. Since Extended GPT Scraper uses Playwright, it gets the same result (a blank page). Scraping works with Selenium on my local machine but fails with Playwright on the same machine. I just want to share this limitation, as it is not a problem with this Actor. I am truly thankful for your effort on this Actor.

lukas.prusa

Hi, thanks for sharing this insight. This looks like an issue in Playwright, or at least Playwright is getting detected and blocked by the website.

I've tried a few settings, like residential proxies and waiting longer for the dynamic content to load, but I guess it's still getting blocked... This is a bit out of our scope, as the issue is most likely somewhere else, so unfortunately we won't be fixing it. But good luck with the website, and I hope you will be able to scrape it well with Selenium :)
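For anyone hitting the same blank-page result, one inexpensive safeguard is to check whether the fetched HTML contains any visible text before passing it on to GPT. The sketch below is only an illustration using Python's standard library; `visible_text`, `looks_blank`, and the 50-character threshold are hypothetical choices, not something the Actor or Playwright provides.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, ignoring script, style, and noscript content."""

    _SKIPPED = ("script", "style", "noscript")

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped elements.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self.chunks.append(text)


def visible_text(html: str) -> str:
    """Return the human-visible text of an HTML document, space-joined."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def looks_blank(html: str, min_chars: int = 50) -> bool:
    """Heuristic: treat a page as blank if it has very little visible text.
    min_chars is an assumed threshold; tune it for the target site."""
    return len(visible_text(html)) < min_chars
```

A page flagged by `looks_blank` could be retried with different settings or skipped, rather than spending tokens on an empty prompt.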

Developer
Maintained by Apify
Actor metrics
  • 75 monthly users
  • 26 stars
  • 99.2% runs succeeded
  • 7.6 days response time
  • Created in Jun 2023
  • Modified 24 days ago