Perplexity.AI Actor

3 days trial then $30.00/month - No credit card required now

jons/perplexity-actor

Use the Perplexity.ai Scraper to extract information with AI. For example: "Find hotels in Prague that offer free breakfast and have a nightly rate under 1000 CZK." Export the results into a structured dataset.

Fails very often and takes quite a while (2-4 minutes per run).

Closed

demonstrative_nomad opened this issue
3 months ago

2024-08-22T07:36:35.988Z INFO PuppeteerCrawler:Statistics: PuppeteerCrawler request statistics: {"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":null,"requestsFinishedPerMinute":0,"requestsFailedPerMinute":0,"requestTotalDurationMillis":0,"requestsTotal":0,"crawlerRuntimeMillis":60690,"retryHistogram":[]}
2024-08-22T07:36:36.052Z INFO PuppeteerCrawler:AutoscaledPool: state {"currentConcurrency":1,"desiredConcurrency":1,"systemStatus":{"isSystemIdle":true,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":0},"eventLoopInfo":{"isOverloaded":false,"limitRatio":0.6,"actualRatio":0},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":0.173},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":0}}}
2024-08-22T07:36:39.733Z WARN PuppeteerCrawler: Reclaiming failed request back to the list or queue. Timeout Error: waiting for function failed: timeout of 20000ms exceeded.
2024-08-22T07:36:39.735Z {"id":"ZotIN5GPCzaMgM8","url":"https://www.perplexity.ai/search?q=Evalute%20the%20outcome%20of%20the%20following%20statement%3A%20Will%20the%20SCOTUS%20affirmative%20action%20ruling%20result%20in%20a%20universal%20ban%3F%2C%20made%20in%202023-07-10","retryCount":1}
2024-08-22T07:37:33.514Z WARN PuppeteerCrawler: Reclaiming failed request back to the list or queue. Timeout Error: waiting for function failed: timeout of 20000ms exceeded.
2024-08-22T07:37:33.517Z {"id":"ZotIN5GPCzaMgM8","url":"https://www.perplexity.... [trimmed]

demonstrative_nomad

3 months ago

Retrying the same query by running the actor again works, but I'm hoping for a solution that's more reliable and ideally faster too, since the cost incurred by failed runs is not low either.
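The manual workaround described here (re-running the actor when a run fails) could be automated with a small retry wrapper. The following is only a minimal sketch: `run_with_retries` and its parameters are hypothetical names, and `run_once` stands in for whatever code actually starts the actor run and raises an exception on failure. It does not make individual runs cheaper or faster; it only removes the manual step.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_retries(
    run_once: Callable[[], T],
    max_attempts: int = 3,
    base_delay_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call run_once until it succeeds or max_attempts is reached.

    run_once is a placeholder for the code that starts one actor run
    and raises on failure (hypothetical, not part of the actor itself).
    The delay doubles after each failed attempt (exponential backoff).
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return run_once()
        except Exception as err:  # broad on purpose: any failure triggers a retry
            last_error = err
            if attempt < max_attempts:
                sleep(base_delay_s * 2 ** (attempt - 1))
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error
```

Because every attempt is billed, it is probably worth keeping `max_attempts` small.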

Jon (jons)

3 months ago

Thanks for reporting the issue! It seems they changed their site; I have just fixed it. Please try again with the latest version, 0.0.24.

demonstrative_nomad

3 months ago

Thanks, it works better now; runs now take about 1 minute or more. Most of my initial runs work, but toward the end there is some odd behavior where the actor claims multiple results are being returned, yet viewing them shows only 1. There are also a few failures near the end, and I'm not sure why.

Here's an example of one of the fails below:
2024-08-22T18:24:06.523Z ACTOR: Pulling Docker image of build HMr6kFCNLcnClmFPF from repository.
2024-08-22T18:24:08.559Z ACTOR: Creating Docker container.
2024-08-22T18:24:12.155Z ACTOR: Starting Docker container.
2024-08-22T18:24:13.535Z Starting X virtual framebuffer using: Xvfb :99 -ac -screen 0 1920x1080x24+32 -nolisten tcp
2024-08-22T18:24:13.541Z Executing main command
2024-08-22T18:24:14.534Z INFO System info {"apifyVersion":"3.2.4","apifyClientVersion":"2.9.4","crawleeVersion":"3.11.1","osType":"Linux","nodeVersion":"v18.20.4"}
2024-08-22T18:24:18.459Z INFO Configuring Web Scraper.
2024-08-22T18:24:42.187Z WARN
2024-08-22T18:24:42.188Z *****************************************************************
2024-08-22T18:24:42.190Z * Web Scraper is running in DEVELOPMENT MODE! *
2024-08-22T18:24:42.192Z * Concurrency is limited, sessionPool is not available, *
2024-08-22T18:24:42.193Z * timeouts are increased and debugger is enabled. *
2024-08-22T18:24:42.195Z * If you want full control and performance switch *
2024-08-22T18:24:42.196Z * Run type to PRODUCTION! *
202... [trimmed]

Jon (jons)

3 months ago

Hi! Could you please share the inputs that made the tasks fail, so that I can check the issue easily?

Jon (jons)

3 months ago

I just ran some tests and it works fine. Some tips:

  1. Add multiple search terms to one task: this runs faster than running them one by one. Remember to increase the timeout if needed.
  2. Always use a residential proxy.
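The two tips above could be combined when preparing a run input, roughly as in the sketch below. It is an illustration under stated assumptions, not the actor's documented schema: the `searchTerms` field name is hypothetical, the `proxyConfiguration` object follows the shape commonly used by Apify actors but is not confirmed for this one, and the per-term time budget is a guess padded from the roughly one-minute runs reported in this thread.

```python
from typing import Any, Dict, List, Tuple


def build_batched_input(
    search_terms: List[str],
    seconds_per_term: int = 90,   # assumption: ~1 min per query plus padding
    base_timeout_s: int = 60,     # assumption: startup overhead (image pull, browser)
) -> Tuple[Dict[str, Any], int]:
    """Build one run input covering all search terms, plus a scaled timeout."""
    terms = [t.strip() for t in search_terms if t.strip()]
    if not terms:
        raise ValueError("at least one non-empty search term is required")

    run_input: Dict[str, Any] = {
        # Hypothetical field name; check the actor's input schema for the real one.
        "searchTerms": terms,
        # Common Apify proxy input shape; assumed to be accepted by this actor.
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }
    # Tip 1: the timeout must grow with the number of batched terms.
    timeout_s = base_timeout_s + seconds_per_term * len(terms)
    return run_input, timeout_s
```

The returned pair could then be passed to whatever starts the run, for instance as the `run_input` and `timeout_secs` arguments of the `apify-client` Python package's actor `call` method, assuming that client is used.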

demonstrative_nomad

3 months ago

I'll try multiple search terms and see how it goes. Thanks!

Developer
Maintained by Community

Actor Metrics

  • 13 monthly users

  • 8 stars

  • 94% runs succeeded

  • 19 days response time

  • Created in Aug 2024

  • Modified 3 months ago
