Galaxus API Scraper

30 days trial then $30.00/month - No credit card required now

petr_cermak/galaxus-scraper

Unable to perform searches

Open

stuff opened this issue
a year ago

Getting crawler errors when performing requests:

2023-06-04T19:28:21.809Z ACTOR: Pulling Docker image from repository.
2023-06-04T19:28:23.216Z ACTOR: Creating Docker container.
2023-06-04T19:28:23.432Z ACTOR: Starting Docker container.
2023-06-04T19:28:25.815Z INFO System info {"apifyVersion":"3.1.1","apifyClientVersion":"2.6.2","crawleeVersion":"3.1.3","osType":"Linux","nodeVersion":"v16.20.0"}
2023-06-04T19:28:26.482Z INFO CheerioCrawler: Starting the crawl
2023-06-04T19:28:35.364Z ERROR CheerioCrawler: Request failed and reached maximum retries. Error: Resource https://www.galaxus.de/api/graphql/pdp-get-product-details served Content-Type application/octet-stream, but only text/html, text/xml, application/xhtml+xml, application/xml, application/json are allowed. Skipping resource.
2023-06-04T19:28:35.365Z at CheerioCrawler._abortDownloadOfBody (/usr/src/app/node_modules/@crawlee/http/internals/http-crawler.js:538:19)
2023-06-04T19:28:35.365Z at HttpCrawler.postNavigationHooks (/usr/src/app/node_modules/@crawlee/http/internals/http-crawler.js:241:45)
2023-06-04T19:28:35.370Z at CheerioCrawler._executeHooks (/usr/src/app/node_modules/@crawlee/basic/internals/basic-crawler.js:834:23)
2023-06-04T19:28:35.370Z at CheerioCrawler._handleNavigation (/usr/src/app/node_modules/@crawlee/http/internals/http-crawler.js:334:20)
2023-06-04T19:28:35.370Z at processTicksAndRejections (node:internal/process/task_queues:96:5)
2023-06-04T19:28:35.372Z at async CheerioCrawler._runRequestHandler (/usr/src/app/node_modules/@crawlee/http/internals/http-crawler.js:286:13)
2023-06-04T19:28:35.372Z at async wrap (/usr/src/app/node_modules/@apify/timeout/index.js:52:21)
{"id":"tzTMX8ZfSNf4ruf","url":"https://www.galaxus.de/api/graphql/pdp-get-product-details","method":"POST","uniqueKey":"POST(Zb03tGiv):https://www.galaxus.de/api/graphql/pdp-get-product-details"}
2023-06-04T19:28:35.406Z INFO CheerioCrawler: All the requests from request list and/or request queue have been processed, the crawler will shut down.
2023-06-04T19:28:35.637Z INFO CheerioCrawler: Crawl finished. Final request statistics: {"requestsFinished":0,"requestsFailed":1,"retryHistogram":[1],"requestAvgFailedDurationMillis":8849,"requestAvgFinishedDurationMillis":null,"requestsFinishedPerMinute":0,"requestsFailedPerMinute":6,"requestTotalDurationMillis":8849,"requestsTotal":1,"crawlerRuntimeMillis":9399}
2023-06-04T19:28:35.637Z INFO CheerioCrawler: Error analysis: {"totalErrors":1,"uniqueErrors":1,"mostCommonErrors":["1x: Resource https://www.galaxus.de/api/graphql/pdp-get-product-details served Content-Type application/octet-stream, but only text/html, text/xml, application/xhtml+xml, application/xml, application/json are allowed. Skipping resource. (/usr/src/app/node_modules/@crawlee/http/internals/http-crawler.js:538:19)"]}
2023-06-04T19:28:35.638Z INFO Actor finished successfully (exit code 0)
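
For anyone reading the log: the single request failed because the endpoint answered with Content-Type application/octet-stream, which is not in the list of types the crawler accepts. The snippet below is only a rough sketch of that kind of allow-list check, not Crawlee's actual code. The names `DEFAULT_ALLOWED`, `mimeOf` and `isAllowed` are made up for illustration; the default list is copied from the error message. Crawlee's HTTP-based crawlers, including CheerioCrawler, accept an `additionalMimeTypes` option that extends the list, but whether this Actor exposes it through its input is unknown.

```typescript
// MIME types the error message lists as accepted by default.
const DEFAULT_ALLOWED: readonly string[] = [
  "text/html",
  "text/xml",
  "application/xhtml+xml",
  "application/xml",
  "application/json",
];

// Hypothetical helper: reduce a Content-Type header to its bare MIME type,
// dropping parameters such as "; charset=utf-8".
function mimeOf(contentType: string): string {
  const [mime = ""] = contentType.split(";");
  return mime.trim().toLowerCase();
}

// Hypothetical helper: true when the MIME type is in the default list
// or in the caller-supplied extra list.
function isAllowed(contentType: string, extra: readonly string[] = []): boolean {
  const mime = mimeOf(contentType);
  const extras = extra.map((m) => m.toLowerCase());
  return DEFAULT_ALLOWED.includes(mime) || extras.includes(mime);
}

// The response type from the log is rejected by the default list...
console.log(isAllowed("application/octet-stream")); // false
// ...and accepted once it is added as an extra type.
console.log(isAllowed("application/octet-stream", ["application/octet-stream"])); // true
```

If the endpoint really returns JSON under that header, allowing the extra type only gets past this check; the body would presumably still have to be parsed by hand.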

Thanks for the report. However, I cannot seem to reproduce it. Can you share the full Actor input that triggered this issue?

Developer
Maintained by Community
Actor metrics
  • 3 monthly users
  • 100.0% runs succeeded
  • 74 days response time
  • Created in Jan 2023
  • Modified about 2 months ago