Nordstrom Scraper

Pay $5.00 for 1,000 products

trudax/actor-nordstrom-scraper
Nordstrom web scraper that crawls product information, including price and sale price, color, and images. Extracts all data to a dataset that can be exported in multiple formats.

Request failed with status code 400 with correct body request

Closed

bsq_tzam opened this issue
a year ago

I am using the Nordstrom Scraper and I am getting the response "Request failed with status code 400", which seems to be Apify-related.

This error occurs very often, and we were forced to add a stricter retry mechanism to keep our runs from failing, but we want to make sure there is no issue on our end. The request is sent with this body:

"data": "{\"startUrls\":[\"https://shop.nordstrom.com/s/olaplex-no-5-bond-maintenance-conditioner/5056281\",\"https://shop.nordstrom.com/s/shiseido-urban-environment-sun-dual-care-oil-free-mineral-broad-spectrum-sunscreen-spf-42/6739785\",\"https://shop.nordstrom.com/s/silkn-infinity-hair-removal-device/5376009\",\"https://shop.nordstrom.com/s/vapour-soft-focus-foundation/5850728\",\"https://shop.nordstrom.com/s/triple-peptide-cactus-oasis-serum/7166106\",\"https://shop.nordstrom.com/s/biggie-chain-necklace/5787614\",\"https://shop.nordstrom.com/s/gucci-the-alchemists-garden-the-last-day-of-summer-eau-de-parfum/5519933\",\"https://shop.nordstrom.com/s/charlotte-tilbury-airbrush-flawless-foundation/5368721?\",\"https://shop.nordstrom.com/s/acqua-di-parma-blu-mediterraneo-bergamotto-di-calabria-eau-de-toilette-spray/3269001\",\"https://shop.nordstrom.com/s/polished-pebble-leather-crossbody-bag-nordstrom-exclusive/7411875?\",\"ttps://www.nordstrom.com/s/mz-wallace-metro-crossbody-bag/5447687\",\"https://shop.nordstrom.com/s/gh-bass-larson-leather-penny-loafer-men/6541600\",\"https://shop.nordstrom.com/s/mini-weekender-bag/7574158... [trimmed]
trudax

Seems like you are missing the country property, which is used to get shipping information. It is a separate input field, unrelated to the proxy settings, so it needs to be provided on its own. I think in your case you can just use country: "United States".
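For illustration, a minimal sketch of an input that carries `country` next to `startUrls`. The helper name `build_run_input` is hypothetical, the plain-string shape of `startUrls` is copied from the request body quoted above, and any other input fields (such as proxy settings) are left out.

```python
import json


def build_run_input(start_urls, country="United States"):
    """Return an Actor input dict with `country` set alongside `startUrls`.

    Assumption: `startUrls` is a list of plain URL strings, as in the
    request body quoted in the question.
    """
    return {
        "startUrls": list(start_urls),
        "country": country,
    }


run_input = build_run_input([
    "https://shop.nordstrom.com/s/olaplex-no-5-bond-maintenance-conditioner/5056281",
])

# Serialize the input the same way it would be sent as a JSON request body.
print(json.dumps(run_input, indent=2))
```

The resulting dict is what would be passed as the run input, whichever client is used to start the Actor.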

bsq_tzam

a year ago

Gustavo, thank you for your answer. The version I am using is 1.2.24, which, as far as I can see, has the country property as optional.

Also, I would like to know whether there are changelogs for older versions, since I am planning to upgrade to version 2.X of this actor. I could only find changelogs for 2.X here

trudax

That is the changelog. I just upgraded the Apify SDK and made some improvements. I believe you should be able to use it with no problems. I recommend always using the latest version, since that is the version I maintain.

bsq_tzam

a year ago

With the new version, the availability property is always null, in contrast with the older version, where it is in the ApiProduct. Is there an issue with the stock status?

trudax

Checking

trudax

I was able to add apiProduct to the results again, but I don't see availability on the products I have tried. Can you check with the latest version whether the data you want is in the extra.apiProduct property?

bsq_tzam

a year ago

The apiProduct property is not consistently present in the responses; it depends on the links. I could not find a pattern that would help you debug this. Do you have any insights?

trudax

Do you have a run id that resulted in products without apiProduct data?

bsq_tzam

a year ago

Yes, you can check out run ID: QyEKZjqo6THJzKB3d Container URL: https://urhfhdqn75wd.runs.apify.net/

trudax

I have updated the actor, removed the old availability and added a new property called "isAvailable". Please try it out and let me know if that works for you.
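To try this out on a downloaded dataset, a rough sketch like the following may help. It counts the `isAvailable` values and lists items that come back without `extra.apiProduct`. Only the names `isAvailable` and `extra.apiProduct` come from this thread; the `url` field, the helper name `summarize_items`, and the sample items are assumptions for illustration.

```python
def summarize_items(items):
    """Count `isAvailable` values and collect items lacking `extra.apiProduct`."""
    counts = {"available": 0, "unavailable": 0, "unknown": 0}
    missing_api_product = []
    for item in items:
        flag = item.get("isAvailable")
        if flag is True:
            counts["available"] += 1
        elif flag is False:
            counts["unavailable"] += 1
        else:
            # Property absent or null, as reported for the old `availability`.
            counts["unknown"] += 1
        extra = item.get("extra") or {}
        if not extra.get("apiProduct"):
            # `url` is an assumed field name used only to identify the item.
            missing_api_product.append(item.get("url"))
    return counts, missing_api_product


# Hypothetical sample items, not real scraper output.
sample = [
    {"url": "https://example.com/a", "isAvailable": True, "extra": {"apiProduct": {"id": 1}}},
    {"url": "https://example.com/b", "isAvailable": False, "extra": {}},
    {"url": "https://example.com/c"},
]
counts, missing = summarize_items(sample)
print(counts)
print(missing)
```

Listing the URLs that lack `extra.apiProduct` makes it easier to look for a pattern across links.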

bsq_tzam

a year ago

Thanks, I will check it and get back to you. Another thing I noticed is that the new version takes considerably more time. Compare the log for this dataset from the latest version (Dataset ID: SjiLZerVokyfrlUG0) with the one from the version we currently use (Dataset ID: 4aIFdbGrD0ahjRAMf). The new crawler version takes about 10 minutes for 60 links, while the old version takes about a minute. Can you please check this out too?

trudax

I have released a new version that should have improved speed when scraping product URLs.

bsq_tzam

a year ago

I have attached a request body.

I have the same issue again: I get an HTTP 400 Bad Request error.

I made 6 requests with the same body, one minute apart, and all of them failed.

The requests were made on 5-11-2023 between 16:17 and 16:20 EET.

As you can see, the country is present in the request body, and the actor version I am using is 2.7.2.

trudax

Can you share the run ID with me so I can take a look at the logs also?

bsq_tzam

a year ago

There is no run ID, since the task never started.

trudax

The "startUrls" list is too big, and Apify doesn't allow it. You can split it into multiple runs with fewer startUrls each, or use a file as the source of startUrls.
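A minimal sketch of the splitting approach, assuming the URLs are already in a Python list. The helper name `chunk_urls` and the batch size of 25 are arbitrary choices for illustration, since the actual limit is not stated here.

```python
def chunk_urls(urls, batch_size):
    """Split a list of URLs into consecutive batches of at most `batch_size`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [urls[i:i + batch_size] for i in range(0, len(urls), batch_size)]


# Placeholder URLs standing in for a real 60-link list.
all_urls = [f"https://example.com/s/product/{n}" for n in range(60)]

# 25 is an arbitrary batch size, not a documented Apify limit.
batches = chunk_urls(all_urls, 25)
for index, batch in enumerate(batches, start=1):
    # Each batch would become the `startUrls` of its own run.
    print(f"run {index}: {len(batch)} startUrls")
```

Each batch can then be sent as the startUrls of a separate run, lowering the batch size until the request is accepted.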

bsq_tzam

a year ago

It would be helpful to know what the limit of this list is, since "too big" is too vague.

trudax

Agreed. Unfortunately, I wasn't able to find that information on Apify. It seems to be missing from this page: https://docs.apify.com/platform/limits. I would recommend using a txt file where every line is a URL and uploading it as the startUrls file.
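Producing such a file could look roughly like this. The helper name `write_url_file` and the file name `start_urls.txt` are hypothetical, and the sketch only covers writing the file, not uploading it.

```python
import os
import tempfile
from pathlib import Path


def write_url_file(urls, path):
    """Write one URL per line to `path`, skipping blank entries.

    Returns the number of URLs written.
    """
    cleaned = [u.strip() for u in urls if u.strip()]
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return len(cleaned)


urls = [
    "https://shop.nordstrom.com/s/olaplex-no-5-bond-maintenance-conditioner/5056281",
    "  ",  # blank entries are dropped
    "https://shop.nordstrom.com/s/vapour-soft-focus-foundation/5850728",
]

# Write into a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "start_urls.txt")
    written = write_url_file(urls, target)
    lines = Path(target).read_text(encoding="utf-8").splitlines()
    print(f"wrote {written} URLs")
```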

bsq_tzam

a year ago

The txt approach you suggested is not covered in the documentation on how to run an actor. Am I missing something?

trudax

On Apify you have this option on the input section:

Developer
Maintained by Community

Actor Metrics

  • 4 monthly users

  • 3 stars

  • >99% runs succeeded

  • Created in Dec 2019

  • Modified 25 days ago