Macy's Scraper

Pay $9.00 for 1,000 results

trudax/actor-macys-scraper
Macy's web scraper to crawl product information including price and sale price, color, and images. Extract all data in a dataset in structured formats.

82society

2 years ago

Also, Run IDs f2lwGO06ELesgyJrS and deuQAydeWf0rgKT0J are having the same issue: some pages are scraped from the men's section.

82society

2 years ago

Run ID: sLTYaK3ytuzSbQa6n. I set the item to https://www.macys.com/shop/mens-clothing/all-mens-clothing/Pageindex/881?id=197651 (page 881). However, the link it actually started scraping from was https://www.macys.com/shop/mens-clothing/all-mens-clothing/Pageindex/1267?id=197651. I haven't checked everything yet, but so far every run I have checked shows a weird pattern.

trudax

Those are failed urls from previous runs.

82society

2 years ago

"Those are failed URLs from previous runs." Question: 1. If I set https://www.macys.com/shop/womens-clothing/all-womens-clothing/Pageindex/246?id=188851 (women's page 246), why is it attempting to scrape from men's page 1214?

Run ID: v9PI0NTpbxve7G6Nu

I ran this task from page 600 and it only obtained 16 results. Question: 2. Do I get charged for running a task when there are failed URLs that were collected previously? 3. I recall you mentioning that sometimes the run stops as Succeeded, maybe because Macy's is blocking it from proceeding. Is there a way to fix or bypass that?

trudax

If you run once for all-mens-clothing, all failed URLs from that run are stored to be retried in the next run. If the second run is then for all-womens-clothing, the previously failed URLs from all-mens-clothing are also added to the queue. I will take a closer look at this run; it seems that some products are returning errors and are not being scraped. I didn't understand your second question. Apify charges for the resources spent running the actor. It is not possible to bypass 100% of the anti-scraping measures set by any website; you need to retry the request with different sessions (which Apify does automatically) to eventually get through.
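
To make the carry-over behaviour described above concrete, here is a minimal sketch in plain Python. It is not the actor's actual code: the function names `category_of` and `build_queue` and the `same_category_only` flag are hypothetical, and it only assumes that listing URLs follow the /shop/<category>/... pattern seen in this thread. It shows why a women's run can start on a men's page, and how filtering stored failures by category would avoid that.

```python
from urllib.parse import urlparse


def category_of(url: str) -> str:
    """Return the category slug (e.g. 'mens-clothing') from a listing URL.

    Assumes the /shop/<category>/... path shape seen in this thread.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    return parts[1] if len(parts) > 1 and parts[0] == "shop" else ""


def build_queue(start_urls, stored_failed_urls, same_category_only=False):
    """Combine this run's start URLs with failed URLs kept from earlier runs.

    With same_category_only=False (the behaviour described in the thread),
    every stored failure is queued, whatever category it came from.
    With same_category_only=True (a hypothetical filter), only failures
    from the categories of the current start URLs are retried.
    """
    queue = list(start_urls)
    allowed = {category_of(u) for u in start_urls}
    for url in stored_failed_urls:
        if same_category_only and category_of(url) not in allowed:
            continue
        if url not in queue:
            queue.append(url)
    return queue


womens = "https://www.macys.com/shop/womens-clothing/all-womens-clothing/Pageindex/246?id=188851"
mens_failed = "https://www.macys.com/shop/mens-clothing/all-mens-clothing/Pageindex/1267?id=197651"

mixed = build_queue([womens], [mens_failed])
filtered = build_queue([womens], [mens_failed], same_category_only=True)
```

In this sketch `mixed` contains both the women's start URL and the stored men's URL, which matches the pattern reported in the runs above, while `filtered` contains only the women's URL.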

trudax

There was a product page with a collection that wasn't being scraped because its layout was composed of multiple products. I have added this new layout to the actor, and the products from these URLs will now be scraped as well.

Developer
Maintained by Community

Actor Metrics

  • 2 monthly users

  • 4 stars

  • 88% runs succeeded

  • Created in Dec 2019

  • Modified 4 days ago
