Amazon Scraper

Pay $10.00 for 1,000 results

junglee/free-amazon-product-scraper

Gets you product data from Amazon. Unofficial API. Scrapes and downloads product information without using the Amazon API, including reviews, prices, descriptions, and ASIN.

Duplicate issue when scraping via the API

Closed

sinclairgsm opened this issue
15 days ago

Hello,

First of all, thank you for your work. I am using the scraper via the API, and I am encountering an issue: several products that have already been scraped are appearing again. This becomes problematic when I scrape around 100 products from a page and then scrape another 100 products from the same page, resulting in duplicates.

As a beginner, I would like to know if it is possible to configure the scraper to avoid these duplicates.

Thank you in advance for your response.

lukas.prusa

Hi, thanks for opening this issue!

Unfortunately, duplicates are a part of web scraping and are almost impossible to prevent at the source. Do you want to simply filter them out, or not scrape them at all? A simple solution would be to use a tool like the Duplications Checker and filter them by each product's ASIN.
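
For readers who prefer to filter in their own code, here is a minimal Python sketch of deduplicating by ASIN. It is not how the Duplications Checker works internally; it only illustrates the idea. The field name `asin` and the sample items are assumptions, so check the field names in your own dataset before using it.

```python
from typing import Iterable


def dedupe_by_asin(items: Iterable[dict]) -> list[dict]:
    """Return items in their original order, keeping only the first item per ASIN.

    Assumes each scraped item is a dict with an "asin" key (hypothetical
    field name). Items without an ASIN are kept, since they cannot be compared.
    """
    seen: set[str] = set()
    unique: list[dict] = []
    for item in items:
        asin = item.get("asin")
        if asin is None:
            unique.append(item)
            continue
        if asin in seen:
            continue  # already have this product, skip the duplicate
        seen.add(asin)
        unique.append(item)
    return unique


# Made-up sample data: two scrape results that overlap on one product.
first_run = [{"asin": "A1", "title": "Lamp"}, {"asin": "A2", "title": "Desk"}]
second_run = [{"asin": "A2", "title": "Desk"}, {"asin": "A3", "title": "Chair"}]

merged = dedupe_by_asin(first_run + second_run)
print([item["asin"] for item in merged])
```

Note that this only removes duplicates after the fact; the duplicate results have still been scraped.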

If you want to avoid scraping them at all (essentially, not waste any credits on them), then that is sadly not possible. Simply put, Amazon offers no such functionality, so we are forced to search their pages "aimlessly".

I hope this helps, thanks and happy scraping!

sinclairgsm

13 days ago

Thank you for your reply. I will filter the scraping results to avoid duplicates in my database. Have a nice day!
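
One possible way to do this filtering at the database level, sketched in Python with the standard-library `sqlite3` module: make the ASIN the primary key and let the database ignore rows it already has. The table name `products`, the columns, and the item fields `asin` and `title` are all assumptions for illustration, not part of the scraper's documented output.

```python
import sqlite3


def save_new_products(conn: sqlite3.Connection, items: list[dict]) -> int:
    """Insert items into a hypothetical `products` table, skipping known ASINs.

    Returns the number of rows actually inserted.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products (asin TEXT PRIMARY KEY, title TEXT)"
    )
    before = conn.total_changes
    for item in items:
        # INSERT OR IGNORE silently skips rows that would violate the
        # primary-key constraint, i.e. ASINs stored by an earlier run.
        conn.execute(
            "INSERT OR IGNORE INTO products (asin, title) VALUES (?, ?)",
            (item["asin"], item.get("title")),
        )
    conn.commit()
    return conn.total_changes - before


# Made-up sample data: the second run overlaps the first on one product.
conn = sqlite3.connect(":memory:")
added_first = save_new_products(
    conn, [{"asin": "A1", "title": "Lamp"}, {"asin": "A2", "title": "Desk"}]
)
added_second = save_new_products(
    conn, [{"asin": "A2", "title": "Desk"}, {"asin": "A3", "title": "Chair"}]
)
total_rows = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
print(added_first, added_second, total_rows)
```

Because the uniqueness check lives in the table definition, repeated runs over the same page can be saved without tracking seen ASINs in application code.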

lukas.prusa

Great, that's how it should be done properly using a database ;) Good luck and have a nice day too!

Developer
Maintained by Apify

Actor Metrics

  • 677 monthly users

  • 72 stars

  • 98% runs succeeded

  • 16 hours response time

  • Created in May 2022

  • Modified 4 days ago
