Free Amazon Product Scrapper

Pricing

$15.00/month + usage

Developed by

Sovanza

Maintained by Community

Scrape Amazon product data using URLs or ASINs. Extract price, stock, reviews, ratings, and more. Ideal for eCommerce research, pricing analysis, and competitor tracking. JSON/CSV output.


Amazon Product Scraper

What is Amazon Product Scraper and How Does It Work?

Amazon Product Scraper is a web scraping tool that extracts product data from Amazon product pages, given their URLs. It collects details such as price, stock availability, ratings, reviews, and seller info, and the results can be exported as JSON or CSV.

Why Use Amazon Product Scraper?

Use this scraper to:

  • Monitor pricing and stock for competitor products
  • Track product ratings and reviews for market analysis
  • Benchmark product performance within categories
  • Automate data extraction for eCommerce analytics and advertising optimization

Features

  • Scrapes product details from Amazon product URLs
  • Supports both single URL and multiple URLs as input
  • Extracts comprehensive product data:
    • Title
    • Price
    • Images (all product images)
    • Variants and their prices
    • Ratings (average rating)
    • Reviews (configurable number of reviews)
    • Product details and specifications
    • Product description
  • Smart proxy rotation to avoid IP blocking
  • User-agent randomization for better reliability
  • Automatic retry mechanism for failed requests
  • Support for multiple Amazon regional domains

How to Use Amazon Product Scraper on Apify

Using the Actor

To use this scraper on Apify, follow these simple steps:

  1. Open the Amazon Product Scraper page on the Apify platform

  2. Input Configuration:

    • Enter one or more Amazon product URLs you want to scrape
    • Configure which data components to scrape:
      • Product reviews
      • Product variants/options
      • Detailed product specifications
      • Seller information
    • Select language settings if needed
    • Choose proxy country or use auto-select

    Input Configuration

The actor accepts the following input parameters:

{
  "productUrls": [
    "https://www.amazon.com/product-url-1",
    "https://www.amazon.com/product-url-2"
  ],
  "scrapeReviews": true,
  "scrapeProductVariants": true,
  "scrapeProductDetails": true,
  "scrapeSellerInfo": true,
  "language": "en",
  "proxyCountry": "AUTO_SELECT_PROXY_COUNTRY"
}
  • productUrls (required): One or more Amazon product URLs to scrape (array of URLs).
  • scrapeReviews (optional): Whether to scrape product reviews (default: false).
  • scrapeProductVariants (optional): Whether to scrape product variants/options (default: false).
  • scrapeProductDetails (optional): Whether to scrape detailed product specifications (default: false).
  • scrapeSellerInfo (optional): Whether to scrape seller information (default: false).
  • language (optional): Language to use on Amazon (default: "en", options include: en, de, es, fr, it, ja, zh_CN, pt, nl, pl, tr, ar, sv, ko, hi, cs, da, he, ru, th).
  • proxyCountry (optional): Country for proxy (default: "AUTO_SELECT_PROXY_COUNTRY", options include: US, GB, DE, FR, JP, CA, IT).
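
The parameters above can be assembled and checked before a run. The following is a minimal Python sketch, not part of the actor: `to_product_url` and `build_input` are hypothetical helper names, and expanding a bare ASIN to `https://<domain>/dp/<ASIN>` is an assumption about how ASINs could be turned into the URLs that `productUrls` expects. The language and proxy country sets are copied from the list above.

```python
import re
from urllib.parse import urlparse

# Values copied from the parameter list in this document.
LANGUAGES = {"en", "de", "es", "fr", "it", "ja", "zh_CN", "pt", "nl", "pl",
             "tr", "ar", "sv", "ko", "hi", "cs", "da", "he", "ru", "th"}
PROXY_COUNTRIES = {"AUTO_SELECT_PROXY_COUNTRY", "US", "GB", "DE", "FR", "JP", "CA", "IT"}

# ASINs are 10-character alphanumeric identifiers.
ASIN_RE = re.compile(r"^[A-Z0-9]{10}$")


def to_product_url(value, domain="www.amazon.com"):
    """Return a product URL; a bare ASIN is expanded to https://<domain>/dp/<ASIN>."""
    value = value.strip()
    if ASIN_RE.match(value):
        return f"https://{domain}/dp/{value}"
    parsed = urlparse(value)
    if parsed.scheme not in ("http", "https") or "amazon." not in parsed.netloc:
        raise ValueError(f"not an Amazon URL or ASIN: {value!r}")
    return value


def build_input(products, language="en", proxy_country="AUTO_SELECT_PROXY_COUNTRY",
                reviews=False, variants=False, details=False, seller_info=False):
    """Build the actor input object, rejecting values the document does not list."""
    if not products:
        raise ValueError("productUrls is required and must not be empty")
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    if proxy_country not in PROXY_COUNTRIES:
        raise ValueError(f"unsupported proxy country: {proxy_country!r}")
    return {
        "productUrls": [to_product_url(p) for p in products],
        "scrapeReviews": reviews,
        "scrapeProductVariants": variants,
        "scrapeProductDetails": details,
        "scrapeSellerInfo": seller_info,
        "language": language,
        "proxyCountry": proxy_country,
    }
```

For example, `build_input(["B08N5WRWNW"], reviews=True)` yields an input with a single `https://www.amazon.com/dp/B08N5WRWNW` URL and only review scraping enabled, which keeps the optional flags at their documented `false` defaults.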
  3. Run the Actor:

    • Click the "Start" button to begin scraping
    • The actor will process each URL and extract the requested data
  4. Access Your Results:

    • Once complete, view your results in the "Dataset" tab
    • Download the data in your preferred format (JSON, CSV, Excel, etc.)
    • Alternatively, access the data via the Apify API
  5. Schedule Regular Runs (Optional):

    • Set up scheduled runs to monitor products over time
    • Configure webhooks to receive notifications when runs complete
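
For access through the Apify API, a run can also be started from code. The sketch below uses only the Python standard library and the Apify API v2 endpoint that runs an actor synchronously and returns its dataset items. The function names are hypothetical, and the actor ID is not stated in this document, so it is passed in as a parameter (Apify accepts the `username~actor-name` form). Synchronous runs are meant for short jobs; longer jobs would need the asynchronous run endpoints instead.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, token, run_input):
    """Prepare (but do not send) a POST that runs the actor and returns dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items?format=json"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # Apify API token
        },
        method="POST",
    )


def run_actor(actor_id, token, run_input, timeout=300):
    """Send the request and return the parsed list of dataset items."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_actor("<username>~<actor-name>", "<APIFY_TOKEN>", {"productUrls": [...]})` with real values would return one item per scraped product; both placeholders must be replaced with values from your own Apify account.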

Output

The actor outputs a dataset with detailed information about each product. Each item in the dataset contains:

  • url: The URL of the product page
  • title: Product title
  • price: Current product price
  • images: Array of product image URLs
  • details: Object containing product specifications and details
  • description: Product description
  • average_rating: Average product rating (out of 5)
  • review_count: Total number of reviews
  • variants: Array of product variants with their names and prices
  • reviews: Array of product reviews, each containing:
    • title: Review title
    • rating: Individual review rating
    • date: Review date
    • text: Review content
    • reviewer: Reviewer name
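
As an illustration of working with these fields, the sketch below flattens dataset items into CSV rows and computes the mean rating of the scraped reviews. It is a hedged example: the helper names are hypothetical, and the field types are assumptions (this document does not say whether `price` or `rating` are numbers or strings), so unparsable ratings are skipped rather than guessed.

```python
import csv
import io

CSV_FIELDS = ["url", "title", "price", "average_rating",
              "review_count", "image_count", "variant_count"]


def average_review_rating(item):
    """Mean of the numeric review ratings in one item, or None if there are none."""
    ratings = []
    for review in item.get("reviews") or []:
        try:
            ratings.append(float(review["rating"]))
        except (KeyError, TypeError, ValueError):
            continue  # rating missing or not numeric; skip it
    return sum(ratings) / len(ratings) if ratings else None


def to_csv(items):
    """Serialize the top-level fields of each item into a CSV string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CSV_FIELDS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow({
            "url": item.get("url", ""),
            "title": item.get("title", ""),
            "price": item.get("price", ""),
            "average_rating": item.get("average_rating", ""),
            "review_count": item.get("review_count", ""),
            "image_count": len(item.get("images") or []),
            "variant_count": len(item.get("variants") or []),
        })
    return buffer.getvalue()
```

Nested arrays such as `images` and `variants` are reduced to counts here because CSV has no natural representation for them; the JSON export keeps them intact.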

How the Scraper Works

The Amazon Product Scraper uses Playwright, a modern headless browser automation library, to navigate Amazon product pages and extract data. Here's how it works:

  1. Browser Automation: The actor launches a headless browser instance that mimics real user behavior when visiting Amazon pages.

  2. Data Extraction Process:

    • Navigates to each product URL
    • Waits for critical page elements to load
    • Uses both DOM parsing and JavaScript evaluation to extract product information
    • Handles different page layouts and element structures across various Amazon domains
    • Extracts structured data including product details, prices, images, and reviews
  3. Variant Handling: The scraper can detect and extract information about product variants (sizes, colors, models) by:

    • Identifying variant selectors on the page
    • Simulating clicks on variant options
    • Capturing price and availability changes for each variant
  4. Review Extraction: For reviews, the scraper:

    • Navigates to the review section
    • Paginates through reviews up to the configured maximum
    • Extracts review text, ratings, dates, and reviewer information
  5. Error Handling: The actor implements robust error handling with automatic retries for failed requests, timeouts, and navigation errors.
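
The retry behaviour in step 5 is not specified further, so the following is only a sketch of one common approach: exponential backoff with random jitter. `with_retries` is a hypothetical helper, the attempt count and delays are assumed values, and real code would catch only timeout and navigation errors rather than every exception.

```python
import random
import time


def with_retries(action, max_attempts=3, base_delay=1.0,
                 sleep=time.sleep, rng=random.random):
    """Call action() until it succeeds or max_attempts is reached.

    The wait before retry n is base_delay * 2**(n-1), scaled by a random
    factor between 1 and 2 so that retries do not fire in lockstep.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception as error:  # assumption: narrow this in real code
            last_error = error
            if attempt == max_attempts:
                break
            delay = base_delay * 2 ** (attempt - 1) * (1 + rng())
            sleep(delay)
    raise last_error
```

The `sleep` and `rng` parameters exist so the backoff schedule can be tested without actually waiting; in normal use they can be left at their defaults.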

Anti-blocking Measures

This actor uses several techniques to avoid being blocked by Amazon:

  • Random user agent rotation: Cycles through different browser user-agent strings to avoid detection
  • Headless browser with realistic behavior: Simulates human-like navigation patterns and timing
  • Proper request throttling and timing: Adds random delays between actions to avoid triggering rate limits
  • Smart proxy management: Can use different proxies based on the target Amazon domain
  • Session management: Maintains consistent sessions to appear as a legitimate user
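
Three of these measures can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actor's code: the function names are hypothetical, the user-agent strings are illustrative samples, the delay bounds are arbitrary, and the domain-to-country table simply pairs well-known Amazon domains with the proxy countries listed in the input options.

```python
import random
from urllib.parse import urlparse

# Amazon storefront domains paired with the documented proxy country options.
DOMAIN_TO_COUNTRY = {
    "amazon.com": "US", "amazon.co.uk": "GB", "amazon.de": "DE",
    "amazon.fr": "FR", "amazon.co.jp": "JP", "amazon.ca": "CA", "amazon.it": "IT",
}

# Illustrative user-agent strings; a real pool would be larger and kept current.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


def proxy_country_for(url):
    """Pick a proxy country matching the product URL's domain, else auto-select."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return DOMAIN_TO_COUNTRY.get(host, "AUTO_SELECT_PROXY_COUNTRY")


def pick_user_agent(rng=random):
    """Choose a user-agent string at random for the next browser session."""
    return rng.choice(USER_AGENTS)


def jittered_delay(base=2.0, spread=1.5, rng=random):
    """Seconds to wait before the next action: base plus a random extra up to spread."""
    return base + rng.uniform(0, spread)
```

Passing a seeded `random.Random` instance as `rng` makes the choices reproducible, which is useful when debugging a blocked run.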

Performance Optimization

The actor is optimized for both speed and reliability:

  • Parallel processing of multiple URLs (when multiple URLs are provided)
  • Efficient memory management for processing large numbers of products
  • Selective data extraction based on configuration options to improve performance
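
Parallel processing of several URLs could look like the following sketch, which uses a thread pool from the Python standard library. It is an assumption about one possible design, not the actor's implementation: `scrape_all` is a hypothetical name, `scrape_one` stands in for whatever function scrapes a single page, and the worker count is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def scrape_all(urls, scrape_one, max_workers=4):
    """Run scrape_one(url) for every URL concurrently.

    Returns one result dict per URL, in input order. A failure on one
    URL is recorded in its result instead of aborting the whole batch.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(scrape_one, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = {"ok": True, "data": future.result()}
            except Exception as error:
                results[url] = {"ok": False, "error": str(error)}
    return [{"url": url, **results[url]} for url in urls]
```

Capping `max_workers` bounds both memory use and the request rate, which matters when each worker drives its own headless browser page.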

Limitations

  • The actor may not work for all Amazon regional domains
  • Some product details may vary depending on the product category
  • Amazon's website structure changes frequently, which may require updates to the scraping logic

License

This project is licensed under the MIT License - see the LICENSE file for details.