Amazon Product Scrapper

Pricing

$3.80 / 1,000 results

Developed by

HappiTap

Maintained by Community

Amazon Product Scraper - extracts product details from Amazon product pages, search results, and category pages with structured data including title, ASIN, price, ratings, reviews, availability, seller, and more.

Total users: 3
Monthly users: 3
Runs succeeded: 78%
Last modified: 6 days ago

Amazon Product Scraper

This Apify actor extracts product details from Amazon product pages, search results, and category pages, and returns structured data that includes availability and seller information.

What It Does

For each product it finds, the scraper extracts the following fields:

Field          Description
title          Product name/title
asin           Amazon Standard Identification Number
price          Current product price
rating         Average user rating (e.g., 4.5)
reviewCount    Total number of user reviews
availability   In stock / out of stock status
seller         Sold by (seller name)
category       Main category or breadcrumb
url            Direct product URL on Amazon
imageUrl       Product image thumbnail
scrapedAt      Timestamp of when the item was scraped

Supported Page Types

  • Product Pages: Individual product detail pages (e.g., /dp/B08J8KJ9T3)
  • Search Results: Search query results (e.g., /s?k=wireless+earbuds)
  • Category Pages: Category browsing pages (e.g., /Best-Sellers-Electronics)
  • Best Sellers: Best sellers pages (e.g., /Best-Sellers/zgbs)

Use Cases

  • Product Research: Extract detailed product information for market analysis
  • Price Monitoring: Track product prices and availability across categories
  • Competitor Analysis: Monitor competitor products and pricing strategies
  • Inventory Tracking: Check product availability and seller information
  • E-commerce Data Collection: Gather comprehensive product catalogs

Input

The actor accepts the following input format:

{
  "startUrls": [
    { "url": "https://www.amazon.com/s?k=wireless+earbuds" }
  ],
  "maxItems": 50
}

Input Parameters

Parameter   Type     Required   Default   Description
startUrls   Array    Yes        -         Array of objects with url property pointing to Amazon pages
maxItems    Number   No         50        Maximum number of products to extract per page
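
As an illustration of the parameter rules above, here is a minimal sketch of input validation. The helper name normalizeInput is hypothetical and is not taken from the actor's source; rejecting an empty startUrls array and requiring maxItems to be a positive integer are assumptions that go beyond what the table states.

```javascript
// Hypothetical helper (not from the actor's source): checks the input shape
// described in the table and applies the documented default for maxItems.
function normalizeInput(input) {
  const { startUrls, maxItems = 50 } = input ?? {};

  // Assumption: an empty array is treated the same as a missing one.
  if (!Array.isArray(startUrls) || startUrls.length === 0) {
    throw new Error('"startUrls" must be a non-empty array');
  }

  const urls = startUrls.map((entry, index) => {
    if (!entry || typeof entry.url !== 'string') {
      throw new Error(`startUrls[${index}] must be an object with a "url" string`);
    }
    // new URL() throws a TypeError when the string is not a valid absolute URL.
    return new URL(entry.url).href;
  });

  // Assumption: maxItems must be a positive integer.
  if (!Number.isInteger(maxItems) || maxItems < 1) {
    throw new Error('"maxItems" must be a positive integer');
  }

  return { urls, maxItems };
}
```

Calling normalizeInput with only startUrls set yields maxItems of 50, matching the default in the table.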

Supported Amazon URLs

The scraper works with various Amazon page types:

Product Pages:

  • https://www.amazon.com/dp/B08J8KJ9T3
  • https://www.amazon.com/gp/product/B08J8KJ9T3

Search Results:

  • https://www.amazon.com/s?k=wireless+earbuds
  • https://www.amazon.com/s?k=laptop&ref=sr_pg_1

Category Pages:

  • https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics
  • https://www.amazon.com/Best-Sellers/zgbs
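
The URL shapes listed above suggest how a page type could be told apart from the path alone. The following is a rough sketch under that assumption; detectPageType and its return values are hypothetical names, and the actor's real detection logic in src/routes.js may differ.

```javascript
// Hypothetical classifier; the patterns are derived only from the example URLs above.
function detectPageType(rawUrl) {
  const { pathname, searchParams } = new URL(rawUrl);

  // Product pages: /dp/<ASIN> or /gp/product/<ASIN>, where an ASIN is 10 alphanumerics.
  if (/\/(?:dp|gp\/product)\/[A-Z0-9]{10}(?:\/|$)/i.test(pathname)) {
    return 'product';
  }
  // Search results: the /s path with a "k" (keyword) query parameter.
  if (pathname === '/s' && searchParams.has('k')) {
    return 'search';
  }
  // Best-seller and category listings: a "zgbs" path segment or a /Best-Sellers prefix.
  if (/\/zgbs(?:\/|$)/i.test(pathname) || /^\/best-sellers/i.test(pathname)) {
    return 'category';
  }
  return 'unknown';
}
```

URLs that match none of the patterns return 'unknown', so a caller can skip or log them instead of guessing.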

Output

The actor outputs structured data for each product found:

{
  "title": "Apple AirPods (3rd Generation)",
  "asin": "B08J8KJ9T3",
  "price": "$169.00",
  "rating": 4.7,
  "reviewCount": 15600,
  "availability": "In Stock",
  "seller": "Amazon.com Services LLC",
  "category": "Electronics > Headphones",
  "url": "https://www.amazon.com/dp/B08J8KJ9T3",
  "imageUrl": "https://m.media-amazon.com/images/I/61SUj2aKoEL._AC_UL320_.jpg",
  "scrapedAt": "2024-01-01T00:00:00.000Z"
}
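
In the sample output, price is a formatted string rather than a number, so consumers doing price monitoring will likely need to convert it. A small sketch of that conversion follows; parsePrice is a hypothetical helper on the consumer side, and it assumes US-style formatting (comma as thousands separator, dot as decimal point) as in the example.

```javascript
// Hypothetical consumer-side helper: turns a price string such as "$169.00"
// into a number. Assumes US-style formatting; other marketplaces may use
// different separators and would need their own handling.
function parsePrice(price) {
  if (typeof price !== 'string') return null;
  // Drop thousands separators, then take the first run of digits with an optional decimal part.
  const match = price.replace(/,/g, '').match(/\d+(?:\.\d+)?/);
  return match ? Number(match[0]) : null;
}
```

Values without any digits (for example an empty string) come back as null rather than NaN, which keeps later comparisons simple.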

Example Usage

Search for Wireless Earbuds

{
  "startUrls": [
    { "url": "https://www.amazon.com/s?k=wireless+earbuds" }
  ],
  "maxItems": 50
}

Extract from Product Page

{
  "startUrls": [
    { "url": "https://www.amazon.com/dp/B08J8KJ9T3" }
  ],
  "maxItems": 1
}

Multiple Sources

{
  "startUrls": [
    { "url": "https://www.amazon.com/s?k=laptop" },
    { "url": "https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics" }
  ],
  "maxItems": 25
}

How It Works

  1. Page Type Detection: Automatically detects whether the URL is a product page, search results, or category page
  2. Appropriate Handler: Routes to the correct scraping handler based on page type
  3. Data Extraction: Uses specialized selectors for each page type to extract product information
  4. Comprehensive Fields: Extracts every output field, including availability and seller information
  5. Data Validation: Ensures only products with valid ASINs and titles are included
  6. Structured Output: Returns clean, structured data ready for analysis
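
The validation and output steps (5 and 6) can be sketched roughly as below. This is not the actor's code: isValidProduct and finalizeItems are hypothetical names, and the exact checks are assumptions. It treats an ASIN as valid when it is 10 uppercase alphanumeric characters, requires a non-empty title, caps the result at maxItems, and stamps each item with scrapedAt.

```javascript
// Assumption: a valid ASIN is exactly 10 uppercase letters or digits.
const ASIN_PATTERN = /^[A-Z0-9]{10}$/;

// Hypothetical check mirroring step 5: keep only items with a valid ASIN and a title.
function isValidProduct(item) {
  return (
    typeof item?.asin === 'string' &&
    ASIN_PATTERN.test(item.asin) &&
    typeof item.title === 'string' &&
    item.title.trim().length > 0
  );
}

// Hypothetical finishing step: filter, cap at maxItems, and add a timestamp.
function finalizeItems(rawItems, maxItems, now = new Date()) {
  const results = [];
  for (const item of rawItems) {
    if (results.length >= maxItems) break;
    if (!isValidProduct(item)) continue;
    results.push({
      ...item,
      title: item.title.trim(),
      scrapedAt: now.toISOString(),
    });
  }
  return results;
}
```

Passing the clock in as a parameter keeps the function deterministic, which makes it easier to test than calling new Date() inside the loop.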

Features

  • Multi-Page Support: Handles product pages, search results, and category pages
  • Robust Extraction: Multiple fallback selectors to handle Amazon's changing page structure
  • Stealth Mode: Uses Puppeteer with stealth plugins to avoid detection
  • Proxy Support: Built-in proxy rotation for reliable scraping
  • Error Handling: Graceful error handling with detailed logging
  • Data Validation: Ensures data quality with validation checks
  • Availability Tracking: Extracts stock status and seller information
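
To illustrate the "multiple fallback selectors" idea, here is a minimal sketch. firstMatch is a hypothetical helper, and the lookup function is injected so the example stays independent of any browser; the selectors the actor actually uses are not documented here.

```javascript
// Hypothetical helper: tries each selector in order and returns the first
// non-empty, trimmed text. `queryText` is supplied by the caller and maps a
// selector to its text content (or null/undefined when nothing matches).
function firstMatch(selectors, queryText) {
  for (const selector of selectors) {
    const text = queryText(selector);
    if (typeof text === 'string' && text.trim() !== '') {
      return text.trim();
    }
  }
  return null;
}
```

In a browser context the lookup could be `(s) => document.querySelector(s)?.textContent`; ordering the selectors from most to least specific lets newer page layouts win while older ones still work.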

Installation

  1. Clone this repository
  2. Install dependencies: npm install
  3. Run the actor: npm start

Development

  • npm start - Run the actor
  • npm run format - Format code with Prettier
  • npm run lint - Run ESLint
  • npm run lint:fix - Fix ESLint issues

Architecture

  • src/main.js - Main entry point and input validation
  • src/routes.js - Request routing and page type detection
  • src/handlers/amazonProductPage.js - Individual product page scraping logic
  • src/handlers/amazonSearchResults.js - Search results and category page scraping logic
  • src/puppeteerLauncher.js - Puppeteer browser configuration with stealth mode

Notes

  • The scraper is designed to be respectful of Amazon's servers and includes appropriate delays
  • Results may vary based on Amazon's page structure changes
  • The scraper automatically handles different Amazon page layouts and product formats
  • All extracted data is timestamped for tracking purposes
  • Product pages return single items, while search/category pages return multiple products