
Amazon Product Scraper
Pricing
Pay per event

Amazon Product Scraper - extracts product details from Amazon product pages, search results, and category pages with structured data including title, ASIN, price, ratings, reviews, availability, seller, and more.
Last modified
15 days ago
Amazon Product Scraper
An Apify actor that extracts product details from Amazon product pages, search results, and category pages with comprehensive structured data including availability, seller information, and more.
What It Does
This scraper extracts the following structured fields from Amazon pages:
Field | Description |
---|---|
title | Product name/title |
asin | Amazon Standard Identification Number |
price | Current product price |
rating | Average user rating (e.g., 4.5) |
reviewCount | Total number of user reviews |
availability | In stock / out of stock status |
seller | Sold by (seller name) |
category | Main category or breadcrumb |
url | Direct product URL on Amazon |
imageUrl | Product image thumbnail |
scrapedAt | Timestamp of when the item was scraped (ISO 8601) |
Supported Page Types
- Product Pages: Individual product detail pages (e.g., /dp/B08J8KJ9T3)
- Search Results: Search query results (e.g., /s?k=wireless+earbuds)
- Category Pages: Category browsing pages (e.g., /Best-Sellers-Electronics)
- Best Sellers: Best sellers pages (e.g., /Best-Sellers/zgbs)
Use Cases
- Product Research: Extract detailed product information for market analysis
- Price Monitoring: Track product prices and availability across categories
- Competitor Analysis: Monitor competitor products and pricing strategies
- Inventory Tracking: Check product availability and seller information
- E-commerce Data Collection: Gather comprehensive product catalogs
Input
The actor accepts the following input format:
{
  "startUrls": [{ "url": "https://www.amazon.com/s?k=wireless+earbuds" }],
  "maxItems": 50
}
Input Parameters
Parameter | Type | Required | Default | Description |
---|---|---|---|---|
startUrls | Array | Yes | - | Array of objects with url property pointing to Amazon pages |
maxItems | Number | No | 50 | Maximum number of products to extract per page |
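As a rough illustration of the parameters above, the sketch below checks an input object and applies the documented default of 50 for maxItems. The function name normalizeInput is hypothetical and is not taken from the actor's source. Treating an empty startUrls array as an error, and requiring maxItems to be a positive integer, are assumptions as well.

```javascript
// Hypothetical helper: not part of the actor's source code.
// Checks the documented input shape and applies the documented default.
function normalizeInput(input) {
    const { startUrls, maxItems = 50 } = input ?? {};

    // Assumption: "required" is read here as "a non-empty array".
    if (!Array.isArray(startUrls) || startUrls.length === 0) {
        throw new Error('startUrls is required and must be a non-empty array');
    }
    for (const entry of startUrls) {
        if (!entry || typeof entry.url !== 'string') {
            throw new Error('each startUrls entry must be an object with a url string');
        }
        // The URL constructor throws a TypeError on malformed URLs.
        new URL(entry.url);
    }
    // Assumption: maxItems must be a positive integer.
    if (!Number.isInteger(maxItems) || maxItems < 1) {
        throw new Error('maxItems must be a positive integer');
    }
    return { startUrls: startUrls.map(({ url }) => ({ url })), maxItems };
}

console.log(normalizeInput({ startUrls: [{ url: 'https://www.amazon.com/s?k=wireless+earbuds' }] }));
```

Validating on the caller's side like this can surface a malformed URL before a run is started, which may matter under pay-per-event pricing.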
Supported Amazon URLs
The scraper works with various Amazon page types:
Product Pages:
https://www.amazon.com/dp/B08J8KJ9T3
https://www.amazon.com/gp/product/B08J8KJ9T3
Search Results:
https://www.amazon.com/s?k=wireless+earbuds
https://www.amazon.com/s?k=laptop&ref=sr_pg_1
Category Pages:
https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics
https://www.amazon.com/Best-Sellers/zgbs
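Both product URL forms listed above carry the ASIN in the path. The sketch below shows one way to pull it out. The helper name extractAsin is hypothetical, and the pattern assumes ASINs are 10 uppercase alphanumeric characters, as in the examples here. The actor's own parsing may differ.

```javascript
// Hypothetical helper for illustration; the actor's own parsing may differ.
// Assumes ASINs are 10 uppercase alphanumeric characters, as in the examples above.
function extractAsin(productUrl) {
    const { pathname } = new URL(productUrl);
    const match = pathname.match(/\/(?:dp|gp\/product)\/([A-Z0-9]{10})(?:\/|$)/);
    return match ? match[1] : null;
}

console.log(extractAsin('https://www.amazon.com/dp/B08J8KJ9T3')); // B08J8KJ9T3
console.log(extractAsin('https://www.amazon.com/s?k=laptop'));    // null
```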
Output
The actor outputs structured data for each product found:
{
  "title": "Apple AirPods (3rd Generation)",
  "asin": "B08J8KJ9T3",
  "price": "$169.00",
  "rating": 4.7,
  "reviewCount": 15600,
  "availability": "In Stock",
  "seller": "Amazon.com Services LLC",
  "category": "Electronics > Headphones",
  "url": "https://www.amazon.com/dp/B08J8KJ9T3",
  "imageUrl": "https://m.media-amazon.com/images/I/61SUj2aKoEL._AC_UL320_.jpg",
  "scrapedAt": "2024-01-01T00:00:00.000Z"
}
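In the sample above, price is a formatted string rather than a number, so price monitoring usually needs a conversion step on the consumer side. The following is a minimal sketch of that step. The helper name parseUsdPrice is hypothetical, and it assumes US-style formatting (thousands separated by commas, decimals by a period).

```javascript
// Hypothetical post-processing step, run by the consumer on the actor's output.
// Assumes US-style formatting such as "$1,299.99"; other locales need different handling.
function parseUsdPrice(price) {
    if (typeof price !== 'string') return null;
    const match = price.replace(/,/g, '').match(/\d+(?:\.\d+)?/);
    return match ? Number(match[0]) : null;
}

const item = { title: 'Apple AirPods (3rd Generation)', price: '$169.00' };
console.log(parseUsdPrice(item.price)); // 169
```

Returning null for a missing or unparseable price keeps such items distinguishable from items that genuinely cost 0.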
Example Usage
Search for Wireless Earbuds
{
  "startUrls": [{ "url": "https://www.amazon.com/s?k=wireless+earbuds" }],
  "maxItems": 50
}
Extract from Product Page
{
  "startUrls": [{ "url": "https://www.amazon.com/dp/B08J8KJ9T3" }],
  "maxItems": 1
}
Multiple Sources
{
  "startUrls": [
    { "url": "https://www.amazon.com/s?k=laptop" },
    { "url": "https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics" }
  ],
  "maxItems": 25
}
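When the starting point is a list of search phrases rather than ready-made URLs, an input object like the ones above can be generated. This sketch assumes the /s?k=... URL shape shown in the search examples. The helper name buildSearchInput is hypothetical.

```javascript
// Hypothetical helper that builds an input object from plain search phrases.
// The /s?k=... URL shape follows the search examples above.
function buildSearchInput(keywords, maxItems = 50) {
    const startUrls = keywords.map((keyword) => {
        const url = new URL('https://www.amazon.com/s');
        url.searchParams.set('k', keyword); // spaces are serialized as "+"
        return { url: url.toString() };
    });
    return { startUrls, maxItems };
}

console.log(JSON.stringify(buildSearchInput(['wireless earbuds', 'laptop'], 25), null, 2));
```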
How It Works
- Page Type Detection: Automatically detects whether the URL is a product page, search results, or category page
- Appropriate Handler: Routes to the correct scraping handler based on page type
- Data Extraction: Uses specialized selectors for each page type to extract product information
- Comprehensive Fields: Extracts all required fields including availability and seller information
- Data Validation: Ensures only products with valid ASINs and titles are included
- Structured Output: Returns clean, structured data ready for analysis
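The first two steps above (page type detection and handler routing) can be sketched as follows. This is an illustration only. The function names, the URL patterns, and the placeholder handlers are assumptions based on the example URLs in this README, not the actor's actual routing code.

```javascript
// Hypothetical sketch of URL-based page type detection; the actor's real
// routing logic is not shown in this README.
function detectPageType(rawUrl) {
    const { pathname, searchParams } = new URL(rawUrl);

    if (/\/(?:dp|gp\/product)\/[A-Z0-9]{10}(?:\/|$)/.test(pathname)) return 'product';
    if (pathname === '/s' && searchParams.has('k')) return 'search';
    if (/\/zgbs(?:\/|$)/.test(pathname) || /^\/Best-Sellers/i.test(pathname)) return 'category';
    return 'unknown';
}

// Placeholder handlers standing in for the per-page-type scraping logic.
const handlers = {
    product: (url) => `product handler for ${url}`,
    search: (url) => `listing handler for ${url}`,
    category: (url) => `listing handler for ${url}`,
};

function route(rawUrl) {
    const handler = handlers[detectPageType(rawUrl)];
    if (!handler) throw new Error(`Unsupported Amazon URL: ${rawUrl}`);
    return handler(rawUrl);
}

console.log(route('https://www.amazon.com/Best-Sellers/zgbs'));
```

Keeping detection separate from the handler table means a new page type only needs one extra pattern and one extra entry.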
Features
- Multi-Page Support: Handles product pages, search results, and category pages
- Robust Extraction: Multiple fallback selectors to handle Amazon's changing page structure
- Stealth Mode: Uses Puppeteer with stealth plugins to avoid detection
- Proxy Support: Built-in proxy rotation for reliable scraping
- Error Handling: Graceful error handling with detailed logging
- Data Validation: Ensures data quality with validation checks
- Availability Tracking: Extracts stock status and seller information
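The data validation feature (keeping only products with a valid ASIN and title) might look roughly like the filter below. The name isValidProduct and the ASIN pattern (exactly 10 uppercase alphanumeric characters) are assumptions made for illustration. The README does not specify the actual checks.

```javascript
// Hypothetical validation filter; field names follow the documented output fields.
// Assumes an ASIN is exactly 10 uppercase alphanumeric characters.
const ASIN_PATTERN = /^[A-Z0-9]{10}$/;

function isValidProduct(product) {
    return (
        typeof product?.asin === 'string' &&
        ASIN_PATTERN.test(product.asin) &&
        typeof product.title === 'string' &&
        product.title.trim().length > 0
    );
}

const scraped = [
    { asin: 'B08J8KJ9T3', title: 'Apple AirPods (3rd Generation)' },
    { asin: 'B08J8KJ9T3', title: '   ' },
    { asin: 'not-an-asin', title: 'Mystery item' },
];
console.log(scraped.filter(isValidProduct).length); // 1
```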
Installation
- Clone this repository
- Install dependencies:
npm install
- Run the actor:
npm start
Development
- npm start: Run the actor
- npm run format: Format code with Prettier
- npm run lint: Run ESLint
- npm run lint:fix: Fix ESLint issues
Architecture
- src/main.js: Main entry point and input validation
- src/routes.js: Request routing and page type detection
- src/handlers/amazonProductPage.js: Individual product page scraping logic
- src/handlers/amazonSearchResults.js: Search results and category page scraping logic
- src/puppeteerLauncher.js: Puppeteer browser configuration with stealth mode
Notes
- The scraper is designed to be respectful of Amazon's servers and includes appropriate delays
- Results may vary based on Amazon's page structure changes
- The scraper automatically handles different Amazon page layouts and product formats
- All extracted data is timestamped for tracking purposes
- Product pages return single items, while search/category pages return multiple products