Awesome Amazon Product Scraper
3 days trial then $5.00/month - No credit card required now
The Amazon Product Scraper Actor retrieves detailed product data from Amazon search and product pages. It supports configurable limits, proxy usage, and two image extraction modes for product research and price tracking.
Amazon Product Scraper
A powerful and efficient web scraper built on the Apify platform that extracts detailed product data from Amazon search results and product pages. Perfect for price monitoring, product research, and market analysis.
Features
- Versatile URL Handling
  - Process Amazon search result pages
  - Extract data from individual product pages
  - Support for both search queries and direct product URLs
- Smart Image Extraction
  - Basic Mode: High-quality main product image
  - Advanced Mode: All available product images with best resolution
  - Automatic image quality selection
- Comprehensive Data Extraction
  - Product title and detailed description
  - Current price with currency information
  - Real-time availability status
  - ASIN identifier
  - Customer ratings and review counts
  - Product features and bullet points
  - High-resolution product images
  - Timestamps for data freshness
- Robust Architecture
  - Built with Playwright for maximum reliability
  - Smart retry mechanism with multiple selectors
  - Automatic handling of layout changes
  - Built-in duplicate detection
  - Configurable request limits
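The actor's duplicate detection is not documented in detail. One plausible approach is to key products by ASIN, as in the Python sketch below. The names `extract_asin` and `deduplicate_by_asin` and the URL pattern are illustrative assumptions, not the actor's actual code.

```python
import re
from typing import Iterable, Iterator, Optional

# Assumption: an ASIN is 10 uppercase alphanumeric characters that follows
# /dp/ or /gp/product/ in a product URL.
ASIN_PATTERN = re.compile(r"/(?:dp|gp/product)/([A-Z0-9]{10})(?:[/?#]|$)")


def extract_asin(url: str) -> Optional[str]:
    """Return the ASIN embedded in a product URL, or None if there is none."""
    match = ASIN_PATTERN.search(url)
    return match.group(1) if match else None


def deduplicate_by_asin(urls: Iterable[str]) -> Iterator[str]:
    """Yield URLs in order, skipping product URLs whose ASIN was already seen."""
    seen = set()
    for url in urls:
        asin = extract_asin(url)
        if asin is None:
            # Not a recognizable product URL (for example a search page): keep it.
            yield url
        elif asin not in seen:
            seen.add(asin)
            yield url
```

Keying on the ASIN rather than the full URL means that two links to the same product with different tracking suffixes count as one item.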
Input Configuration
Configure the scraper using these parameters:
    {
      "startUrls": [
        { "url": "https://www.amazon.com/s?k=gaming+laptop" }
      ],
      "maxItems": 5,
      "maxPagesPerQuery": 1,
      "useProxy": true,
      "scrapeMultipleImages": false
    }
Input Parameters
Parameter | Type | Description | Default |
---|---|---|---|
startUrls | Array | Amazon search or product URLs | Required |
maxItems | Integer | Maximum products to scrape | 5 |
maxPagesPerQuery | Integer | Maximum search pages to process | 1 |
useProxy | Boolean | Enable/disable proxy usage | false |
scrapeMultipleImages | Boolean | Extract all available product images | false |
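As a rough illustration, the input object can be assembled and checked before a run with a small helper like the one below. `build_input` is a hypothetical function, not part of the actor. The defaults are copied from the table, while the rule that integer limits must be at least 1 is an assumption.

```python
from typing import Any, Dict, List

# Defaults copied from the Input Parameters table.
DEFAULTS: Dict[str, Any] = {
    "maxItems": 5,
    "maxPagesPerQuery": 1,
    "useProxy": False,
    "scrapeMultipleImages": False,
}


def build_input(start_urls: List[str], **overrides: Any) -> Dict[str, Any]:
    """Build an input dict, filling in table defaults and checking types."""
    if not start_urls:
        raise ValueError("startUrls is required and must not be empty")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")

    config = {
        "startUrls": [{"url": url} for url in start_urls],
        **DEFAULTS,
        **overrides,
    }
    # Assumption: limits must be positive integers (bool is excluded explicitly
    # because it is a subclass of int in Python).
    for key in ("maxItems", "maxPagesPerQuery"):
        value = config[key]
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{key} must be a positive integer")
    for key in ("useProxy", "scrapeMultipleImages"):
        if not isinstance(config[key], bool):
            raise ValueError(f"{key} must be a boolean")
    return config
```

For example, `build_input(["https://www.amazon.com/s?k=gaming+laptop"], useProxy=True)` produces the same object as the sample configuration above.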
Output Format
The actor outputs detailed product information in JSON format. In the examples below both modes share the same fields; the scrapeMultipleImages setting controls how many URLs the images array contains:
Basic Output (scrapeMultipleImages: false)
    {
      "title": "Example Product",
      "url": "https://www.amazon.com/dp/B0XXXXXXXX",
      "asin": "B0XXXXXXXX",
      "price": 999.99,
      "currency": "$",
      "images": ["https://images-na.ssl-images-amazon.com/images/main.jpg"],
      "description": "Detailed product description...",
      "features": ["Feature 1", "Feature 2"],
      "rating": 4.5,
      "reviewCount": 1250,
      "availability": "In Stock",
      "timestamp": "2024-01-01T12:00:00.000Z"
    }
Advanced Output (scrapeMultipleImages: true)
    {
      "title": "Example Product",
      "url": "https://www.amazon.com/dp/B0XXXXXXXX",
      "asin": "B0XXXXXXXX",
      "price": 999.99,
      "currency": "$",
      "images": [
        "https://images-na.ssl-images-amazon.com/images/main.jpg",
        "https://images-na.ssl-images-amazon.com/images/angle1.jpg",
        "https://images-na.ssl-images-amazon.com/images/angle2.jpg"
      ],
      "description": "Detailed product description...",
      "features": ["Feature 1", "Feature 2", "Feature 3"],
      "rating": 4.5,
      "reviewCount": 1250,
      "availability": "In Stock",
      "timestamp": "2024-01-01T12:00:00.000Z"
    }
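For downstream use such as price tracking, a record in this shape could be loaded into a typed structure. The following Python sketch is one way to do that. The `Product` class and `parse_product` function are hypothetical names, and the sketch assumes that asin, title, and timestamp are always present while price, currency, and images may be missing.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, List, Optional


@dataclass
class Product:
    asin: str
    title: str
    price: Optional[float]
    currency: Optional[str]
    images: List[str]
    scraped_at: datetime


def parse_product(record: Dict[str, Any]) -> Product:
    """Convert one output record (a dict decoded from JSON) into a Product."""
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    timestamp = record["timestamp"].replace("Z", "+00:00")
    return Product(
        asin=record["asin"],
        title=record["title"],
        # Assumption: these fields can be absent when the page lacks them.
        price=record.get("price"),
        currency=record.get("currency"),
        images=list(record.get("images", [])),
        scraped_at=datetime.fromisoformat(timestamp),
    )
```

Because both output modes use the same keys, the same parser handles either one; only the length of `images` changes.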
Performance Optimization
The actor includes several features to ensure reliable scraping:
- Anti-Blocking Measures
  - Random delays between requests
  - Browser fingerprint randomization
  - Automatic retry mechanism
  - Smart handling of CAPTCHAs
- Resource Management
  - Concurrent scraping (2 parallel requests)
  - Configurable request timeouts
  - Memory-efficient processing
  - Automatic cleanup
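To illustrate how random delays and automatic retries can work together, here is a generic Python sketch of retrying with exponential backoff and jitter. It is not the actor's implementation; `retry_with_jitter`, its parameters, and the delay formula are assumptions chosen for the example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_jitter(
    task: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run task, retrying on any exception with a randomized, growing delay."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Delay doubles each attempt and is scaled by a random factor
            # between 0.5 and 1.5 so requests do not follow a fixed rhythm.
            delay = base_delay * 2 ** (attempt - 1) * random.uniform(0.5, 1.5)
            sleep(delay)
    raise RuntimeError("unreachable")  # Keeps type checkers satisfied.
```

The `sleep` parameter is injectable so the function can be exercised in tests without actually waiting.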
Usage Tips
- Start Small: Begin with a small maxItems value to test the setup
- Use Proxies: Enable useProxy for production use to avoid IP blocks
- Image Settings: Use scrapeMultipleImages: true only when you need all product images
- Pagination: Adjust maxPagesPerQuery based on your depth requirements
- Monitor Logs: Check the actor's logs for detailed progress information
Limitations
- Some product data might be unavailable depending on the page layout
- Price and availability may vary based on location/currency
- Access to some products might be restricted
- Rate limiting may apply based on Amazon's policies
Error Handling
The actor implements comprehensive error handling:
- Automatic retries for failed requests
- Detailed error logging
- Screenshot capture for debugging
- Graceful failure recovery
Support
For issues, feature requests, or custom development needs:
- Check the actor's documentation on Apify
- Create an issue in the GitHub repository
- Contact us through Apify's support channels
License
MIT License - feel free to use this actor in your projects!
Actor Metrics
Created in Feb 2025