Amazon Product Scraper
Pricing
$19.99/month + usage
The Amazon Product Scraper Apify actor extracts detailed product data from Amazon, including titles, prices, reviews, ratings, images, and ASINs. Ideal for eCommerce analytics, price monitoring, and competitor research, it delivers structured JSON or CSV outputs ready for automation workflows.
Last modified
16 hours ago
This Apify Actor extracts comprehensive product data from Amazon. It supports scraping product pages, search results, and individual products by ASIN, with automatic proxy fallback when requests are blocked.
Why Choose Us?
- Robust & Reliable: Built with production-grade error handling, retry logic, and proxy fallback mechanisms
- Flexible Input: Supports URLs, search keywords, ASINs, and short codes - process multiple inputs simultaneously
- Smart Proxy Management: Automatic fallback from direct connection to datacenter to residential proxies when blocked
- Live Data Saving: Results are saved immediately to the dataset as they're scraped, ensuring no data loss
- Comprehensive Data: Extracts product details including price, ratings, reviews, brand, images, breadcrumbs, and descriptions
- Sorting Options: Control how search results are sorted (by price, rating, reviews, date, etc.)
- Detailed Logging: Extensive logging keeps you informed throughout the entire scraping process
Key Features
- ✅ Bulk Input Support: Process multiple URLs, keywords, or ASINs in a single run
- ✅ Multiple Input Types: Accepts full URLs, search keywords, ASINs, or product short codes
- ✅ Smart Proxy Fallback:
  - Default: Direct connection (no proxy)
  - Fallback 1: Datacenter proxy if blocked
  - Fallback 2: Residential proxy with 3 retries if datacenter fails
  - Persistent state: Once fallback occurs, continues using the working proxy
- ✅ Search Result Sorting: Sort by featured, price (ascending/descending), average review, newest, or review count
- ✅ Pagination Support: Automatically navigates through multiple pages of search results
- ✅ Live Data Saving: Results saved to dataset immediately as they're processed
- ✅ Duplicate Prevention: Automatically filters out duplicate products based on ASIN
- ✅ Comprehensive Extraction: Extracts 10+ data fields per product
- ✅ Error Handling: Robust retry logic with exponential backoff
- ✅ Rate Limiting: Built-in delays to respect Amazon's servers
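The "retry logic with exponential backoff" item in the list above can be illustrated with a minimal Python sketch. The actor's real delays, retry counts, and jitter are not documented here, so `base_delay`, `factor`, `max_retries`, and the helper names `backoff_delays` and `retry_with_backoff` are assumptions chosen for illustration only.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(max_retries: int = 3, base_delay: float = 1.0,
                   factor: float = 2.0) -> list[float]:
    """Return the wait in seconds before each retry: base, base*factor, ..."""
    return [base_delay * factor ** attempt for attempt in range(max_retries)]


def retry_with_backoff(
    operation: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    jitter: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`; after each failure wait a growing delay, then retry.

    The last exception is re-raised once `max_retries` retries are used up.
    `sleep` is injectable so the behaviour can be checked without waiting.
    """
    delays = backoff_delays(max_retries, base_delay)
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_retries:
                raise
            # Optional random jitter spreads out retries from parallel workers.
            sleep(delays[attempt] + random.uniform(0, jitter))
    raise RuntimeError("unreachable")  # keeps type checkers satisfied
```

Doubling the delay on each failure gives a temporarily overloaded or rate-limiting server progressively more time to recover before the next attempt.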
Input
The actor accepts the following input configuration:
{"startUrls": [{"url": "https://www.amazon.com/s?k=keyboard"},{"url": "https://www.amazon.com/dp/B08N5WRWNW"},"keyboard","B014EUQOGK"],"sortOrder": "price-asc","maxItems": 100,"proxyConfiguration": {"useApifyProxy": false}}
Input Fields
| Field | Type | Required | Description |
|---|---|---|---|
| startUrls | Array | ✅ Yes | List of Amazon URLs, search keywords, or ASINs. Supported formats:<br>- Full URLs: https://www.amazon.com/s?k=keyboard<br>- Product URLs: https://www.amazon.com/dp/B08N5WRWNW<br>- Search keywords: "keyboard"<br>- ASINs: "B08N5WRWNW" |
| sortOrder | String | No | Sort order for search results. Options:<br>- "featured" (default): Amazon's default sorting<br>- "price-asc": Price: Low to High<br>- "price-desc": Price: High to Low<br>- "avg-review": Average Customer Review<br>- "newest": Newest Arrivals<br>- "review-count": Most Reviews |
| maxItems | Integer | No | Maximum number of products to scrape per input URL/keyword. Minimum: 10, Default: 50 |
| proxyConfiguration | Object | No | Proxy configuration. Default: no proxy (direct connection). Automatically falls back to datacenter, then residential proxies if blocked. |
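As a rough illustration of the table above, the following Python sketch assembles and validates an input object before it is submitted. The function name `build_input` is hypothetical and not part of the actor. The allowed `sortOrder` values, the default of 50, and the minimum of 10 come from the table; wrapping URLs as `{"url": ...}` objects while leaving keywords and ASINs as plain strings mirrors the sample input and is otherwise an assumption.

```python
from typing import Any

# Values copied from the Input Fields table.
SORT_ORDERS = {"featured", "price-asc", "price-desc",
               "avg-review", "newest", "review-count"}
MIN_ITEMS = 10
DEFAULT_MAX_ITEMS = 50


def build_input(start_urls: list[str], sort_order: str = "featured",
                max_items: int = DEFAULT_MAX_ITEMS,
                use_apify_proxy: bool = False) -> dict[str, Any]:
    """Return an input dict shaped like the sample configuration."""
    if not start_urls:
        raise ValueError("startUrls is required and must not be empty")
    if sort_order not in SORT_ORDERS:
        raise ValueError(f"sortOrder must be one of {sorted(SORT_ORDERS)}")
    if max_items < MIN_ITEMS:
        raise ValueError(f"maxItems must be at least {MIN_ITEMS}")
    # Assumption: URLs become {"url": ...} objects, other entries stay strings.
    entries = [
        {"url": s} if s.startswith(("http://", "https://")) else s
        for s in start_urls
    ]
    return {
        "startUrls": entries,
        "sortOrder": sort_order,
        "maxItems": max_items,
        "proxyConfiguration": {"useApifyProxy": use_apify_proxy},
    }
```

Calling `build_input(["keyboard", "B014EUQOGK"], sort_order="price-asc", max_items=100)` yields a dictionary that can be serialized with `json.dumps` and pasted into the input editor.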
Output
The actor outputs structured product data to the dataset. Each product contains the following fields:
{"asin": "B014EUQOGK","title": "Logitech K400 Plus Wireless Touch TV Keyboard With Easy Media Control and Built-in Touchpad, HTPC Keyboard for PC-connected TV, Windows, Android, ChromeOS, Laptop, Tablet - Black","url": "https://www.amazon.com/dp/B014EUQOGK","brand": "Logitech","stars": 4.4,"reviewsCount": 39,"thumbnailImage": "https://m.media-amazon.com/images/I/51yjnWJ5urL._AC_UY218_.jpg","breadCrumbs": "keyboard piano > keyboard wireless > mouse > wireless keyboard > gaming keyboard","description": "Rated 4+ stars. Purchased often. Returned infrequently","price": {"value": 24.99,"currency": "$"}}
Output Fields
| Field | Type | Description |
|---|---|---|
| asin | String | Amazon Standard Identification Number (unique product identifier) |
| title | String | Full product title |
| url | String | Direct URL to the product page |
| brand | String | Product brand name (may be null if not available) |
| stars | Number | Average star rating (1.0 - 5.0, may be null) |
| reviewsCount | Integer | Number of customer reviews (may be null) |
| thumbnailImage | String | URL to product thumbnail image |
| breadCrumbs | String | Category breadcrumbs separated by " > " (may be null) |
| description | String | Product description or key features (may be null) |
| price | Object | Price information:<br>- value: Price as number (may be null if unavailable)<br>- currency: Currency symbol (default: "$") |
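Because several of the fields above may be null, downstream code benefits from normalizing each record once. The sketch below is one possible way to do that in Python; the `Product` dataclass and `parse_product` helper are hypothetical names, and splitting `breadCrumbs` into a list is a convenience choice rather than something the actor does.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Product:
    asin: str
    title: str
    url: str
    brand: Optional[str]
    stars: Optional[float]
    reviews_count: Optional[int]
    thumbnail_image: Optional[str]
    breadcrumbs: list[str]
    description: Optional[str]
    price_value: Optional[float]
    currency: str


def parse_product(item: dict[str, Any]) -> Product:
    """Convert one dataset item into a Product, tolerating null fields."""
    price = item.get("price") or {}
    crumbs = item.get("breadCrumbs") or ""
    return Product(
        asin=item["asin"],
        title=item["title"],
        url=item["url"],
        brand=item.get("brand"),
        stars=item.get("stars"),
        reviews_count=item.get("reviewsCount"),
        thumbnail_image=item.get("thumbnailImage"),
        # Breadcrumbs arrive as one string separated by " > ".
        breadcrumbs=[c.strip() for c in crumbs.split(">") if c.strip()],
        description=item.get("description"),
        price_value=price.get("value"),
        # The table lists "$" as the default currency symbol.
        currency=price.get("currency") or "$",
    )
```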
🚀 How to Use the Actor (via Apify Console)
- Log in to Apify Console and navigate to Actors
- Find the amazon-product-scraper actor and click on it
- Configure inputs:
  - Add URLs, keywords, or ASINs in the startUrls field
  - Optionally set sortOrder and maxItems
  - Configure proxy settings if needed (default: no proxy)
- Run the actor by clicking the "Start" button
- Monitor progress in real-time through the logs
- Access results in the OUTPUT tab once the run completes
- Export results to JSON or CSV format
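If the results are downloaded as JSON and a custom CSV layout is needed, a small local conversion is enough. The following Python sketch is only an illustration: the `items_to_csv` helper, the column selection, and the `price_value` / `price_currency` column names are assumptions, not the format produced by the Console export.

```python
import csv
import io
from typing import Any, Iterable

# Hypothetical column layout; the nested price object is flattened.
COLUMNS = ["asin", "title", "url", "brand", "stars",
           "reviewsCount", "price_value", "price_currency"]


def items_to_csv(items: Iterable[dict[str, Any]]) -> str:
    """Render dataset items as CSV text with one row per product."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for item in items:
        price = item.get("price") or {}
        row = dict(item)
        row["price_value"] = price.get("value")
        row["price_currency"] = price.get("currency")
        # Fields not listed in COLUMNS are ignored; None becomes an empty cell.
        writer.writerow(row)
    return buffer.getvalue()
```

The returned string can be written to disk with `Path("products.csv").write_text(...)` or passed on to a spreadsheet import.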
Example Use Cases
Scrape a single product:
{"startUrls": [{"url": "https://www.amazon.com/dp/B08N5WRWNW"}],"maxItems": 1}
Search and scrape products:
{"startUrls": [{"url": "https://www.amazon.com/s?k=wireless+keyboard"}],"sortOrder": "price-asc","maxItems": 50}
Bulk processing:
{"startUrls": ["keyboard","mouse","B014EUQOGK","https://www.amazon.com/s?k=laptop"],"maxItems": 100}
Best Use Cases
- Price Monitoring: Track product prices over time by scraping regularly
- Market Research: Gather comprehensive product data for competitive analysis
- Product Catalog Building: Create product databases with detailed information
- Inventory Management: Monitor product availability and details
- SEO & Marketing: Extract product descriptions and metadata for content creation
- Data Analysis: Collect product data for statistical analysis and trends
- E-commerce Integration: Feed product data into your own platforms
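For the price-monitoring use case in the list above, two runs can be compared by ASIN. This Python sketch assumes each run's results are available as a list of dictionaries in the documented output format; `price_changes` is a hypothetical helper, and products with a null price in either run are skipped.

```python
from typing import Any, Iterable


def _price_value(item: dict[str, Any]):
    """Return price.value from an output record, or None if it is missing."""
    return (item.get("price") or {}).get("value")


def price_changes(previous: Iterable[dict[str, Any]],
                  current: Iterable[dict[str, Any]]) -> dict[str, tuple[float, float]]:
    """Map ASIN -> (old_price, new_price) for products whose price changed."""
    old_prices = {item["asin"]: _price_value(item) for item in previous}
    changes: dict[str, tuple[float, float]] = {}
    for item in current:
        old_value = old_prices.get(item["asin"])
        new_value = _price_value(item)
        # Skip new products and records without a usable price.
        if old_value is None or new_value is None:
            continue
        if new_value != old_value:
            changes[item["asin"]] = (old_value, new_value)
    return changes
```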
Frequently Asked Questions
Q: What types of inputs can I provide?
A: You can provide full Amazon URLs, search keywords (e.g., "keyboard"), ASINs (e.g., "B08N5WRWNW"), or product short codes. The actor automatically detects the input type.
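How the actor detects input types internally is not documented. As a hedged illustration, one simple approach in Python is shown below: `classify_input` is a hypothetical helper that relies on ASINs being 10-character alphanumeric identifiers, and it does not handle "product short codes" because their format is not specified here.

```python
import re
from urllib.parse import urlparse

# ASINs are 10 characters long, made of uppercase letters and digits.
ASIN_RE = re.compile(r"^[A-Z0-9]{10}$")


def classify_input(value: str) -> str:
    """Return "url", "asin", or "keyword" for one startUrls entry."""
    text = value.strip()
    parsed = urlparse(text)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    if ASIN_RE.match(text):
        return "asin"
    # Anything else is treated as a search keyword.
    return "keyword"
```

A heuristic like this would classify a 10-character all-uppercase search term as an ASIN, so such keywords are safer to pass as full search URLs.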
Q: How does the proxy fallback work?
A: By default, the actor uses no proxy. If Amazon blocks the request, it automatically falls back to a datacenter proxy. If that's also blocked, it switches to a residential proxy and retries up to 3 times. Once a fallback occurs, it continues using that proxy for all subsequent requests.
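The escalation described in this answer can be modelled as a small state machine. The Python sketch below is not the actor's code: `ProxyFallback` and the `fetch(url, tier)` callable (returning `True` on success and `False` when blocked) are hypothetical, and reading "retries up to 3 times" as three residential attempts in total is an assumption.

```python
from typing import Callable

TIERS = ["direct", "datacenter", "residential"]
RESIDENTIAL_ATTEMPTS = 3  # assumption: three attempts on the last tier


class ProxyFallback:
    """Escalate direct -> datacenter -> residential and remember the tier."""

    def __init__(self, fetch: Callable[[str, str], bool]) -> None:
        self.fetch = fetch      # hypothetical: fetch(url, tier) -> success?
        self.tier_index = 0     # persists across requests

    @property
    def tier(self) -> str:
        return TIERS[self.tier_index]

    def request(self, url: str) -> bool:
        """Try the current tier, escalating while blocked.

        Returns False only when the residential attempts are also exhausted.
        """
        while True:
            attempts = RESIDENTIAL_ATTEMPTS if self.tier == "residential" else 1
            for _ in range(attempts):
                if self.fetch(url, self.tier):
                    return True
            if self.tier_index == len(TIERS) - 1:
                return False
            # Blocked: move to the next tier and keep it for later requests.
            self.tier_index += 1
```

Because `tier_index` is never reset, a request made after a fallback starts directly on the tier that last worked, which matches the persistent behaviour described above.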
Q: Can I scrape multiple pages of search results?
A: Yes! The actor automatically handles pagination. Set maxItems to a higher number to scrape more products across multiple pages.
Q: What happens if a product doesn't have a price?
A: The price value field will be null, but the currency field will still contain the currency symbol (usually "$").
Q: How are duplicates handled?
A: The actor tracks all scraped ASINs and automatically filters out duplicates, ensuring each product appears only once in the output.
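The same idea is useful when merging the datasets of several runs locally. A minimal Python sketch, assuming each record carries the documented `asin` field; `unique_by_asin` is a hypothetical helper, and dropping records without an ASIN is a choice made here for simplicity.

```python
from typing import Any, Iterable, Iterator


def unique_by_asin(items: Iterable[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Yield the first record seen for each ASIN, preserving input order."""
    seen: set[str] = set()
    for item in items:
        asin = item.get("asin")
        if not asin or asin in seen:
            continue  # skip duplicates and records without an ASIN
        seen.add(asin)
        yield item
```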
Q: Can I sort search results?
A: Yes! Use the sortOrder field to sort by featured (default), price (ascending/descending), average review, newest, or review count.
Q: How long does scraping take?
A: The duration depends on the number of products, the number of result pages, and network conditions. The actor includes built-in rate limiting (a 1-second delay between pages) to respect Amazon's servers.
Q: What if Amazon blocks my requests?
A: The actor has intelligent proxy fallback mechanisms. If direct connection fails, it automatically switches to datacenter proxies, and if needed, residential proxies with retry logic.
Q: Can I scrape products from different Amazon marketplaces?
A: Yes, the actor supports any Amazon domain. Just provide the appropriate URLs (e.g., amazon.co.uk, amazon.de, etc.).
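Marketplace-specific URLs can be generated from ASINs or keywords before a run. The Python sketch below follows the `/dp/<ASIN>` and `/s?k=<keyword>` patterns used in the examples in this document; the helper names `product_url` and `search_url` are hypothetical, and whether a given ASIN exists on a given marketplace is not checked.

```python
from urllib.parse import urlencode


def product_url(asin: str, domain: str = "amazon.com") -> str:
    """Build a product page URL for the given marketplace domain."""
    return f"https://www.{domain}/dp/{asin}"


def search_url(keyword: str, domain: str = "amazon.com") -> str:
    """Build a search results URL; spaces in the keyword become '+'."""
    return f"https://www.{domain}/s?{urlencode({'k': keyword})}"


# Example: the same ASIN and keyword on two marketplaces.
start_urls = [
    {"url": product_url("B08N5WRWNW", "amazon.co.uk")},
    {"url": search_url("wireless keyboard", "amazon.de")},
]
```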
Q: Is the data saved in real-time?
A: Yes! Results are saved to the dataset immediately as they're scraped, so you won't lose data if the actor is interrupted.
Support and Feedback
For issues, questions, or feature requests, please contact support through the Apify platform or create an issue in the actor's repository.
Cautions
- Legal Compliance: This actor scrapes only publicly available data from Amazon. Ensure your use case complies with Amazon's Terms of Service and applicable laws in your jurisdiction.
- Rate Limiting: The actor includes built-in rate limiting, but excessive scraping may still result in temporary blocks. Use proxy configuration when scraping large volumes.
- Data Accuracy: Product information (prices, availability, etc.) may change frequently. Always verify critical data.
- Terms of Service: Users are responsible for ensuring their usage complies with Amazon's Terms of Service and relevant data protection regulations.
- No Private Data: This actor only accesses publicly available product pages and does not access any private or account-specific information.
Version: 0.1
Last Updated: 2025
