Amazon Product Search
2 hours trial then $29.00/month - No credit card required now
Amazon Product Search is an Apify Actor that scrapes product data from Amazon search results pages. It extracts key details such as product titles, prices, images, links, ratings, review counts, and whether the product is marked as sponsored.
🚀 Features
- ✅ Scrapes product titles, prices, images, links, ratings, review counts, and sponsored indicators.
- ✅ Constructs dynamic search URLs based on a provided search query.
- ✅ Employs Puppeteer with Stealth mode to help bypass bot detection.
- ✅ Utilizes Apify's request queue and proxy configuration to manage crawling and reduce the risk of blocking.
- ✅ Stores results in an Apify Dataset in JSON format for easy export and further processing.
📥 Input Parameters
The Actor accepts the following input parameters:
| Parameter | Type | Description | Default Value |
|---|---|---|---|
| searchQuery | string | Amazon search query (e.g., "heels"). | "heels" |
| products_max | number | (Optional) Maximum number of products to scrape (if implemented). | Infinity |
Note: The Actor dynamically constructs the search URL using the provided searchQuery parameter.
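As a rough illustration of how the defaults in the table above could be applied, here is a minimal sketch. The helper name `resolveInput` and its validation rules are assumptions made for this example, not the Actor's actual code.

```javascript
// Hypothetical helper (not part of the Actor): applies the documented defaults
// to whatever raw input object was provided.
function resolveInput(rawInput) {
    const input = rawInput ?? {};

    // Fall back to the documented default query when none is given.
    const searchQuery =
        typeof input.searchQuery === 'string' && input.searchQuery.trim() !== ''
            ? input.searchQuery.trim()
            : 'heels';

    // Treat a missing or invalid limit as "no limit" (Infinity), as in the table.
    const productsMax =
        Number.isFinite(input.products_max) && input.products_max > 0
            ? Math.floor(input.products_max)
            : Infinity;

    return { searchQuery, productsMax };
}

console.log(resolveInput({ searchQuery: 'boots', products_max: 20 }));
```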
📤 Output
The Actor outputs a structured JSON dataset containing product details. A sample output might look like:
    {
      "url": "https://www.amazon.com/s?k=heels",
      "products": [
        {
          "title": "Sponsored Ad - Example Product Title",
          "link": "https://www.amazon.com/dp/EXAMPLE",
          "image": "https://m.media-amazon.com/images/I/EXAMPLE.jpg",
          "price": "$49.99",
          "rating": "4.0 out of 5 stars",
          "reviews": "61",
          "sponsored": true
        },
        {
          "title": "Regular Product Title",
          "link": "https://www.amazon.com/dp/EXAMPLE2",
          "image": "https://m.media-amazon.com/images/I/EXAMPLE2.jpg",
          "price": "$39.99",
          "rating": "4.5 out of 5 stars",
          "reviews": "102",
          "sponsored": false
        }
      ]
    }
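Because price, rating, and reviews arrive as strings in the sample above, downstream code will usually want numeric values. The following is a minimal post-processing sketch. The function `normalizeProduct` and the added field names (`priceValue`, `ratingValue`, `reviewCount`) are hypothetical, and the parsing assumes the string formats shown in the sample.

```javascript
// Hypothetical post-processing helper: derives numeric fields from the
// string fields shown in the sample output. Not part of the Actor itself.
function normalizeProduct(product) {
    // "$49.99" or "$1,299.00" -> first number-like run in the string
    const priceMatch = /\d[\d,]*(?:\.\d+)?/.exec(product.price ?? '');
    // "4.0 out of 5 stars" -> leading number
    const ratingMatch = /^([\d.]+) out of 5/.exec(product.rating ?? '');
    // "61" or "1,234" -> digits only
    const reviewDigits = (product.reviews ?? '').replace(/[^\d]/g, '');

    return {
        ...product,
        priceValue: priceMatch ? Number(priceMatch[0].replace(/,/g, '')) : null,
        ratingValue: ratingMatch ? Number(ratingMatch[1]) : null,
        reviewCount: reviewDigits ? Number(reviewDigits) : null,
    };
}

console.log(normalizeProduct({
    title: 'Regular Product Title',
    price: '$39.99',
    rating: '4.5 out of 5 stars',
    reviews: '102',
    sponsored: false,
}));
```

Missing or unparseable fields become `null` rather than `NaN`, which tends to be easier to filter on later.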
🔍 How It Works
Input Handling
The Actor reads its input via Actor.getInput() and extracts the searchQuery parameter. It then constructs the corresponding Amazon search URL.
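The URL construction step could look roughly like the sketch below. The function name `buildSearchUrl` is an assumption for illustration. The `/s?k=` URL shape is taken from the sample output, and the query is encoded with the standard `URL` API.

```javascript
// Hypothetical helper: builds an Amazon search URL of the form shown in the
// sample output (https://www.amazon.com/s?k=<query>).
function buildSearchUrl(searchQuery, baseUrl = 'https://www.amazon.com') {
    const url = new URL('/s', baseUrl);
    // searchParams handles encoding of spaces and special characters.
    url.searchParams.set('k', searchQuery);
    return url.href;
}

console.log(buildSearchUrl('heels'));
console.log(buildSearchUrl('running shoes'));
```

Using `URL` and `searchParams` instead of string concatenation avoids malformed URLs when the query contains spaces, ampersands, or non-ASCII characters.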
Crawling & Data Extraction
Using Crawlee's PuppeteerCrawler with Puppeteer Stealth enabled, the Actor navigates to the Amazon search results page. It:
- Waits for search result elements (using the selector div.s-result-item) to load.
- Scrolls the page to trigger lazy loading.
- Extracts product details such as:
  - Title: from elements like h2.a-size-base-plus
  - Link: from relevant anchor tags (ensuring a complete URL)
  - Image URL: from img.s-image
  - Price: from span.a-offscreen
  - Rating and review count
  - Sponsored indicator: from any label denoting sponsorship
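The per-item extraction described above can be sketched as a small function that works on any element-like object exposing `querySelector`. This is an illustration only. The title, image, and price selectors come from the description, while the link selector (`a[href]`) and the text-based sponsored check are assumptions, since the text does not name the exact selectors. Rating and review count are left out for the same reason.

```javascript
const SELECTORS = {
    title: 'h2.a-size-base-plus', // named in the description
    image: 'img.s-image',         // named in the description
    price: 'span.a-offscreen',    // named in the description
    link: 'a[href]',              // ASSUMPTION: first anchor inside the result card
};

// Hypothetical helper: extracts one product from a result element
// (for example, one match of div.s-result-item).
function extractProduct(item, baseUrl = 'https://www.amazon.com') {
    const text = (selector) => item.querySelector(selector)?.textContent?.trim() ?? null;
    const attr = (selector, name) => item.querySelector(selector)?.getAttribute(name) ?? null;

    const href = attr(SELECTORS.link, 'href');
    return {
        title: text(SELECTORS.title),
        // Resolve relative hrefs such as "/dp/..." into complete URLs.
        link: href ? new URL(href, baseUrl).href : null,
        image: attr(SELECTORS.image, 'src'),
        price: text(SELECTORS.price),
        // ASSUMPTION: a card is sponsored if its text mentions "Sponsored".
        sponsored: /sponsored/i.test(item.textContent ?? ''),
    };
}
```

In a browser context, a function like this could be mapped over `document.querySelectorAll('div.s-result-item')`. Amazon's markup changes often, so any selector here may need adjusting.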
Data Storage
Extracted data is stored in Apify’s default dataset, making it available for further processing or export in JSON format.
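A record shaped like the sample output could be assembled as in the sketch below before being stored. The helper `buildDatasetRecord` is hypothetical, and applying the products_max limit at this point is an assumption, since the input table marks that parameter as "if implemented".

```javascript
// Hypothetical helper: assembles one dataset record in the shape of the
// sample output ({ url, products }), optionally capping the product count.
function buildDatasetRecord(url, products, productsMax = Infinity) {
    // slice(0, Infinity) returns a full copy, so the default means "no limit".
    return { url, products: products.slice(0, productsMax) };
}

const record = buildDatasetRecord(
    'https://www.amazon.com/s?k=heels',
    [{ title: 'A' }, { title: 'B' }, { title: 'C' }],
    2,
);
console.log(record.products.length);

// Inside an Actor run, a record like this would typically be stored with
// the Apify SDK's Actor.pushData(record).
```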
Proxy & Robustness
The Actor uses Apify’s proxy configuration (e.g., residential proxies) and an extended request timeout, which gives slow pages time to load and reduces the risk of being blocked.
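The settings involved might be grouped as in the following sketch. The option names (`groups`, `requestHandlerTimeoutSecs`, `navigationTimeoutSecs`, `maxRequestRetries`) are real Apify SDK and Crawlee options, but the helper `buildCrawlerSettings` and every value shown are assumptions, as the actual configuration is not documented here.

```javascript
// Hypothetical helper: collects proxy and timeout settings as plain objects.
// All values are illustrative, not the Actor's real configuration.
function buildCrawlerSettings({ timeoutSecs = 120, proxyGroups = ['RESIDENTIAL'] } = {}) {
    return {
        // Intended for Actor.createProxyConfiguration(proxyOptions).
        proxyOptions: { groups: proxyGroups },
        // Intended to be spread into the PuppeteerCrawler constructor options.
        crawlerOptions: {
            requestHandlerTimeoutSecs: timeoutSecs, // time allowed for the page handler
            navigationTimeoutSecs: timeoutSecs,     // time allowed for page navigation
            maxRequestRetries: 3,                   // retry blocked or failed requests
        },
    };
}

console.log(buildCrawlerSettings({ timeoutSecs: 180 }));
```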
⚠️ Legal Disclaimer
This project is intended for educational and research purposes only. Use of this Actor must comply with Amazon’s Terms of Service and robots.txt policies.
- Compliance: Ensure your scraping activities do not violate Amazon’s policies.
- Ethical Considerations: Avoid aggressive scraping practices that might harm Amazon's infrastructure.
- Intended Use: For commercial or production use, consider exploring Amazon’s official API solutions.
Created in Feb 2025