Amazon Data Extractor
Pricing
Pay per usage
All-in-one Amazon data extraction — search, products, reviews, stores, and Q&A. 10 domains, anti-bot stealth, auto-pagination, CAPTCHA retry, and residential proxy support. Ready to scale.
Developer

Rizvi Ahmed
5 days ago
Last modified
Amazon Data Extractor lets you extract product data from Amazon at scale — search results, product details, customer reviews, seller stores, and Q&A — helping you monitor prices, analyze competitors, and gather market intelligence with just a few clicks.
What is Amazon Data Extractor?
Amazon Data Extractor automates the process of browsing Amazon and extracting structured data from product pages, search results, reviews, seller storefronts, and customer Q&A sections. It handles pagination, CAPTCHA retries, and anti-bot detection so you can focus on the data.
Five scraping modes cover the main Amazon data types:
- SEARCH — Search Amazon by keyword, get product listings with prices, ratings, and images
- PRODUCT — Get full product details: specs, features, images, variations, brand info, and embedded Q&A
- REVIEWS — Extract customer reviews with ratings, dates, verified purchase status, and helpful votes
- STORE — Browse an entire seller's storefront and list all their products
- QNA — Scrape customer questions and answers from product pages
Use cases
- Price monitoring: Track prices across thousands of products. Set up scheduled runs to detect price drops, compare competitor pricing, or monitor MAP (Minimum Advertised Price) compliance.
- Market research: Analyze search results for any keyword to understand product landscape, price ranges, ratings distribution, and market saturation.
- Competitor analysis: Scrape a competitor's entire seller store to see their full product catalog, pricing strategy, and customer reception.
- Review analysis & sentiment: Extract the customer reviews shown on product pages across many products to analyze sentiment, identify common complaints, spot product quality trends, and inform product development.
- Product data enrichment: Pull detailed specs, features, images, and variations for product catalogs, comparison engines, or e-commerce databases.
- Lead generation: Find sellers and brands in specific categories by scraping search results and store listings.
- Q&A mining: Extract customer questions and answers to understand buyer concerns, improve product listings, or build FAQ content.
- Multi-marketplace tracking: Supports 10 Amazon domains (US, UK, DE, FR, IT, ES, CA, JP, IN, AU) for international market coverage.
What data does Amazon Data Extractor extract?
Search mode
- Product title, ASIN, and URL
- Current price
- Average rating and review count
- Product image
- Pagination support (auto or fixed page count)
Product mode
- Full product title and brand
- Current price and list price (strikethrough price)
- Rating and review count
- Availability status
- Product description and feature bullet points
- All product images (high resolution)
- Technical specifications table
- Product variations (size, color, etc.) with ASINs
- Embedded Q&A from the product page
Reviews mode
- Reviewer name
- Star rating (1-5)
- Review title and body text
- Review date
- Verified purchase badge
- Helpful vote count
Store mode
- All products from a seller's storefront
- Store/seller name
- Product title, ASIN, price, rating, review count, and image for each listing
- Full pagination support
QNA mode
- Customer questions and answers
- Who answered (seller vs. customer)
- Vote count
- Total question count for the product
Input
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| mode | string | Yes | SEARCH | Scraping mode: SEARCH, PRODUCT, REVIEWS, STORE, or QNA |
| query | string | SEARCH mode | - | Search keyword(s) to look up on Amazon |
| asins | string[] | PRODUCT / REVIEWS / QNA | - | List of Amazon ASINs to scrape |
| sellerId | string | STORE mode | - | Amazon Seller ID for store scraping |
| maxPages | integer | No | 1 | Maximum number of pages to scrape. Set to 0 for all available pages |
| reviewSort | string | No | helpful | Review sort order: helpful or recent |
| amazonDomain | string | No | https://www.amazon.com | Amazon domain to scrape (10 domains supported) |
| maxConcurrency | integer | No | 3 | Number of parallel browser pages (1–10) |
| proxyConfig | object | No | Auto-detect | Proxy configuration. Leave empty for automatic proxy selection |
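The required-field rules in the table can be checked locally before a run is started. The following is a minimal sketch, not part of the actor: `validate_input` and `VALID_MODES` are hypothetical names, and the rules are read directly from the table above.

```python
# Hypothetical pre-flight check; the rules mirror the input table above.
VALID_MODES = {"SEARCH", "PRODUCT", "REVIEWS", "STORE", "QNA"}


def validate_input(run_input: dict) -> list:
    """Return a list of problems found in run_input (empty list means OK)."""
    problems = []
    mode = run_input.get("mode", "SEARCH")  # SEARCH is the documented default
    if mode not in VALID_MODES:
        problems.append(f"unknown mode: {mode!r}")
        return problems
    if mode == "SEARCH" and not run_input.get("query"):
        problems.append("SEARCH mode needs a non-empty 'query'")
    if mode in {"PRODUCT", "REVIEWS", "QNA"} and not run_input.get("asins"):
        problems.append(f"{mode} mode needs a non-empty 'asins' list")
    if mode == "STORE" and not run_input.get("sellerId"):
        problems.append("STORE mode needs 'sellerId'")
    max_pages = run_input.get("maxPages", 1)
    if not isinstance(max_pages, int) or max_pages < 0:
        problems.append("'maxPages' must be an integer >= 0 (0 means all pages)")
    concurrency = run_input.get("maxConcurrency", 3)
    if not isinstance(concurrency, int) or not 1 <= concurrency <= 10:
        problems.append("'maxConcurrency' must be an integer from 1 to 10")
    if run_input.get("reviewSort", "helpful") not in {"helpful", "recent"}:
        problems.append("'reviewSort' must be 'helpful' or 'recent'")
    return problems
```

Running this before submitting a large batch can catch a missing `asins` list or an out-of-range `maxConcurrency` without spending compute on a failed run.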
Supported Amazon domains
| Domain | URL |
|---|---|
| United States | https://www.amazon.com |
| United Kingdom | https://www.amazon.co.uk |
| Germany | https://www.amazon.de |
| France | https://www.amazon.fr |
| Italy | https://www.amazon.it |
| Spain | https://www.amazon.es |
| Canada | https://www.amazon.ca |
| Japan | https://www.amazon.co.jp |
| India | https://www.amazon.in |
| Australia | https://www.amazon.com.au |
Example inputs
Search for products
{
  "mode": "SEARCH",
  "query": "wireless headphones",
  "maxPages": 3,
  "amazonDomain": "https://www.amazon.com"
}
Get product details for multiple ASINs
{
  "mode": "PRODUCT",
  "asins": ["B09V3KXJPB", "B0CHWRXH8B", "B0D1XD1ZV3"]
}
Scrape reviews for a product
{
  "mode": "REVIEWS",
  "asins": ["B09V3KXJPB"],
  "reviewSort": "recent"
}
Scrape an entire seller store
{
  "mode": "STORE",
  "sellerId": "A2L77EE7U53NZ2",
  "maxPages": 0
}
Get Q&A for a product
{
  "mode": "QNA",
  "asins": ["B09V3KXJPB"]
}
Output
Results are saved to the default Apify dataset. You can download them as JSON, CSV, Excel, XML, or HTML from the Storage tab.
Search results output
{
  "mode": "SEARCH",
  "scrapedAt": "2026-03-01T12:00:00.000Z",
  "query": "wireless headphones",
  "page": 1,
  "asin": "B09V3KXJPB",
  "title": "Sony WH-1000XM5 Wireless Noise Canceling Headphones",
  "price": 278.00,
  "rating": 4.6,
  "reviewCount": 12453,
  "image": "https://m.media-amazon.com/images/I/51aX234NPOL._AC_SL1500_.jpg",
  "url": "https://www.amazon.com/dp/B09V3KXJPB"
}
Product details output
{
  "mode": "PRODUCT",
  "scrapedAt": "2026-03-01T12:00:00.000Z",
  "asin": "B09V3KXJPB",
  "title": "Sony WH-1000XM5 Wireless Industry Leading Noise Canceling Headphones",
  "brand": "Sony",
  "price": 278.00,
  "listPrice": 399.99,
  "currency": "USD",
  "rating": 4.6,
  "reviewCount": 12453,
  "availability": "In Stock",
  "description": "The WH-1000XM5 headphones rewrite the rules...",
  "features": [
    "Industry Leading Noise Cancelation",
    "Exceptional Sound Quality",
    "Crystal clear hands-free calling",
    "Up to 30 hours battery life"
  ],
  "images": [
    "https://m.media-amazon.com/images/I/51aX234NPOL._AC_SL1500_.jpg"
  ],
  "specs": {
    "Brand": "Sony",
    "Color": "Black",
    "Connectivity": "Bluetooth 5.2",
    "Weight": "250 Grams"
  },
  "variations": {
    "parentAsin": "B09V3KXJPB",
    "dimensions": {
      "Color": ["Black", "Silver", "Midnight Blue"]
    },
    "variants": [
      { "asin": "B09V3KXJPB", "Color": "Black" },
      { "asin": "B09V3KY4PB", "Color": "Silver" }
    ]
  },
  "questionsAndAnswers": [
    {
      "question": "Does this work with iPhone?",
      "answer": "Yes, it works with any Bluetooth device.",
      "answeredBy": "seller"
    }
  ],
  "url": "https://www.amazon.com/dp/B09V3KXJPB"
}
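For price monitoring, the `price` and `listPrice` fields of a PRODUCT item are enough to compute a discount. This is a small sketch under the assumption that both fields are plain numbers as in the sample above; `discount_percent` is a hypothetical helper, not something the actor provides.

```python
from typing import Optional


def discount_percent(item: dict) -> Optional[float]:
    """Percentage off the list price, or None when either price is missing."""
    price = item.get("price")
    list_price = item.get("listPrice")
    if not price or not list_price or list_price <= 0:
        return None
    return round((list_price - price) / list_price * 100, 1)


# Values taken from the sample PRODUCT item above.
sample = {"asin": "B09V3KXJPB", "price": 278.00, "listPrice": 399.99}
print(sample["asin"], discount_percent(sample))
```

Returning `None` instead of `0` for items without a list price keeps "no discount" and "unknown" distinguishable in downstream reports.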
Reviews output
{
  "mode": "REVIEWS",
  "scrapedAt": "2026-03-01T12:00:00.000Z",
  "asin": "B09V3KXJPB",
  "reviewerName": "John D.",
  "rating": 5,
  "title": "Best noise canceling headphones I've owned",
  "body": "The sound quality is incredible and the noise canceling is a huge step up from the XM4...",
  "date": "2026-01-15",
  "verified": true,
  "helpfulCount": 42
}
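As a starting point for review analysis, REVIEWS items can be summarized with a few lines of standard-library Python. The sketch below assumes items shaped like the sample above (`rating` as a number, `verified` as a boolean); `summarize_reviews` is a hypothetical name.

```python
from collections import Counter
from statistics import mean


def summarize_reviews(reviews: list) -> dict:
    """Aggregate rating statistics over a list of REVIEWS dataset items."""
    rated = [r for r in reviews if isinstance(r.get("rating"), (int, float))]
    if not rated:
        return {"count": 0}
    ratings = [r["rating"] for r in rated]
    verified = sum(1 for r in rated if r.get("verified"))
    return {
        "count": len(rated),
        "averageRating": round(mean(ratings), 2),
        "verifiedShare": round(verified / len(rated), 2),
        "ratingHistogram": dict(Counter(ratings)),
    }
```

Items without a numeric `rating` are skipped rather than counted as zero, so a partially parsed review does not drag the average down.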
Store output
{
  "mode": "STORE",
  "scrapedAt": "2026-03-01T12:00:00.000Z",
  "sellerId": "A2L77EE7U53NZ2",
  "storeName": "TechStore Official",
  "page": 1,
  "asin": "B0CHWRXH8B",
  "title": "Wireless Earbuds with Charging Case",
  "price": 39.99,
  "rating": 4.3,
  "reviewCount": 2841,
  "image": "https://m.media-amazon.com/images/I/example.jpg",
  "url": "https://www.amazon.com/dp/B0CHWRXH8B"
}
Q&A output
{
  "mode": "QNA",
  "scrapedAt": "2026-03-01T12:00:00.000Z",
  "asin": "B09V3KXJPB",
  "productTitle": "Sony WH-1000XM5 Wireless Headphones",
  "totalQuestions": 287,
  "extractedCount": 4,
  "question": "Can I use this wired?",
  "answer": "Yes, a 3.5mm audio cable is included in the box.",
  "answeredBy": "customer",
  "votes": 15
}
Proxy recommendation
Amazon aggressively blocks automated access. For reliable results, residential proxies are strongly recommended.
The scraper auto-detects the best available proxy on your Apify account in this order:
- RESIDENTIAL (best success rate, recommended)
- BUYPROXIES94952 (datacenter, good for low-volume)
- Default Apify Proxy (may work for small runs)
To manually configure residential proxies:
{
  "proxyConfig": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
Tip: If you're scraping non-US Amazon domains (e.g. amazon.de, amazon.co.jp), residential proxies from the target country will give the best results. Apify's residential pool automatically routes through the appropriate country.
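The preference order described above can be reproduced on the client side when you want to set `proxyConfig` explicitly. This is only a sketch of that ordering: the actor's real detection logic is not shown in this document, and `build_proxy_config` and its `available_groups` argument are hypothetical.

```python
# Preference order as listed above; the last fallback is the default Apify Proxy.
PREFERRED_GROUPS = ["RESIDENTIAL", "BUYPROXIES94952"]


def build_proxy_config(available_groups) -> dict:
    """Pick the first preferred proxy group that the account has access to."""
    for group in PREFERRED_GROUPS:
        if group in available_groups:
            return {"useApifyProxy": True, "apifyProxyGroups": [group]}
    # No preferred group available: use the default Apify Proxy (no explicit group).
    return {"useApifyProxy": True}
```

The returned dictionary has the same shape as the manual `proxyConfig` example above, so it can be placed directly into the run input.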
Cost estimation
Amazon Data Extractor uses Puppeteer (headless Chrome), so it requires more compute resources than simple HTTP scrapers. Below are approximate costs on the Apify platform:
| Scenario | Est. compute units | Est. cost (USD) |
|---|---|---|
| Search 1 keyword, 1 page (~20 products) | ~0.05 CU | ~$0.025 |
| Search 1 keyword, 10 pages (~200 products) | ~0.5 CU | ~$0.25 |
| 10 product detail pages | ~0.5 CU | ~$0.25 |
| 50 product detail pages | ~2.5 CU | ~$1.25 |
| Reviews for 1 product | ~0.05 CU | ~$0.025 |
| Full seller store (50 pages) | ~2.5 CU | ~$1.25 |
| Q&A for 10 products | ~0.5 CU | ~$0.25 |
Costs are approximate and depend on proxy type, page complexity, retry count, and concurrency settings. Residential proxies are billed separately per GB of traffic. Check Apify pricing for current rates.
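The rows in the table work out to roughly 0.05 CU per page load and about $0.50 per CU (for example, $0.025 / 0.05 CU). A rough estimator built on those two figures might look like the sketch below. Both constants are inferred from the table, not published rates, and proxy traffic is not included.

```python
CU_PER_PAGE = 0.05  # assumption: implied by the table (1 page is about 0.05 CU)
USD_PER_CU = 0.50   # assumption: implied by the table ($0.025 / 0.05 CU)


def estimate_cost(page_loads: int) -> tuple:
    """Return (compute units, USD) for a number of page loads, excluding proxy traffic."""
    compute_units = page_loads * CU_PER_PAGE
    return round(compute_units, 2), round(compute_units * USD_PER_CU, 3)


# 50 product detail pages, matching the fourth table row.
print(estimate_cost(50))
```

Retries and heavier pages push real usage above this figure, so treat the result as a lower bound.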
Tips to reduce cost
- Use lower concurrency (maxConcurrency: 1-2) if speed isn't critical; this reduces memory usage
- Set a specific maxPages instead of 0 (all pages) to control run duration
- Use datacenter proxies for smaller runs where occasional CAPTCHAs are acceptable
- Batch multiple ASINs in a single run instead of running separate actors per ASIN
Tips and tricks
Getting the best results
- Start small: Test with 1 page or 1 ASIN first before scaling up to verify the data format meets your needs.
- Use residential proxies: Amazon's anti-bot system is aggressive. Residential IPs have significantly higher success rates than datacenter IPs.
- Avoid peak hours: Running scrapes during off-peak hours (late night US time) can reduce CAPTCHA frequency.
- Set appropriate concurrency: Higher concurrency (5-10) is faster but more likely to trigger blocks. Start with 3 and adjust.
Finding Amazon ASINs
ASINs (Amazon Standard Identification Numbers) are 10-character alphanumeric codes. You can find them:
- In the product URL: amazon.com/dp/**B09V3KXJPB**
- In the "Product Information" section of any product page
- By running a SEARCH mode scrape first, then using the ASINs from the results
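When you already have a list of product URLs, the ASIN can be pulled out of the `/dp/` path segment programmatically. The sketch below handles only the `/dp/` URL form shown above; `extract_asin` is a hypothetical helper.

```python
import re
from typing import Optional

# Ten uppercase letters or digits directly after "/dp/", followed by a
# path/query/fragment separator or the end of the URL.
ASIN_RE = re.compile(r"/dp/([A-Z0-9]{10})(?:[/?#]|$)")


def extract_asin(url: str) -> Optional[str]:
    """Return the ASIN embedded in an Amazon product URL, or None if absent."""
    match = ASIN_RE.search(url)
    return match.group(1) if match else None
```

The extracted values can be passed straight into the `asins` input field for PRODUCT, REVIEWS, or QNA mode.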
Finding Seller IDs
Seller IDs are needed for STORE mode. To find a seller's ID:
- Go to any product sold by the seller
- Click the seller name under "Sold by"
- The seller ID is in the URL: amazon.com/s?me=**A2L77EE7U53NZ2**
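The same lookup can be scripted by reading the `me` query parameter from a storefront URL. A minimal sketch using only the standard library follows; `extract_seller_id` is a hypothetical helper and covers only the `?me=` URL form shown above.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_seller_id(url: str) -> Optional[str]:
    """Return the value of the 'me' query parameter, or None if it is missing."""
    params = parse_qs(urlparse(url).query)
    values = params.get("me")
    return values[0] if values else None
```

The result is the value expected by the `sellerId` input field in STORE mode.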
Integrations
Amazon Data Extractor works with the full Apify ecosystem:
- API access: Run the scraper programmatically via the Apify API from any language
- Scheduled runs: Set up automatic recurring scrapes (hourly, daily, weekly) to keep your data fresh
- Webhooks: Get notified when a run completes and trigger downstream workflows
- Integrations: Connect with Google Sheets, Slack, Zapier, Make, or any webhook-compatible service
- Dataset export: Download results as JSON, CSV, Excel, XML, or HTML
Using the API
# Start a run
curl -X POST "https://api.apify.com/v2/acts/YOUR_ACTOR_ID/runs" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"mode": "SEARCH", "query": "laptop stand", "maxPages": 2}'

# Python example
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("YOUR_ACTOR_ID").call(run_input={
    "mode": "PRODUCT",
    "asins": ["B09V3KXJPB", "B0CHWRXH8B"],
})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["title"], item["price"])

// JavaScript example
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });
const run = await client.actor('YOUR_ACTOR_ID').call({
    mode: 'SEARCH',
    query: 'wireless earbuds',
    maxPages: 5,
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);
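A common pattern is to run SEARCH first and feed the resulting ASINs into a PRODUCT run. The sketch below chains the two using the same client calls as the Python example (`actor(...).call(run_input=...)` and `dataset(...).iterate_items()`). `search_then_details` is a hypothetical helper, and the client is passed in as an argument so the function can also be exercised with a stub.

```python
def search_then_details(client, actor_id: str, query: str, max_pages: int = 1) -> list:
    """Run a SEARCH, collect unique ASINs, then fetch PRODUCT details for them."""
    search_run = client.actor(actor_id).call(
        run_input={"mode": "SEARCH", "query": query, "maxPages": max_pages}
    )
    asins = []
    for item in client.dataset(search_run["defaultDatasetId"]).iterate_items():
        asin = item.get("asin")
        if asin and asin not in asins:  # keep first-seen order, drop duplicates
            asins.append(asin)
    if not asins:
        return []  # nothing found, so skip the second run entirely
    product_run = client.actor(actor_id).call(
        run_input={"mode": "PRODUCT", "asins": asins}
    )
    return list(client.dataset(product_run["defaultDatasetId"]).iterate_items())
```

With the real client this would be called as `search_then_details(ApifyClient("YOUR_API_TOKEN"), "YOUR_ACTOR_ID", "wireless headphones")`. Batching all ASINs into one PRODUCT run follows the cost tip about avoiding a separate run per ASIN.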
FAQ
How does Amazon Data Extractor handle CAPTCHAs?
The scraper automatically detects Amazon CAPTCHA pages. When a CAPTCHA is encountered, the request is marked as failed and automatically retried (up to 3 times) with a different proxy IP. Using residential proxies dramatically reduces CAPTCHA frequency.
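The retry behaviour described above can be pictured with the following sketch. It is an illustration in Python, not the actor's code (the actor runs on Node.js with Puppeteer). `CaptchaError`, `fetch_with_retries`, and the `fetch(url, proxy)` callable are all hypothetical, and "up to 3 retries" is read here as one initial attempt plus three retries.

```python
MAX_RETRIES = 3  # documented retry limit


class CaptchaError(Exception):
    """Raised by the (hypothetical) fetch callable when a CAPTCHA page is returned."""


def fetch_with_retries(url, fetch, proxies, max_retries=MAX_RETRIES):
    """Call fetch(url, proxy), switching to the next proxy after each CAPTCHA."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    last_error = None
    # One initial attempt plus up to max_retries retries.
    for attempt in range(max_retries + 1):
        proxy = proxies[attempt % len(proxies)]  # rotate to a different proxy
        try:
            return fetch(url, proxy)
        except CaptchaError as err:
            last_error = err
    raise last_error
```

If every attempt hits a CAPTCHA, the last error is re-raised, which corresponds to the "empty result" case described in the FAQ entry on empty results.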
How many products can I scrape per run?
There's no hard limit. A SEARCH run with maxPages: 0 will paginate through all available search result pages (Amazon typically shows up to ~400 pages per query). For PRODUCT mode, you can pass hundreds of ASINs in a single run.
Can I scrape Amazon in different languages?
The scraper returns data in whatever language Amazon displays for the selected domain. For example, amazon.de returns German text, amazon.co.jp returns Japanese. The scraper itself handles all domains identically.
Why am I getting empty results?
Common causes:
- Wrong proxy: Datacenter proxies get blocked more often. Switch to residential proxies.
- Invalid ASIN/Seller ID: Double-check that ASINs are valid 10-character codes and seller IDs are correct.
- CAPTCHA blocks: If all retries fail due to CAPTCHAs, the result will be empty. Try with residential proxies or lower concurrency.
- Geo-restriction: Some products or stores are only available in certain regions. Make sure the Amazon domain matches your target market.
What's the difference between PRODUCT and SEARCH mode?
- SEARCH returns a list of products for a keyword (like browsing Amazon search results). Data is limited to what's visible in the search listing.
- PRODUCT visits each product's full detail page, extracting much richer data including descriptions, specs, variations, images, and embedded Q&A.
Does the scraper extract all reviews for a product?
No. Reviews are scraped from the product detail page, which shows a limited number of top reviews. Amazon now requires login for the dedicated review pages (/product-reviews/), so extraction is limited to the reviews visible on the main product page (typically 8-10 reviews per product).
Can I run this on a schedule?
Yes. In the Apify Console, go to your actor's Schedules tab to set up recurring runs (e.g., daily price monitoring). You can also create schedules via the API.
Technical details
- Runtime: Node.js 20 with Puppeteer (headless Chrome)
- Anti-detection: puppeteer-extra-plugin-stealth, rotating user agents, and request interception
- Resource optimization: Request interception blocks unnecessary resources (images, stylesheets, fonts, media) to reduce bandwidth and speed up page loads
- Retry logic: Failed requests are automatically retried up to 3 times with different proxy IPs
- US delivery address: For STORE mode, the scraper automatically sets a US delivery address (ZIP 10001) to avoid geo-filtered empty results