Amazon Scraper
Pricing
$19.00/month + usage
Amazon Search Scraper: collect real-time product data from Amazon by entering keywords or product URLs. Get title, price, ratings, stock info, and images in a clean, structured format. Suited to price tracking, market research, and competitor analysis.
Developer: Neuro Scraper
Amazon Search Keywords and Products Scraper
Extract structured product data from Amazon search results or directly from product URLs, including pricing, ratings, availability, and product metadata.
Summary
This Apify Actor can extract data from Amazon in two ways:
- By providing search keywords (it collects all products listed in search results).
- By providing product URLs (it fetches details directly from each page).
Structured data is stored in the default Dataset.
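The two input modes can be pictured with a small sketch. It is not taken from the Actor's source: the search URL pattern (`/s?k=...`), the `/dp/<ASIN>` pattern, and the helper names `search_url` and `extract_asin` are assumptions used only for illustration, and `amazon.com` is assumed as the only domain.

```python
import re
from typing import Optional
from urllib.parse import urlencode, urlparse

# Assumption: amazon.com is the only supported domain.
BASE_URL = "https://www.amazon.com"

# An ASIN is assumed to be the 10-character ID that follows /dp/ or /gp/product/.
ASIN_RE = re.compile(r"/(?:dp|gp/product)/([A-Z0-9]{10})(?:/|$)")


def search_url(keyword: str) -> str:
    """Build a search-results URL for one keyword (hypothetical helper)."""
    return f"{BASE_URL}/s?{urlencode({'k': keyword})}"


def extract_asin(product_url: str) -> Optional[str]:
    """Return the ASIN found in a product URL path, or None if absent."""
    match = ASIN_RE.search(urlparse(product_url).path)
    return match.group(1) if match else None


print(search_url("wireless earbuds"))
print(extract_asin("https://www.amazon.com/dp/B0D1234XYZ"))
```

Keyword mode would start from a URL like the first one and follow the listed products; URL mode would go straight to the product page identified by the ASIN.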
Use cases
- E-commerce price monitoring and comparison
- Market trend and keyword research
- Product catalog enrichment for Amazon listings
- Competitor intelligence automation
Quick Start (Apify Console)
- Open this Actor in the Apify Console.
- Click Run → Input tab.
- Paste JSON input such as:
{"queries": ["wireless earbuds", "gaming mouse"], "urls": ["https://www.amazon.com/dp/B0D1234XYZ"], "concurrency": 8}
- Optionally configure a proxy (see Proxy Configuration below).
- Click Run; the data will appear in the default Dataset.
Quick Start (CLI + API)
CLI

    apify call <ACTOR_ID> --input-file input.json

Where input.json contains:

    {"queries": ["laptop stand"], "urls": ["https://www.amazon.com/dp/B0D1234XYZ"], "concurrency": 5}

API (Python)

    from apify_client import ApifyClient

    # Authenticate with your Apify API token.
    client = ApifyClient('<APIFY_TOKEN>')

    # Start the Actor and wait for the run to finish.
    run = client.actor('username~amazon-search-scraper').call(run_input={
        'queries': ['mechanical keyboard'],
        'urls': ['https://www.amazon.com/dp/B0D5678ABC'],
        'concurrency': 8,
    })

    # The results are stored in this dataset.
    print(run['defaultDatasetId'])
Inputs
| Name | Type | Required | Default | Example | Notes |
|---|---|---|---|---|---|
| queries | array / string | No | null | ["wireless earbuds"] | Amazon search keywords to collect listings for |
| urls | array / string | No | null | ["https://www.amazon.com/dp/B0D1234XYZ"] | Direct product page URLs |
| concurrency | integer | No | 8 | 5 | Max concurrent product fetch tasks |
| proxyConfig | object | No | { "useApifyProxy": true } | { "useApifyProxy": true } | Proxy configuration (see Proxy Configuration below) |
Example: paste into the Console input editor:
{"urls": ["https://www.amazon.com/dp/B0D5678ABC"], "concurrency": 4}
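Since `queries` and `urls` each accept either an array or a single string, client code may want to normalize input before sending it. The following is a minimal sketch of that idea, not the Actor's actual validation: the function names are hypothetical, and the rule that at least one of the two fields must be present is an assumption based on the run instructions.

```python
from typing import Any, List

DEFAULT_CONCURRENCY = 8  # default listed in the Inputs table


def as_list(value: Any) -> List[str]:
    """Accept None, a single string, or a list of strings; drop blanks."""
    if value is None:
        return []
    if isinstance(value, str):
        value = [value]
    return [item.strip() for item in value if isinstance(item, str) and item.strip()]


def normalize_input(raw: dict) -> dict:
    """Return input with list-valued fields and a usable concurrency."""
    queries = as_list(raw.get("queries"))
    urls = as_list(raw.get("urls"))
    if not queries and not urls:
        # Assumption: a run with neither field has nothing to scrape.
        raise ValueError("Provide at least one of 'queries' or 'urls'.")
    concurrency = int(raw.get("concurrency") or DEFAULT_CONCURRENCY)
    if concurrency < 1:
        raise ValueError("'concurrency' must be a positive integer.")
    return {"queries": queries, "urls": urls, "concurrency": concurrency}


print(normalize_input({"queries": "laptop stand"}))
```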
Configuration
| Name | Type | Required | Default | Example | Notes |
|---|---|---|---|---|---|
| OUTPUT_FILE | string | No | amazon.search.result.json | output.json | Internal output file for backup |
| REQUEST_TIMEOUT | integer | No | 30 | 45 | Timeout in seconds per request |
| APIFY_TOKEN | string | Yes | none | <APIFY_TOKEN> | Required for Apify client/API use |
Outputs
Results are stored in the default Dataset.
Example Output Item
    {
      "asin": "B0D1234XYZ",
      "title": "Wireless Earbuds with Noise Cancellation",
      "url": "https://www.amazon.com/dp/B0D1234XYZ",
      "price": "$49.99",
      "currency": "$",
      "brand_name": "SoundMagic",
      "availability": "In Stock",
      "stars": 4.5,
      "number_of_reviews_text": "1,234 ratings",
      "categories": "Electronics > Audio > Headphones",
      "images": ["https://m.media-amazon.com/images/I/xyz.jpg"]
    }
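Because `price` and `number_of_reviews_text` arrive as display strings, downstream price tracking usually needs numeric values. A minimal post-processing sketch follows; the helper names are hypothetical, and US-style number formatting (comma as thousands separator, dot as decimal point) is assumed.

```python
import re
from decimal import Decimal
from typing import Optional


def parse_price(price_text: Optional[str]) -> Optional[Decimal]:
    """Turn a display price such as '$49.99' into a Decimal, or None."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", price_text or "")
    return Decimal(match.group(0).replace(",", "")) if match else None


def parse_review_count(text: Optional[str]) -> Optional[int]:
    """Turn a string such as '1,234 ratings' into an int, or None."""
    match = re.search(r"\d[\d,]*", text or "")
    return int(match.group(0).replace(",", "")) if match else None


item = {"price": "$49.99", "number_of_reviews_text": "1,234 ratings"}
print(parse_price(item["price"]), parse_review_count(item["number_of_reviews_text"]))
```

`Decimal` is used rather than `float` so that prices compare exactly when tracking changes over time.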
Environment variables
| Name | Description |
|---|---|
| APIFY_TOKEN | Your Apify API token for running via CLI or client |
| HTTP_PROXY | (Optional) Custom HTTP proxy endpoint |
| HTTPS_PROXY | (Optional) Custom HTTPS proxy endpoint |
How to Run
Apify Console
- Go to Actor → Run.
- Paste JSON input containing either `queries` or `urls`.
- Enable the proxy under the Proxy tab (recommended).
- Click Start and monitor the logs.
Apify CLI
    apify call username~amazon-search-scraper --input-file input.json
Apify Client (Python)
See Quick Start (API) example above.
Scheduling & Webhooks
- Use the Schedule tab in the Apify Console to run daily/weekly.
- Add a Webhook under the Webhooks tab to trigger external automation (e.g., send results to Slack or Google Sheets).
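On the receiving side, a webhook handler typically needs the dataset ID of the finished run. The sketch below assumes the default Apify webhook payload template (with `eventType` and a `resource` object describing the run) and uses hypothetical helper names; a customized payload template would need different field lookups.

```python
from typing import Optional


def dataset_id_from_webhook(payload: dict) -> Optional[str]:
    """Return the run's default dataset ID for successful runs, else None."""
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")


def dataset_items_url(dataset_id: str) -> str:
    """Build the Apify API URL that returns the dataset items as JSON."""
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?format=json"


# Illustrative payload; the IDs are placeholders, not real values.
sample = {
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"id": "<RUN_ID>", "defaultDatasetId": "<DATASET_ID>"},
}
dataset_id = dataset_id_from_webhook(sample)
if dataset_id:
    print(dataset_items_url(dataset_id))
```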
Logs & Troubleshooting
| Issue | Cause | Fix |
|---|---|---|
| Empty results | Amazon blocked request | Enable Apify Proxy or rotate proxies |
| Timeout errors | Network latency or blocking | Increase REQUEST_TIMEOUT or reduce concurrency |
| Missing product details | Page layout changed | Report issue or rerun after 24h |
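For callers who wrap their own requests around these failure modes, retrying with exponential backoff is a common complement to the fixes in the table. The sketch below is a generic pattern and not part of this Actor; `fetch` stands for any hypothetical callable that takes a URL and raises on failure.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retries(
    fetch: Callable[[str], T],
    url: str,
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(url), retrying failures with exponential backoff and jitter."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch(url)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt; jitter avoids synchronized retries.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
    raise RuntimeError("unreachable")
```

The `sleep` parameter is injectable so the function can be exercised in tests without real delays.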
Permissions & Storage
- Uses the default Dataset for structured data.
- Temporary files saved in Actor local storage.
- Secure credentials (tokens, proxies) should be stored as Secrets in the Apify Console.
Changelog / Versioning
- v1.1.0: Added support for scraping from direct product URLs.
- v1.0.0: Initial public release.
Notes / TODOs
- TODO: Confirm supported Amazon domains (currently assumes `amazon.com`).
- TODO: Add optional input for `country_code` or `domain` selection.
Proxy Configuration
Because this Actor sends requests to Amazon, proxy use is highly recommended.
Enable Apify Proxy (recommended)
- Open the Run page → Proxy tab.
- Check Use Apify Proxy.
- Select a proxy group (e.g., `RESIDENTIAL` or `SHADER`).
Custom Proxy Configuration
If you prefer your own proxy, go to Actor → Settings → Environment variables and set:

    HTTP_PROXY=http://<PROXY_USER>:<PROXY_PASS>@<HOST>:<PORT>
    HTTPS_PROXY=http://<PROXY_USER>:<PROXY_PASS>@<HOST>:<PORT>

Always store proxy credentials securely as Secrets.
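Proxy usernames and passwords that contain characters such as `@` or `:` break the URL format used for these variables unless they are percent-encoded. A minimal sketch of building a safe value follows; `build_proxy_url` is a hypothetical helper and the host and credentials are placeholders.

```python
from urllib.parse import quote


def build_proxy_url(user: str, password: str, host: str, port: int, scheme: str = "http") -> str:
    """Assemble a proxy URL, percent-encoding the credentials."""
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"{scheme}://{credentials}@{host}:{port}"


# Placeholder values for illustration only.
print(build_proxy_url("user", "p@ss:word", "proxy.example.com", 8000))
```

The resulting string can be used as the value of both `HTTP_PROXY` and `HTTPS_PROXY`.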
TODO
Implement proxy rotation per request for improved anti-blocking resilience.
References
What I inferred from main.py
- Actor collects Amazon product listings via search keywords and direct product URLs.
- Network activity detected, so a proxy section is included.
- Outputs a JSON list of structured product data.
- Domain is assumed to be `amazon.com`; domain parameterization is marked as a TODO.