
Amazon Keyword and Product Scraper Pro
Pricing
Pay per usage

Amazon Search Scraper: collect real-time product data from Amazon by entering keywords or product URLs. Get title, price, ratings, stock info, and images in a clean structured format. Perfect for price tracking, market research, and competitor analysis.
Last modified
3 days ago
Amazon Search Keywords and Products Scraper
Effortlessly extract structured product data from Amazon search results or directly from product URLs, including pricing, ratings, availability, and product metadata.
Summary
This Apify Actor can extract data from Amazon in two ways:
- By providing search keywords (it collects all products listed in search results).
- By providing product URLs (it fetches details directly from each page).
Structured data is stored in the default Dataset.
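To illustrate the two input modes, here is a minimal sketch that tells a search keyword apart from a product URL and pulls the ASIN out of a URL. The helper names `classify` and `extract_asin` are hypothetical, and the `/dp/` and `/gp/product/` path patterns are an assumption about Amazon URLs, not something taken from this Actor's code.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumption: product URLs carry a 10-character ASIN after /dp/ or /gp/product/.
ASIN_RE = re.compile(r"/(?:dp|gp/product)/([A-Z0-9]{10})(?:/|$)")


def classify(entry: str) -> str:
    """Return 'url' for http(s) links and 'query' for anything else."""
    parsed = urlparse(entry.strip())
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    return "query"


def extract_asin(url: str) -> Optional[str]:
    """Return the ASIN found in the URL path, or None if there is none."""
    match = ASIN_RE.search(urlparse(url).path)
    return match.group(1) if match else None
```

A keyword such as `wireless earbuds` would go into `queries`, while `https://www.amazon.com/dp/B0D1234XYZ` would go into `urls`.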
Use cases
- E-commerce price monitoring and comparison
- Market trend and keyword research
- Product catalog enrichment for Amazon listings
- Competitor intelligence automation
Quick Start (Apify Console)
- Open this Actor in the Apify Console.
- Click Run → Input tab.
- Paste JSON input such as:
{"queries": ["wireless earbuds", "gaming mouse"],"urls": ["https://www.amazon.com/dp/B0D1234XYZ"],"concurrency": 8}
- Optionally configure a proxy (see Proxy Configuration below).
- Click Run; data will appear in the default Dataset.
Quick Start (CLI + API)
CLI
$ apify call <ACTOR_ID> --input-file input.json
Where input.json contains:
{"queries": ["laptop stand"],"urls": ["https://www.amazon.com/dp/B0D1234XYZ"],"concurrency": 5}
API (Python)
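from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient('<APIFY_TOKEN>')

# Start the Actor and wait for the run to finish.
run = client.actor('username~amazon-search-scraper').call(run_input={
    'queries': ['mechanical keyboard'],
    'urls': ['https://www.amazon.com/dp/B0D5678ABC'],
    'concurrency': 8,
})

# The scraped items are stored in this dataset.
print(run['defaultDatasetId'])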
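Once the run finishes, the items can be read back from the dataset. The sketch below uses the `dataset(...).iterate_items()` call from `apify-client`; the helper names `fetch_items` and `main` are hypothetical, and the Actor ID is the placeholder used above.

```python
from typing import Any, Dict, List


def fetch_items(client: Any, dataset_id: str) -> List[Dict[str, Any]]:
    """Collect every item of a dataset into a list."""
    return list(client.dataset(dataset_id).iterate_items())


def main() -> None:
    # Imported here so the helper above stays usable without the package.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient("<APIFY_TOKEN>")
    run = client.actor("username~amazon-search-scraper").call(
        run_input={"queries": ["mechanical keyboard"]}
    )
    for item in fetch_items(client, run["defaultDatasetId"]):
        print(item.get("asin"), item.get("title"), item.get("price"))
```

Call `main()` after replacing the token and Actor ID placeholders with real values.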
Inputs
Name | Type | Required | Default | Example | Notes |
---|---|---|---|---|---|
queries | array / string | No | null | ["wireless earbuds"] | Amazon search keywords to collect listings |
urls | array / string | No | null | ["https://www.amazon.com/dp/B0D1234XYZ"] | Direct product page URLs |
concurrency | integer | No | 8 | 5 | Max concurrent product fetch tasks |
proxyConfig | object | No | { "useApifyProxy": true } | { "useApifyProxy": true } | Configure proxy (see below) |
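For readers wondering what a `concurrency` cap means in practice, the following sketch bounds the number of in-flight fetches with an `asyncio.Semaphore`. It is an illustration of the general technique, not this Actor's implementation; `fetch_all` and the `fetch` callback are hypothetical names.

```python
import asyncio
from typing import Any, Awaitable, Callable, List


async def fetch_all(
    urls: List[str],
    fetch: Callable[[str], Awaitable[Any]],
    concurrency: int = 8,
) -> List[Any]:
    """Run fetch(url) for every URL with at most `concurrency` tasks in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded(url: str) -> Any:
        # Each task waits here until one of the `concurrency` slots is free.
        async with semaphore:
            return await fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(url) for url in urls))
```

Lower values are gentler on the target site and on proxies; higher values finish sooner but raise the chance of blocking.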
Example: Paste into Console input editor:
{"urls": ["https://www.amazon.com/dp/B0D5678ABC"], "concurrency": 4}
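Since `queries` and `urls` each accept either an array or a single string, a caller may want to normalize the input before sending it. This is a minimal sketch under the defaults listed in the Inputs table; `normalize_input` is a hypothetical helper, and the rule that at least one of the two fields must be present is an assumption based on the run instructions.

```python
from typing import Any, Dict, List

DEFAULT_CONCURRENCY = 8  # default from the Inputs table


def _as_list(value: Any) -> List[str]:
    """Accept None, a single string, or a list; return a clean list of strings."""
    if value is None:
        return []
    if isinstance(value, str):
        value = [value]
    return [str(v).strip() for v in value if str(v).strip()]


def normalize_input(raw: Dict[str, Any]) -> Dict[str, Any]:
    queries = _as_list(raw.get("queries"))
    urls = _as_list(raw.get("urls"))
    if not queries and not urls:
        # Assumption: a run needs at least one keyword or one product URL.
        raise ValueError("Provide at least one of 'queries' or 'urls'.")
    value = raw.get("concurrency")
    concurrency = DEFAULT_CONCURRENCY if value is None else int(value)
    if concurrency < 1:
        raise ValueError("'concurrency' must be a positive integer.")
    return {
        "queries": queries,
        "urls": urls,
        "concurrency": concurrency,
        "proxyConfig": raw.get("proxyConfig") or {"useApifyProxy": True},
    }
```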
Configuration
Name | Type | Required | Default | Example | Notes |
---|---|---|---|---|---|
OUTPUT_FILE | string | No | amazon.search.result.json | output.json | Internal output file for backup |
REQUEST_TIMEOUT | integer | No | 30 | 45 | Timeout in seconds per request |
APIFY_TOKEN | string | Yes | (none) | <APIFY_TOKEN> | Required for Apify client/API use |
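One way to read these settings with the defaults from the table is sketched below. How the Actor itself loads them is not shown in this README, so `Settings` and `load_settings` are hypothetical names used for illustration only.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    output_file: str
    request_timeout: int  # seconds per request
    apify_token: Optional[str]  # needed only for Apify client/API use


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from an environment mapping, falling back to table defaults."""
    return Settings(
        output_file=env.get("OUTPUT_FILE", "amazon.search.result.json"),
        request_timeout=int(env.get("REQUEST_TIMEOUT", "30")),
        apify_token=env.get("APIFY_TOKEN"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests.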
Outputs
Results are stored in the default Dataset.
Example Output Item
{
  "asin": "B0D1234XYZ",
  "title": "Wireless Earbuds with Noise Cancellation",
  "url": "https://www.amazon.com/dp/B0D1234XYZ",
  "price": "$49.99",
  "currency": "$",
  "brand_name": "SoundMagic",
  "availability": "In Stock",
  "stars": 4.5,
  "number_of_reviews_text": "1,234 ratings",
  "categories": "Electronics > Audio > Headphones",
  "images": ["https://m.media-amazon.com/images/I/xyz.jpg"]
}
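Several output fields are display strings (`price`, `number_of_reviews_text`, `categories`), so downstream price tracking usually needs numeric values. The sketch below post-processes an item shaped like the example above. The helper names and the added keys `price_value`, `review_count`, and `category_path` are hypothetical, and the parsing assumes US-style number formatting.

```python
import re
from decimal import Decimal
from typing import Any, Dict, Optional


def parse_price(text: Optional[str]) -> Optional[Decimal]:
    """Turn a display price such as '$49.99' into a Decimal."""
    if not text:
        return None
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    return Decimal(match.group().replace(",", "")) if match else None


def parse_review_count(text: Optional[str]) -> Optional[int]:
    """Turn '1,234 ratings' into 1234."""
    if not text:
        return None
    match = re.search(r"\d[\d,]*", text)
    return int(match.group().replace(",", "")) if match else None


def enrich(item: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the item with numeric and list-valued helper fields."""
    categories = item.get("categories") or ""
    return {
        **item,
        "price_value": parse_price(item.get("price")),
        "review_count": parse_review_count(item.get("number_of_reviews_text")),
        "category_path": [c.strip() for c in categories.split(">") if c.strip()],
    }
```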
Environment variables
Name | Description |
---|---|
APIFY_TOKEN | Your Apify API token for running via CLI or client |
HTTP_PROXY | (Optional) Custom HTTP proxy endpoint |
HTTPS_PROXY | (Optional) Custom HTTPS proxy endpoint |
How to Run
Apify Console
- Go to Actor → Run.
- Paste JSON input containing either queries or urls.
- Enable proxy under the Proxy tab (recommended).
- Click Start and monitor logs.
Apify CLI
$ apify call username~amazon-search-scraper --input-file input.json
Apify Client (Python)
See Quick Start (API) example above.
Scheduling & Webhooks
- Use the Schedule tab in the Apify Console to run daily/weekly.
- Add a Webhook under the Webhooks tab to trigger external automation (e.g., send results to Slack or Google Sheets).
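A webhook receiver typically needs the dataset ID of the finished run. The sketch below assumes the default Apify webhook payload, where `eventType` names the event and `resource` holds the run object; if you customize the payload template, adjust the keys accordingly. The function names are hypothetical.

```python
from typing import Any, Dict, Optional


def dataset_id_from_webhook(payload: Dict[str, Any]) -> Optional[str]:
    """Return the default dataset ID for a succeeded run, otherwise None."""
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")


def dataset_items_url(dataset_id: str) -> str:
    """Build the Apify API URL that returns the dataset items as JSON."""
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?format=json"
```

The resulting URL can then be handed to whatever automation forwards results to Slack or Google Sheets.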
Logs & Troubleshooting
Issue | Cause | Fix |
---|---|---|
Empty results | Amazon blocked request | Enable Apify Proxy or rotate proxies |
Timeout errors | Network latency or blocking | Increase REQUEST_TIMEOUT or reduce concurrency |
Missing product details | Page layout changed | Report issue or rerun after 24h |
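The fixes in the table can be combined into a simple client-side retry: if a run returns no items, wait, lower the concurrency, and try again. This is a sketch of that idea rather than built-in Actor behavior; `run_with_backoff` and the `run_once` callback are hypothetical.

```python
import time
from typing import Any, Callable, Dict, List

Item = Dict[str, Any]


def run_with_backoff(
    run_once: Callable[[int], List[Item]],
    concurrency: int = 8,
    attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Item]:
    """Call run_once(concurrency) until it returns items or attempts run out."""
    for attempt in range(attempts):
        items = run_once(concurrency)
        if items:
            return items
        if attempt < attempts - 1:
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
            # Halve the load on each retry, never going below one task.
            concurrency = max(1, concurrency // 2)
    return []
```

Here `run_once` would wrap the Actor call and dataset read, taking the concurrency value to place in the run input.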
Permissions & Storage
- Uses the default Dataset for structured data.
- Temporary files are saved in the Actor's local storage.
- Secure credentials (tokens, proxies) should be stored as Secrets in the Apify Console.
Changelog / Versioning
v1.1.0: Added support for scraping from direct product URLs.
v1.0.0: Initial public release.
Notes / TODOs
- TODO: Confirm supported Amazon domains (currently assumes amazon.com).
- TODO: Add optional input for country_code or domain selection.
Proxy Configuration
Because this Actor sends requests to Amazon, proxy use is highly recommended.
Enable Apify Proxy (recommended)
- Open the Run page → Proxy tab.
- Check Use Apify Proxy.
- Select a proxy group (e.g., RESIDENTIAL or SHADER).
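The same selection can be expressed in the run input when starting the Actor from code. The sketch below builds a `proxyConfig` object using `apifyProxyGroups`, the key the standard Apify proxy input uses for group selection; whether this Actor honors the group list is an assumption, and `build_proxy_config` is a hypothetical helper.

```python
import json
from typing import Any, Dict, Iterable, Optional


def build_proxy_config(groups: Optional[Iterable[str]] = None) -> Dict[str, Any]:
    """Return a proxyConfig object, optionally pinned to specific proxy groups."""
    config: Dict[str, Any] = {"useApifyProxy": True}
    if groups:
        config["apifyProxyGroups"] = list(groups)
    return config


run_input = {
    "queries": ["wireless earbuds"],
    "proxyConfig": build_proxy_config(["RESIDENTIAL"]),
}
print(json.dumps(run_input, indent=2))
```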
Custom Proxy Configuration
If you prefer your own proxy, go to Actor → Settings → Environment variables and set:
HTTP_PROXY=http://<PROXY_USER>:<PROXY_PASS>@<HOST>:<PORT>
HTTPS_PROXY=http://<PROXY_USER>:<PROXY_PASS>@<HOST>:<PORT>
Always store proxy credentials securely as Secrets.
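Proxy usernames and passwords often contain characters such as `@` or `:` that break the URL unless they are percent-encoded. A minimal sketch for assembling the value safely follows; `build_proxy_url` and `proxy_env` are hypothetical helpers, and the host and port in the usage note are placeholders.

```python
from typing import Dict
from urllib.parse import quote


def build_proxy_url(user: str, password: str, host: str, port: int, scheme: str = "http") -> str:
    """Assemble a proxy URL, percent-encoding the credentials."""
    # safe="" makes quote() escape every reserved character, including '@', ':' and '/'.
    return f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


def proxy_env(url: str) -> Dict[str, str]:
    """Return the two environment variables described above."""
    return {"HTTP_PROXY": url, "HTTPS_PROXY": url}
```

For example, `build_proxy_url("user", "p@ss:word", "proxy.example.com", 8000)` yields a URL whose password part reads `p%40ss%3Aword`.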
TODO
Implement proxy rotation per request for improved anti-blocking resilience.
References
What I inferred from main.py
- Actor collects Amazon product listings via search keywords and direct product URLs.
- Network activity detected β proxy section included.
- Outputs JSON list of structured product data.
- Domain is assumed to be amazon.com; marked as TODO for domain parameterization.