Amazon Keyword and Product Scraper Pro
Pricing
$15.00/month + usage
Amazon Search Scraper: collect real-time product data from Amazon by entering keywords or product URLs. Get title, price, ratings, stock info, and images in a clean, structured format. Perfect for price tracking, market research, and competitor analysis.
Rating: 5.0 (1)
Developer: Neuro Scraper
Actor stats
- Bookmarked: 0
- Total users: 6
- Monthly active users: 4
- Last modified: 20 days ago
Amazon Search Keywords and Products Scraper
Extract structured product data from Amazon search results or directly from product URLs, including pricing, ratings, availability, and product metadata.
Summary
This Apify Actor can extract data from Amazon in two ways:
- By providing search keywords (it collects all products listed in search results).
- By providing product URLs (it fetches details directly from each page).
Structured data is stored in the default Dataset.
Use cases
- E-commerce price monitoring and comparison
- Market trend and keyword research
- Product catalog enrichment for Amazon listings
- Competitor intelligence automation
Quick Start (Apify Console)
- Open this Actor in the Apify Console.
- Click Run → Input tab.
- Paste JSON input such as:
{"queries": ["wireless earbuds", "gaming mouse"],"urls": ["https://www.amazon.com/dp/B0D1234XYZ"],"concurrency": 8}
- Optionally configure a proxy (see Proxy Configuration below).
- Click Run. The data will appear in the default Dataset.
Quick Start (CLI + API)
CLI
$ apify call <ACTOR_ID> --input-file input.json
Where input.json contains:
{"queries": ["laptop stand"],"urls": ["https://www.amazon.com/dp/B0D1234XYZ"],"concurrency": 5}
API (Python)
from apify_client import ApifyClient

client = ApifyClient('<APIFY_TOKEN>')
run = client.actor('username~amazon-search-scraper').call(run_input={
    'queries': ['mechanical keyboard'],
    'urls': ['https://www.amazon.com/dp/B0D5678ABC'],
    'concurrency': 8,
})
print(run['defaultDatasetId'])
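The API example stops at printing the dataset ID. A minimal sketch of reading the results back is shown here, assuming the `dataset(...).iterate_items()` call of the `apify-client` Python package. `collect_items` is a hypothetical helper name, and the client is passed in as an argument so the function can also be exercised with a stand-in object.

```python
from typing import Any, Dict, List


def collect_items(client: Any, run: Dict[str, Any], limit: int = 100) -> List[Dict[str, Any]]:
    """Return up to `limit` items from the run's default dataset.

    `client` is expected to behave like apify_client.ApifyClient
    (assumption: it exposes dataset(id).iterate_items()).
    """
    dataset_id = run["defaultDatasetId"]
    items: List[Dict[str, Any]] = []
    for item in client.dataset(dataset_id).iterate_items():
        if len(items) >= limit:
            break
        items.append(item)
    return items
```

With the `client` and `run` objects from the example above, `collect_items(client, run, limit=20)` would return the first 20 product records as dictionaries.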
Inputs
| Name | Type | Required | Default | Example | Notes |
|---|---|---|---|---|---|
| queries | array / string | No | null | ["wireless earbuds"] | Amazon search keywords to collect listings |
| urls | array / string | No | null | ["https://www.amazon.com/dp/B0D1234XYZ"] | Direct product page URLs |
| concurrency | integer | No | 8 | 5 | Max concurrent product fetch tasks |
| proxyConfig | object | Optional | { "useApifyProxy": true } | { "useApifyProxy": true } | Configure proxy (see below) |
Example: paste into the Console input editor:
{"urls": ["https://www.amazon.com/dp/B0D5678ABC"], "concurrency": 4}
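Because `queries` and `urls` each accept either a string or an array, it can help to normalize the input before starting a run. The sketch below is one way to do that. `build_run_input` is a hypothetical helper, and the rule that at least one of `queries` or `urls` must be present is an assumption drawn from the run instructions, not something the Actor is confirmed to enforce.

```python
from typing import Any, Dict, List, Union

StrOrList = Union[str, List[str], None]


def _as_list(value: StrOrList) -> List[str]:
    """Accept a single string or a list of strings and drop blank entries."""
    if value is None:
        return []
    if isinstance(value, str):
        value = [value]
    return [v.strip() for v in value if v and v.strip()]


def build_run_input(queries: StrOrList = None, urls: StrOrList = None,
                    concurrency: int = 8) -> Dict[str, Any]:
    """Build an input object using the field names from the Inputs table."""
    queries_list = _as_list(queries)
    urls_list = _as_list(urls)
    if not queries_list and not urls_list:
        # Assumption: a run without keywords or URLs has nothing to scrape.
        raise ValueError("Provide at least one of 'queries' or 'urls'.")
    if concurrency < 1:
        raise ValueError("'concurrency' must be a positive integer.")
    run_input: Dict[str, Any] = {"concurrency": concurrency}
    if queries_list:
        run_input["queries"] = queries_list
    if urls_list:
        run_input["urls"] = urls_list
    return run_input
```

For example, `build_run_input("laptop stand", concurrency=5)` yields a dictionary that can be saved as `input.json` or passed as `run_input` to the Python client.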
Configuration
| Name | Type | Required | Default | Example | Notes |
|---|---|---|---|---|---|
| OUTPUT_FILE | string | No | amazon.search.result.json | output.json | Internal output file for backup |
| REQUEST_TIMEOUT | integer | No | 30 | 45 | Timeout in seconds per request |
| APIFY_TOKEN | string | Yes | (none) | <APIFY_TOKEN> | Required for Apify client/API use |
Outputs
Results are stored in the default Dataset.
Example Output Item
{"asin": "B0D1234XYZ","title": "Wireless Earbuds with Noise Cancellation","url": "https://www.amazon.com/dp/B0D1234XYZ","price": "$49.99","currency": "$","brand_name": "SoundMagic","availability": "In Stock","stars": 4.5,"number_of_reviews_text": "1,234 ratings","categories": "Electronics > Audio > Headphones","images": ["https://m.media-amazon.com/images/I/xyz.jpg"]}
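In the example item, `price` and `number_of_reviews_text` are strings, so numeric comparisons (for price tracking, say) need a conversion step. The following is a minimal sketch with hypothetical helper names. It assumes US-style formatting as in the example (`$49.99`, `1,234 ratings`) and would need adjusting for other locales.

```python
import re
from typing import Optional


def parse_price(text: Optional[str]) -> Optional[float]:
    """Extract the first number from a price string, ignoring thousands separators."""
    if not text:
        return None
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    return float(match.group().replace(",", "")) if match else None


def parse_review_count(text: Optional[str]) -> Optional[int]:
    """Extract the leading integer from text such as '1,234 ratings'."""
    if not text:
        return None
    match = re.search(r"\d[\d,]*", text)
    return int(match.group().replace(",", "")) if match else None


item = {"price": "$49.99", "number_of_reviews_text": "1,234 ratings"}
price = parse_price(item["price"])
reviews = parse_review_count(item["number_of_reviews_text"])
```

Both helpers return `None` when the field is missing or contains no digits, so downstream code can skip incomplete records instead of failing.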
Environment variables
| Name | Description |
|---|---|
| APIFY_TOKEN | Your Apify API token for running via CLI or client |
| HTTP_PROXY | (Optional) Custom HTTP proxy endpoint |
| HTTPS_PROXY | (Optional) Custom HTTPS proxy endpoint |
How to Run
Apify Console
- Go to Actor → Run.
- Paste JSON input containing either `queries` or `urls`.
- Enable a proxy under the Proxy tab (recommended).
- Click Start and monitor the logs.
Apify CLI
$ apify call username~amazon-search-scraper --input-file input.json
Apify Client (Python)
See Quick Start (API) example above.
Scheduling & Webhooks
- Use the Schedule tab in the Apify Console to run daily/weekly.
- Add a Webhook under the Webhooks tab to trigger external automation (e.g., send results to Slack or Google Sheets).
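A webhook receiver usually needs the dataset ID of the finished run before it can forward results anywhere. The sketch below assumes Apify's default webhook payload, in which `eventType` names the event and `resource` holds the run object. `dataset_id_from_webhook` is a hypothetical helper, and a custom payload template would need different keys.

```python
import json
from typing import Optional


def dataset_id_from_webhook(body: str) -> Optional[str]:
    """Return the default dataset ID for a succeeded run, or None otherwise."""
    payload = json.loads(body)
    # Assumption: only successful runs are worth forwarding.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")
```

The returned ID can then be used to fetch items and post them to Slack, Google Sheets, or another destination.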
Logs & Troubleshooting
| Issue | Cause | Fix |
|---|---|---|
| Empty results | Amazon blocked request | Enable Apify Proxy or rotate proxies |
| Timeout errors | Network latency or blocking | Increase REQUEST_TIMEOUT or reduce concurrency |
| Missing product details | Page layout changed | Report issue or rerun after 24h |
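The timeout fix in the table (reduce concurrency and try again) can be automated on the caller's side. This is a minimal sketch, not part of the Actor: `run_with_backoff` and `start_run` are hypothetical names, and it assumes the caller's run function raises `TimeoutError` on a timeout, which may differ from the exception a real client raises.

```python
import time
from typing import Any, Callable, Dict


def run_with_backoff(start_run: Callable[[Dict[str, Any]], Any],
                     run_input: Dict[str, Any],
                     max_attempts: int = 3,
                     base_delay: float = 2.0,
                     sleep: Callable[[float], None] = time.sleep) -> Any:
    """Retry a run on timeout, halving concurrency and doubling the wait each time."""
    attempt_input = dict(run_input)  # copy so the caller's dict is not modified
    for attempt in range(1, max_attempts + 1):
        try:
            return start_run(attempt_input)
        except TimeoutError:
            if attempt == max_attempts:
                raise
            # Halve concurrency, but never go below one task.
            current = attempt_input.get("concurrency", 8)
            attempt_input["concurrency"] = max(1, current // 2)
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter exists so the delay can be replaced in tests. In normal use the default `time.sleep` applies.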
Permissions & Storage
- Uses the default Dataset for structured data.
- Temporary files are saved in the Actor's local storage.
- Secure credentials (tokens, proxies) should be stored as Secrets in the Apify Console.
Changelog / Versioning
- v1.1.0: Added support for scraping from direct product URLs.
- v1.0.0: Initial public release.
Notes / TODOs
- TODO: Confirm supported Amazon domains (currently assumes `amazon.com`).
- TODO: Add optional input for `country_code` or `domain` selection.
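Until domain support is confirmed, callers may want to filter product URLs before submitting them. The sketch below is a hypothetical pre-check, not the Actor's own logic. It assumes only `amazon.com` hosts are supported and that product URLs carry a 10-character ASIN after `/dp/` or `/gp/product/`.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumption: only the .com marketplace is supported (see the TODOs above).
SUPPORTED_HOSTS = {"amazon.com", "www.amazon.com"}
_ASIN_RE = re.compile(r"/(?:dp|gp/product)/([A-Z0-9]{10})(?:/|$)")


def extract_asin(url: str) -> Optional[str]:
    """Return the ASIN from a supported product URL, or None if it does not qualify."""
    parsed = urlparse(url)
    if (parsed.hostname or "").lower() not in SUPPORTED_HOSTS:
        return None
    match = _ASIN_RE.search(parsed.path)
    return match.group(1) if match else None
```

URLs for which `extract_asin` returns `None` can be dropped or logged before the run starts, which avoids spending requests on pages the Actor may not handle.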
Proxy Configuration
Because this Actor sends requests to Amazon, proxy use is highly recommended.
Enable Apify Proxy (recommended)
- Open the Run page → Proxy tab.
- Check Use Apify Proxy.
- Select a proxy group (e.g., `RESIDENTIAL` or `SHADER`).
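The same choice can be expressed in JSON input through the `proxyConfig` field. The sketch below builds that object in Python. `useApifyProxy` comes from the Inputs table, while `apifyProxyGroups` is the key used by Apify's standard proxy input; whether this Actor honors it is an assumption. `proxy_config` is a hypothetical helper.

```python
import json
from typing import Any, Dict, List, Optional


def proxy_config(groups: Optional[List[str]] = None) -> Dict[str, Any]:
    """Build a proxyConfig object, optionally restricted to given proxy groups."""
    config: Dict[str, Any] = {"useApifyProxy": True}
    if groups:
        # Assumption: the Actor reads the standard 'apifyProxyGroups' key.
        config["apifyProxyGroups"] = list(groups)
    return config


run_input = {
    "queries": ["wireless earbuds"],
    "proxyConfig": proxy_config(["RESIDENTIAL"]),
}
print(json.dumps(run_input, indent=2))
```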
Custom Proxy Configuration
If you prefer your own proxy, go to Actor → Settings → Environment variables and set:
HTTP_PROXY=http://<PROXY_USER>:<PROXY_PASS>@<HOST>:<PORT>
HTTPS_PROXY=http://<PROXY_USER>:<PROXY_PASS>@<HOST>:<PORT>
Always store proxy credentials securely as Secrets.
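Proxy passwords often contain characters such as `@` or `:` that break the URL format above unless they are percent-encoded. The sketch below builds both variables with encoded credentials. `proxy_env` is a hypothetical helper, and the host and port in the usage note are placeholders.

```python
from typing import Dict
from urllib.parse import quote


def proxy_env(user: str, password: str, host: str, port: int) -> Dict[str, str]:
    """Return HTTP_PROXY and HTTPS_PROXY values with percent-encoded credentials."""
    # safe='' forces encoding of every reserved character, including '@' and ':'.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    url = f"http://{credentials}@{host}:{port}"
    return {"HTTP_PROXY": url, "HTTPS_PROXY": url}
```

For instance, `proxy_env("user", "p@ss:word", "proxy.example.com", 8000)` produces values that can be pasted into the environment variable settings and marked as Secrets.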
TODO
Implement proxy rotation per request for improved anti-blocking resilience.
References
What I inferred from main.py
- The Actor collects Amazon product listings via search keywords and direct product URLs.
- Network activity was detected, so a proxy section is included.
- Outputs a JSON list of structured product data.
- The domain is assumed to be `amazon.com`; domain parameterization is marked as a TODO.