Amazon Scraper

Pricing

$19.00/month + usage

Amazon Search Scraper: collect real-time product data from Amazon by entering keywords or product URLs. Get title, price, ratings, stock info, and images in a clean structured format. Well suited to price tracking, market research, and competitor analysis.

Rating: 5.0 (1)

Developer: Neuro Scraper (maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 14
  • Monthly active users: 0
  • Last modified: a month ago

Amazon Search Keywords and Products Scraper

Effortlessly extract structured product data from Amazon search results or directly from product URLs, including pricing, ratings, availability, and product metadata.


Summary

This Apify Actor can extract data from Amazon in two ways:

  1. By providing search keywords (it collects all products listed in search results).
  2. By providing product URLs (it fetches details directly from each page).

Structured data is stored in the default Dataset.


Use cases

  • E-commerce price monitoring and comparison
  • Market trend and keyword research
  • Product catalog enrichment for Amazon listings
  • Competitor intelligence automation

Quick Start (Apify Console)

  1. Open this Actor in the Apify Console.
  2. Click Run → Input tab.
  3. Paste JSON input such as:

{
  "queries": ["wireless earbuds", "gaming mouse"],
  "urls": ["https://www.amazon.com/dp/B0D1234XYZ"],
  "concurrency": 8
}

  4. Optionally configure a proxy (see Proxy Configuration below).
  5. Click Run. The data will appear in the default Dataset.

Quick Start (CLI + API)

CLI

$ apify call <ACTOR_ID> --input-file input.json

Where input.json contains:

{
  "queries": ["laptop stand"],
  "urls": ["https://www.amazon.com/dp/B0D1234XYZ"],
  "concurrency": 5
}

API (Python)

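from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient('<APIFY_TOKEN>')

# Start the Actor and wait for the run to finish.
run = client.actor('username~amazon-search-scraper').call(run_input={
    'queries': ['mechanical keyboard'],
    'urls': ['https://www.amazon.com/dp/B0D5678ABC'],
    'concurrency': 8,
})

# Results are stored in the run's default dataset.
print(run['defaultDatasetId'])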

Inputs

| Name | Type | Required | Default | Example | Notes |
| --- | --- | --- | --- | --- | --- |
| queries | array / string | No | null | ["wireless earbuds"] | Amazon search keywords to collect listings |
| urls | array / string | No | null | ["https://www.amazon.com/dp/B0D1234XYZ"] | Direct product page URLs |
| concurrency | integer | No | 8 | 5 | Max concurrent product fetch tasks |
| proxyConfig | object | Optional | { "useApifyProxy": true } | { "useApifyProxy": true } | Configure proxy (see Proxy Configuration) |

Example: paste into the Console input editor:

{"urls": ["https://www.amazon.com/dp/B0D5678ABC"], "concurrency": 4}
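
The input rules in the Inputs table can be sketched as a small helper that assembles a run input. This is a minimal sketch, not the Actor's own validation logic: `build_input` and `_as_list` are hypothetical names, and the rule that a run needs at least one keyword or one URL is an assumption based on the "How to Run" instructions.

```python
from typing import Any, Dict, List, Union

StrOrList = Union[str, List[str], None]


def _as_list(value: StrOrList) -> List[str]:
    """Accept a single string or a list of strings; drop blank entries."""
    if value is None:
        return []
    items = [value] if isinstance(value, str) else list(value)
    return [item.strip() for item in items if item and item.strip()]


def build_input(
    queries: StrOrList = None,
    urls: StrOrList = None,
    concurrency: int = 8,  # default taken from the Inputs table
    use_apify_proxy: bool = True,
) -> Dict[str, Any]:
    """Assemble a run input dict (hypothetical helper, not part of the Actor)."""
    query_list = _as_list(queries)
    url_list = _as_list(urls)
    if not query_list and not url_list:
        # Assumption: a run needs at least one keyword or one product URL.
        raise ValueError("Provide at least one query or one URL")
    if concurrency < 1:
        raise ValueError("concurrency must be a positive integer")

    run_input: Dict[str, Any] = {
        "concurrency": concurrency,
        "proxyConfig": {"useApifyProxy": use_apify_proxy},
    }
    if query_list:
        run_input["queries"] = query_list
    if url_list:
        run_input["urls"] = url_list
    return run_input
```

The returned dict can be passed as `run_input` to the Python client call shown in the Quick Start, or dumped to `input.json` for the CLI.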

Configuration

| Name | Type | Required | Default | Example | Notes |
| --- | --- | --- | --- | --- | --- |
| OUTPUT_FILE | string | No | amazon.search.result.json | output.json | Internal output file for backup |
| REQUEST_TIMEOUT | integer | No | 30 | 45 | Timeout in seconds per request |
| APIFY_TOKEN | string | Yes | (none) | <APIFY_TOKEN> | Required for Apify client/API use |
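
The README does not say how these settings are supplied. Assuming they are read from environment variables, loading them with the documented defaults might look like the sketch below. `Settings` and `load_settings` are hypothetical names used only for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    apify_token: str
    output_file: str = "amazon.search.result.json"  # default from the table
    request_timeout: int = 30  # seconds, default from the table


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from environment variables (assumed mechanism)."""
    env = os.environ if env is None else env
    token = env.get("APIFY_TOKEN", "").strip()
    if not token:
        raise RuntimeError("APIFY_TOKEN is required")
    timeout = int(env.get("REQUEST_TIMEOUT", "30"))
    if timeout <= 0:
        raise ValueError("REQUEST_TIMEOUT must be positive")
    return Settings(
        apify_token=token,
        output_file=env.get("OUTPUT_FILE", "amazon.search.result.json"),
        request_timeout=timeout,
    )
```

Passing a plain dict instead of `os.environ` keeps the function easy to test.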

Outputs

Results are stored in the default Dataset.

Example Output Item

{
  "asin": "B0D1234XYZ",
  "title": "Wireless Earbuds with Noise Cancellation",
  "url": "https://www.amazon.com/dp/B0D1234XYZ",
  "price": "$49.99",
  "currency": "$",
  "brand_name": "SoundMagic",
  "availability": "In Stock",
  "stars": 4.5,
  "number_of_reviews_text": "1,234 ratings",
  "categories": "Electronics > Audio > Headphones",
  "images": ["https://m.media-amazon.com/images/I/xyz.jpg"]
}
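
Several fields in the example item are display strings (`price`, `number_of_reviews_text`, `categories`). For price tracking it is usually easier to work with numbers, so a post-processing step along these lines may help. This is a sketch under assumptions: the helper names and the added keys (`price_value`, `review_count`, `category_path`) are hypothetical, and the parsing assumes US-style formatting as in the example.

```python
import re
from typing import Any, Dict, Optional


def parse_price(price_text: Optional[str]) -> Optional[float]:
    """Turn a display price such as "$49.99" or "$1,299.00" into a float."""
    if not price_text:
        return None
    match = re.search(r"\d[\d,]*(?:\.\d+)?", price_text)
    if match is None:
        return None
    return float(match.group(0).replace(",", ""))


def parse_review_count(text: Optional[str]) -> Optional[int]:
    """Turn a string such as "1,234 ratings" into the integer 1234."""
    if not text:
        return None
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    return int(match.group(0).replace(",", ""))


def normalize_item(item: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a dataset item with numeric fields added."""
    normalized = dict(item)
    normalized["price_value"] = parse_price(item.get("price"))
    normalized["review_count"] = parse_review_count(
        item.get("number_of_reviews_text")
    )
    # Assumption: categories are joined with ">" as in the example item.
    categories = item.get("categories") or ""
    normalized["category_path"] = [
        part.strip() for part in categories.split(">") if part.strip()
    ]
    return normalized
```

Missing or unparseable fields come back as `None` (or an empty list for categories) rather than raising, since listings do not always carry every field.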

Environment variables

| Name | Description |
| --- | --- |
| APIFY_TOKEN | Your Apify API token for running via CLI or client |
| HTTP_PROXY | (Optional) Custom HTTP proxy endpoint |
| HTTPS_PROXY | (Optional) Custom HTTPS proxy endpoint |

How to Run

Apify Console

  1. Go to Actor → Run.
  2. Paste JSON input containing either queries or urls.
  3. Enable proxy under Proxy tab (recommended).
  4. Click Start and monitor logs.

Apify CLI

$ apify call username~amazon-search-scraper --input-file input.json

Apify Client (Python)

See the Quick Start (API) example above.


Scheduling & Webhooks

  • Use the Schedule tab in the Apify Console to run daily/weekly.
  • Add a Webhook under the Webhooks tab to trigger external automation (e.g., send results to Slack or Google Sheets).
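
On the receiving side of a webhook, the external automation typically needs the run's dataset. The sketch below assumes the default Apify webhook payload, where the run object is delivered under `resource` and the event name under `eventType`. If you customize the payload template, adjust the keys accordingly. `dataset_url_from_webhook` is a hypothetical helper name.

```python
import json
from typing import Optional


def dataset_url_from_webhook(body: str) -> Optional[str]:
    """Return the dataset items URL for a succeeded run, else None."""
    payload = json.loads(body)
    # Only react to successful runs; failed or aborted runs are ignored.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    dataset_id = (payload.get("resource") or {}).get("defaultDatasetId")
    if not dataset_id:
        return None
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?format=json"
```

Depending on your account's access settings, fetching that URL may also require your API token.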

Logs & Troubleshooting

| Issue | Cause | Fix |
| --- | --- | --- |
| Empty results | Amazon blocked the request | Enable Apify Proxy or rotate proxies |
| Timeout errors | Network latency or blocking | Increase REQUEST_TIMEOUT or reduce concurrency |
| Missing product details | Page layout changed | Report the issue or rerun after 24 hours |
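
The timeout advice in the table (raise REQUEST_TIMEOUT, lower concurrency) can be turned into a simple retry schedule. The sketch below is one possible policy, not something the Actor does itself: `retry_plan`, the halving and doubling factors, and the 120-second cap are all illustrative assumptions.

```python
from typing import Iterator, Tuple


def retry_plan(
    concurrency: int = 8,
    request_timeout: int = 30,
    max_attempts: int = 4,
    timeout_cap: int = 120,
) -> Iterator[Tuple[int, int]]:
    """Yield (concurrency, timeout) pairs, easing off after each attempt."""
    for _ in range(max_attempts):
        yield concurrency, request_timeout
        # Fewer parallel fetches and a longer timeout for the next attempt.
        concurrency = max(1, concurrency // 2)
        request_timeout = min(timeout_cap, request_timeout * 2)
```

Starting from the documented defaults, the plan yields (8, 30), (4, 60), (2, 120), (1, 120). A caller would rerun the Actor with each pair until a run returns results.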

Permissions & Storage

  • Uses the default Dataset for structured data.
  • Temporary files are saved in the Actor's local storage.
  • Secure credentials (tokens, proxies) should be stored as Secrets in the Apify Console.

Changelog / Versioning

  • v1.1.0: Added support for scraping from direct product URLs.
  • v1.0.0: Initial public release.

Notes / TODOs

  • TODO: Confirm supported Amazon domains (currently assumes amazon.com).
  • TODO: Add optional input for country_code or domain selection.
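
Until domain selection exists, one workaround is to handle the domain outside the Actor by extracting the ASIN from a product URL and rebuilding the URL for another marketplace. This is a hypothetical sketch: `extract_asin` and `product_url` are illustrative names, the regex assumes the common 10-character ASIN in `/dp/` or `/gp/product/` paths, and whether the Actor handles non-amazon.com URLs is unconfirmed (see the TODO above).

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumption: ASINs are 10 uppercase alphanumeric characters.
_ASIN_RE = re.compile(r"/(?:dp|gp/product)/([A-Z0-9]{10})(?:/|$)")


def extract_asin(url: str) -> Optional[str]:
    """Pull the ASIN out of a product URL path, or return None."""
    match = _ASIN_RE.search(urlparse(url).path)
    return match.group(1) if match else None


def product_url(asin: str, domain: str = "amazon.com") -> str:
    """Build a canonical product URL for the given marketplace domain."""
    return f"https://www.{domain}/dp/{asin}"
```

For example, `product_url(extract_asin(url), "amazon.de")` rewrites a US product link to the German marketplace, provided the ASIN exists there.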

Proxy Configuration

Because this Actor sends requests to Amazon, proxy use is highly recommended.

  1. Open the Run page → Proxy tab.
  2. Check Use Apify Proxy.
  3. Select a proxy group (e.g., RESIDENTIAL or SHADER).

Custom Proxy Configuration

If you prefer your own proxy, go to Actor → Settings → Environment variables and set:

HTTP_PROXY=http://<PROXY_USER>:<PROXY_PASS>@<HOST>:<PORT>
HTTPS_PROXY=http://<PROXY_USER>:<PROXY_PASS>@<HOST>:<PORT>

Always store proxy credentials securely as Secrets.
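
Proxy usernames and passwords often contain characters such as `@`, `:`, or `/` that break the URL format shown above unless they are percent-encoded. A minimal sketch for building the value safely follows. `proxy_url` is a hypothetical helper, and the host and port in the usage note are placeholders.

```python
from urllib.parse import quote


def proxy_url(
    user: str, password: str, host: str, port: int, scheme: str = "http"
) -> str:
    """Build a proxy URL with percent-encoded credentials."""
    # safe="" makes quote() also encode "/", which it leaves alone by default.
    encoded_user = quote(user, safe="")
    encoded_password = quote(password, safe="")
    return f"{scheme}://{encoded_user}:{encoded_password}@{host}:{port}"
```

For instance, `proxy_url("user", "p@ss:w/rd", "proxy.example.com", 8000)` encodes the password as `p%40ss%3Aw%2Frd`, and the result can be used for both HTTP_PROXY and HTTPS_PROXY.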

TODO

Implement proxy rotation per request for improved anti-blocking resilience.
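
As a rough illustration of what per-request rotation could look like, the sketch below cycles through a fixed list of proxy URLs in round-robin order. It is not the Actor's implementation (the feature is still a TODO). `rotating_proxies` is a hypothetical name, and the `{"http": ..., "https": ...}` shape assumes an HTTP client that accepts a proxies mapping in that form, as the `requests` library does.

```python
from itertools import cycle
from typing import Dict, Iterable, Iterator


def rotating_proxies(proxy_urls: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Return an endless round-robin iterator of proxy mappings."""
    urls = list(proxy_urls)
    if not urls:
        raise ValueError("at least one proxy URL is required")
    # One mapping per request; the same proxy serves both schemes.
    return ({"http": url, "https": url} for url in cycle(urls))
```

Each request would call `next()` on the iterator to get its proxy settings. A production version would likely also drop proxies that keep failing.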



What I inferred from main.py

  • Actor collects Amazon product listings via search keywords and direct product URLs.
  • Network activity detected, so a proxy section is included.
  • Outputs a JSON list of structured product data.
  • Domain is assumed to be amazon.com; domain parameterization is marked as a TODO.