Amazon Search Scraper

Pricing

$9.00/month + usage

Developed by Neuro Scraper

Maintained by Community

⚡ Instantly discover Amazon's best-selling products with one click! 🛒 This smart actor fetches real-time prices, ratings, and deals — giving you insights in seconds. Trusted by pros for accuracy, speed, and reliability. Run it now and find hidden gems before your competitors do! 🚀


🚀 Amazon Search Scraper

One-line tagline: Instantly extract product listings and product-page metadata from Amazon search queries — fast, secure, and ready for business use.


📖 Summary

Amazon Search Scraper retrieves product data from Amazon search results and product pages and returns clean, business-ready JSON records for analysis. Designed for speed and reliability, it helps teams discover product details, prices, reviews, images, and availability for competitive research, analytics, and monitoring.


💡 Use cases / When to use

  • Competitive price monitoring and alerting
  • Market/product research and sourcing
  • Creating product catalogs and feeds
  • Gathering product images and descriptions for analytics
  • Quickly prototyping e-commerce dashboards

⚡ Quick Start — Console (one-click)

  1. Open this Actor in Apify Console.
  2. Fill the Queries input (single keyword or array of keywords).
  3. (Optional) Enable Proxy Configuration if scraping at scale.
  4. Click Run. Results appear in the default dataset/OUTPUT in seconds.

Friendly microcopy: "Plug in a search term, click Run, and get structured product data instantly."


βš™οΈ Quick Start (CLI + API)

CLI

# Call the Actor on the Apify platform with JSON input via apify-cli
apify call your-user/amazon-search-scraper --input-file=input.example.json

Python (apify-client)

from apify_client import ApifyClient

client = ApifyClient('<APIFY_TOKEN>')
# Start the Actor and wait for the run to finish
run = client.actor('your-user/amazon-search-scraper').call(run_input={"queries": ["wireless earbuds"]})
# Read the scraped items from the run's default dataset
for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(item['title'], item.get('price'))

πŸ“ Inputs (fields & schema)

Console JSON input example (see input.example.json file):

{
  "queries": ["wireless earbuds", "gaming mouse"],
  "headless": true,
  "requestDelay": [1.0, 2.0]
}

Fields

  • queries — string or array — required — Search keywords or Amazon product URLs. A keyword triggers an Amazon site search; a direct product URL is scraped as a product page.
  • headless — boolean — optional — Run the browser in headless mode (default: true).
  • requestDelay — array [min, max] — optional — Delay range (seconds) between product-page requests to reduce request rate and avoid throttling.
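As a rough illustration, a [min, max] requestDelay is typically applied as a random sleep between successive requests. A minimal sketch, assuming a uniform distribution (the actor's internal implementation may differ):

```python
import random
import time

def polite_delay(delay_range=(1.0, 2.0)):
    """Sleep for a random interval within [min, max] seconds.

    Sketch of how a requestDelay input like [1.0, 2.0] could be
    applied between product-page requests.
    """
    low, high = delay_range
    pause = random.uniform(low, high)
    time.sleep(pause)
    return pause
```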

βš™οΈ Configuration

| 🔑 Name | 📝 Type | ❓ Required | ⚙️ Default | 📌 Example | 🧠 Notes |
|---|---|---|---|---|---|
| queries | string/array | ✅ Yes | — | ["wireless earbuds"] | Search terms or product URLs |
| headless | boolean | ⚙️ Optional | true | false | Turn off to debug visually |
| requestDelay | array (min, max) | ⚙️ Optional | [1.0, 2.0] | [0.5, 1.0] | Avoids aggressive scraping |
| proxyConfiguration | object | ⚙️ Optional | {} | {"useApifyProxy": true} | Use residential proxies for scale |

Example Console setup: paste "wireless earbuds" into queries and press Run Actor.


📄 Outputs (Dataset / KV examples)

Each dataset item is a JSON object with attributes similar to:

{
  "asin": "B09EXAMPLE",
  "title": "Wireless Earbuds XYZ",
  "brand_name": "BrandCo",
  "url": "https://www.amazon.com/...",
  "price": "$59.99",
  "currency": "$",
  "thumbnail": "https://...jpg",
  "images": ["https://...jpg", "https://...jpg"],
  "stars": 4.3,
  "review_count": "1.2k",
  "availability": "In Stock",
  "description": "Key bullet points...",
  "categories": "Electronics > Audio",
  "search_keyword": "wireless earbuds"
}

Note: The actor pushes results to the default dataset and also returns the full result array in the key-value store under the standard OUTPUT key.
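If you consume the aggregated OUTPUT record programmatically, a helper along these lines can fetch it after a run. A sketch only: `fetch_output_items` is a hypothetical name, and `client` is assumed to be an apify-client `ApifyClient` instance.

```python
def fetch_output_items(client, actor_id):
    """Return the result array stored under the OUTPUT key of the
    Actor's most recent run.

    `client` is an apify-client ApifyClient; `actor_id` is e.g.
    'your-user/amazon-search-scraper' (placeholder).
    """
    run = client.actor(actor_id).last_run().get()
    store = client.key_value_store(run['defaultKeyValueStoreId'])
    record = store.get_record('OUTPUT')
    return record['value'] if record else []

# Usage (requires apify-client):
# from apify_client import ApifyClient
# items = fetch_output_items(ApifyClient('<APIFY_TOKEN>'), 'your-user/amazon-search-scraper')
```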


🔑 Environment Variables

  • <APIFY_TOKEN> — required to call the Actor programmatically via API.
  • <PROXY_USER:PASS@HOST:PORT> — placeholder for custom proxy credentials.

Security note: Store secrets in Apify Console Secrets — do not paste them into input fields.


▶️ How to Run

Console

  1. Go to the Actor page in Apify Console.
  2. Paste your queries (single string or array) into the Input field.
  3. (Optional) Configure proxies under the Proxy Configuration editor.
  4. Click Run.

CLI

$ apify call your-user/amazon-search-scraper --input-file=input.example.json

API (Python)

See Quick Start (above) — use client.actor(...).call(run_input=...) and read the returned run ID.


⏰ Scheduling & Webhooks

  • Use Apify Console scheduling to run this Actor at any interval (hourly, daily, weekly).
  • Configure webhooks on run completion to forward JSON output to your endpoint for real-time processing.
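On the receiving side, a run-completion webhook can be reduced to the dataset ID you need to fetch. A minimal sketch, assuming Apify's documented payload fields (eventType, resource); verify the exact shape against your own webhook deliveries:

```python
import json

def handle_run_webhook(body):
    """Extract the default dataset ID from an Apify run-completion
    webhook body (bytes or str); returns None for non-success events."""
    payload = json.loads(body)
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    return payload.get("resource", {}).get("defaultDatasetId")
```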

πŸ•ΎοΈ Logs & Troubleshooting

  • Check the Console logs for step-by-step run info and any per-item warnings.

  • Common issues:

    • No results: verify that queries are valid and spelled correctly.
    • Request timeouts: increase requestDelay or enable Proxy Configuration.
    • Selector changes on Amazon: refresh the run and adjust queries β€” the actor is resilient but web UIs change frequently.
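For intermittent timeouts when driving the Actor via API, client-side retries with exponential backoff are a common complement to larger requestDelay values. An illustrative sketch (not part of the Actor itself):

```python
import random
import time

def with_backoff(fetch, max_attempts=4, base=1.0):
    """Retry a flaky zero-argument callable with exponential backoff
    plus jitter; re-raises the last error when attempts run out."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base * (2 ** attempt) + random.uniform(0, 0.25))
```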

🔒 Permissions & Storage Notes

  • This Actor collects publicly visible product information only. It does not perform account actions.
  • Results are stored in Apify Datasets/Key-Value stores in your account and follow Apify's standard retention and access controls.

🔟 Changelog / Versioning

  • v0.1.0 — Initial public release: search + product-page scraping, structured dataset output.

🖌 Notes / TODOs

  • TODO: Consider adding a CLI flag / input for limiting the number of product pages per query (reason: some queries return many results).
  • TODO: Add optional CSV export in output settings (reason: convenient for BI ingestion).

🌍 Proxy Configuration

If you will run many searches or large-scale scraping, configure Apify Proxy or custom proxies.

Enable Apify Proxy (Console):

  • In the Actor run form, open Proxy configuration and enable Use Apify Proxy (choose RESIDENTIAL for best results).

Custom proxy example (as secret):

  • Use <PROXY_USER:PASS@HOST:PORT> format and store as a Console Secret. Reference it in the Proxy Configuration editor.
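Assuming the standard Apify proxyConfiguration input shape, the residential setup above maps onto the JSON input roughly like this (field names follow Apify's proxy documentation; verify them against this Actor's input schema):

```json
{
  "queries": ["wireless earbuds"],
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```

For custom proxies, the same object typically accepts a proxyUrls array of proxy URLs instead of useApifyProxy.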

Environment variables (examples)

HTTP_PROXY=<PROXY_USER:PASS@HOST:PORT>
HTTPS_PROXY=<PROXY_USER:PASS@HOST:PORT>

Reminder: Store proxy credentials in Secrets and do not paste them into public inputs.

TODO: Consider proxy rotation for large-scale scraping.


📚 References

  1. Apify Actor README guidelines — https://docs.apify.com/console/actors/README
  2. Apify Input/Output schemas — https://docs.apify.com/platform/input-output
  3. Apify CLI & API usage — https://docs.apify.com/console/actors/run

🤔 What I inferred from main.py

  • The Actor accepts queries (keywords or Amazon URLs) and uses an automated browser to fetch search results and product pages.
  • It extracts product metadata, images, prices, ratings, and availability and returns structured JSON items.
  • It respects throttling delays and can be configured to use proxies for scale.
  • Results are pushed to the default dataset and the key-value store under OUTPUT.


input.example.json

{
  "queries": [
    "wireless earbuds",
    "gaming mouse"
  ],
  "headless": true,
  "requestDelay": [1.0, 2.0]
}

CONFIG.md (optional)

Quick config notes

  • Secrets: Add <APIFY_TOKEN> and any proxy credentials to Console Secrets.
  • Scaling: For repeated large runs, enable Apify Proxy (RESIDENTIAL) and consider running with scheduling + webhooks to automate downstream processing.

Suggested settings in Console

  • Proxy configuration: use Apify Proxy → RESIDENTIAL
  • Dataset retention: enable automatic export to your storage of choice

Generated by: Neuro Scraper