Amazon Scraper — ASINs, Prices, Rankings & Sponsored Flags
Pricing
from $20.00 / 1,000 results
Extract Amazon search results across 12 marketplaces (US, UK, IN, DE, JP, AU, CA, FR, IT, ES, AE, SA): ASINs, titles, prices, sponsored flags, and search rank positions. Up to 10,000 products per run with auto-pagination. Export as CSV, JSON, or Excel. No SP-API or affiliate credentials needed.
Developer
Scrapeify
Amazon Search Scraper — Extract ASINs, Prices, Rankings & Sponsored Flags at Scale
Extract structured product data from Amazon search results across the supported Amazon marketplaces. The Scrapeify Amazon Scraper retrieves ASINs, titles, prices, product URLs, image URLs, sponsored flags, page numbers, and search positions for any keyword, with no Amazon SP-API credentials required. It supports automatic pagination up to 10,000 products per run, multi-region storefronts, and proxy-aware HTTP with browser-grade TLS fingerprinting.
Built for price intelligence, catalog expansion, competitive analysis, and LLM-ready product data feeds.
Features
| Capability | Detail |
|---|---|
| Multi-marketplace | US, UK, IN, DE, JP, AU, CA, FR, IT, ES, AE, SA — resolved from short country codes |
| Scale | Up to 10,000 unique ASINs per run with automatic pagination |
| Sponsored detection | Heuristic flag from SERP card structure (isSponsored) |
| Stable ordering | position reflects search rank after deduplication |
| Browser-grade TLS | impit library when available; aiohttp as fallback |
| Proxy support | Configure PROXY_URL environment variable for residential routing |
| Advanced hooks | cookieHeader, callQueryApi (tri-state), queryExtraParams, wIndexMainSlot, city note |
| Chunked writes | Dataset pushes in configurable batches for large-run reliability |
| Multiple exports | Dataset (primary), OUTPUT summary, RESULTS_JSON, RESULTS_CSV in KV store |
| Input flexibility | Aliases: k for keyword; numberOfResults / resultsRequired for maxResults; country / region for marketplace |
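The alias handling in the last table row can be pictured with a small sketch. The alias names come from the table; the `normalize_input` helper and its "first alias present wins" precedence are assumptions for illustration, not the actor's own code.

```python
# Hypothetical helper: alias names are taken from the feature table,
# the precedence rule (first name listed wins) is an assumption.
ALIASES = {
    "keyword": ("keyword", "k"),
    "maxResults": ("maxResults", "numberOfResults", "resultsRequired"),
    "marketplace": ("marketplace", "country", "region"),
}


def normalize_input(raw: dict) -> dict:
    """Collapse alias keys into the canonical input field names."""
    normalized = {}
    for canonical, names in ALIASES.items():
        for name in names:
            if raw.get(name) not in (None, ""):
                normalized[canonical] = raw[name]
                break  # stop at the first alias that carries a value
    return normalized


print(normalize_input({"k": "wireless earbuds", "country": "US", "resultsRequired": 100}))
```

Normalizing on the caller's side keeps run inputs uniform in logs and audit tables, whichever alias an upstream system happens to send.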
Use Cases
Price Intelligence & Market Monitoring
Track ASIN prices and search positions daily across marketplaces. Detect competitor price drops, sponsored share shifts, and new entrants entering top-20 slots. Combine with scheduling to build price history tables without managing Amazon API quotas.
E-Commerce Catalog Expansion
Identify high-ranking ASINs for target categories and keywords. Build sourcing lists, gap analyses, and assortment studies across international storefronts by running parallel jobs per country code.
Competitive Intelligence
Map organic vs. sponsored share for brand and category queries. Track which sellers consistently occupy sponsored slots and which organic rankings shift after promotions. Exports feed directly into BI dashboards for weekly competitive reviews.
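As a rough illustration of the organic vs. sponsored split, the sketch below computes the sponsored fraction over exported Dataset rows. The `sponsored_share` function and the sample rows are made up for the example; only the `isSponsored` and `position` field names come from the output schema.

```python
def sponsored_share(rows: list) -> float:
    """Fraction of rows flagged isSponsored (0.0 for an empty list)."""
    if not rows:
        return 0.0
    sponsored = sum(1 for row in rows if row.get("isSponsored"))
    return sponsored / len(rows)


# Made-up rows shaped like Dataset items
rows = [
    {"asin": "B0AAAAAAA1", "position": 1, "isSponsored": True},
    {"asin": "B0AAAAAAA2", "position": 2, "isSponsored": False},
    {"asin": "B0AAAAAAA3", "position": 3, "isSponsored": True},
    {"asin": "B0AAAAAAA4", "position": 4, "isSponsored": False},
]

top_two = [row for row in rows if row["position"] <= 2]
print(f"all rows: {sponsored_share(rows):.0%}, top 2: {sponsored_share(top_two):.0%}")
```

Since `isSponsored` is a heuristic flag, the resulting ratio is best read as an estimate.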
Lead & Vendor Discovery
Extract product listings, sellers, and brand patterns from niche keyword sweeps. Build procurement leads or outreach targets by scraping supplier categories at scale.
AI & LLM Pipelines
Feed structured ASIN + title + price rows into LLM workflows for product comparison, recommendation generation, and catalog enrichment. Structured JSON eliminates brittle HTML parsing inside agent loops.
RAG & Semantic Search Systems
Index titles and ASINs in vector databases. Attach productUrl and imageUrl as metadata for citation and visual grounding. Scrape detail pages separately for bullet-point content retrieval.
Automation Workflows
Schedule Apify runs on a cron → push results to a data warehouse → trigger price-move alerts via webhooks. The structured OUTPUT summary with fulfilledCompletely and stoppedReason supports idempotent ETL pipelines.
Market Research
Study assortment breadth, brand presence, and keyword density across countries by keyword. Run the same query across US, UK, DE, JP, and IN simultaneously with parallel actor instances.
Why Choose This Actor
- No API keys: extracts public search page data without Amazon SP-API or affiliate credentials
- Marketplace flexibility: a single actor covers all major Amazon storefronts via country code resolution
- Production-grade outputs: fulfilledCompletely, pagesFetched, and stoppedReason flags for operational monitoring
- Chunked Dataset writes: handles 10k-row jobs without API timeout or memory overflow
- Schema consistency: identical column structure across all marketplaces enables multi-region joins
- Cloud-native: runs on Apify's serverless infrastructure with standard storage, logging, and API access
Quick Start
- Open the Scrapeify Amazon Scraper on Apify Console.
- Enter a keyword (e.g. wireless earbuds), select a marketplace (e.g. US), and set maxResults (e.g. 100).
- Click Start and wait for the run to complete.
- Export the Dataset as JSON or CSV from the run page.
- For large-scale runs, configure PROXY_URL in the Actor environment for residential routing.
Tip: Always pass marketplace explicitly. Omitting it defaults to IN (Amazon India), which may not match your target region.
Input Schema
    {
      "keyword": "wireless earbuds",
      "marketplace": "US",
      "maxResults": 100
    }
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| keyword | string | Yes | none | Search query. Alias: k. Supports long-tail phrases, brand names, categories. Max 2048 chars. |
| marketplace | string | Recommended | IN | Country code: US, UK, IN, DE, JP, AU, CA, FR, IT, ES, AE, SA. Aliases: country, region. |
| maxResults | integer | Yes | 25 | Unique ASINs to collect (1–10,000). Aliases: numberOfResults, resultsRequired. |
Advanced inputs (via environment / hook parameters): PROXY_URL, cookieHeader, callQueryApi, queryExtraParams, wIndexMainSlot, city.
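The constraints in the input table can be checked before a run is started. The following is a minimal pre-flight sketch; `validate_input` and its messages are hypothetical, and only the limits (2048 characters, 1 to 10,000 results, the twelve country codes, the IN default) come from the table above.

```python
SUPPORTED_MARKETPLACES = {
    "US", "UK", "IN", "DE", "JP", "AU", "CA", "FR", "IT", "ES", "AE", "SA",
}


def validate_input(run_input: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []

    keyword = run_input.get("keyword")
    if not isinstance(keyword, str) or not keyword.strip():
        errors.append("keyword must be a non-empty string")
    elif len(keyword) > 2048:
        errors.append("keyword exceeds 2048 characters")

    marketplace = run_input.get("marketplace")
    if marketplace is None:
        errors.append("marketplace not set; the documented default is IN")
    elif str(marketplace).upper() not in SUPPORTED_MARKETPLACES:
        errors.append(f"unsupported marketplace: {marketplace}")

    max_results = run_input.get("maxResults", 25)  # documented default
    if (
        isinstance(max_results, bool)
        or not isinstance(max_results, int)
        or not 1 <= max_results <= 10_000
    ):
        errors.append("maxResults must be an integer from 1 to 10,000")

    return errors


print(validate_input({"keyword": "wireless earbuds", "marketplace": "US", "maxResults": 100}))
print(validate_input({"keyword": "", "marketplace": "XX", "maxResults": 0}))
```

Treating a missing marketplace as a problem is a deliberate choice here: it surfaces the silent IN default before any results are billed.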
Output Schema
Dataset Row (one row per product)
    {
      "position": 1,
      "asin": "B0CXXXXXYZ",
      "title": "SoundCore Liberty 4 NC Wireless Earbuds - Active Noise Cancellation",
      "price": "$49.99",
      "productUrl": "https://www.amazon.com/dp/B0CXXXXXYZ",
      "imageUrl": "https://m.media-amazon.com/images/I/61XXXXXXX.jpg",
      "isSponsored": false,
      "page": 1,
      "searchKeyword": "wireless earbuds",
      "marketplace": "https://www.amazon.com",
      "marketplaceCode": "US"
    }
| Field | Type | Description |
|---|---|---|
| position | integer | Search rank (1-based, post-dedup) |
| asin | string | Amazon Standard Identification Number |
| title | string | Product title as shown in SERP card |
| price | string | Price with currency symbol (e.g. $49.99, £39.99) |
| productUrl | string | Direct Amazon product URL |
| imageUrl | string | SERP card image URL (CDN) |
| isSponsored | boolean | true if detected as sponsored placement |
| page | integer | SERP page number |
| searchKeyword | string | Input keyword echoed on every row |
| marketplace | string | Full storefront base URL |
| marketplaceCode | string | Short country code (e.g. US, UK, DE) |
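Because price arrives as a display string, numeric work needs a parsing step downstream. The sketch below is one best-effort approach; `parse_price` is a hypothetical helper, and its rule (a final separator followed by one or two digits is read as the decimal mark, otherwise as grouping) is a heuristic that should be checked against real rows from each marketplace.

```python
import re
from decimal import Decimal
from typing import Optional


def parse_price(price: Optional[str]) -> Optional[Decimal]:
    """Best-effort conversion of a display price to Decimal; None if no digits."""
    match = re.search(r"\d[\d.,\s]*", price or "")
    if not match:
        return None
    # Drop spaces (including no-break spaces) and trailing punctuation
    digits = re.sub(r"\s", "", match.group()).rstrip(".,")
    sep_pos = max(digits.rfind("."), digits.rfind(","))
    if sep_pos == -1:
        return Decimal(digits)
    whole = re.sub(r"[.,]", "", digits[:sep_pos])
    fraction = digits[sep_pos + 1:]
    if len(fraction) in (1, 2):
        # Heuristic: a short tail means the last separator is the decimal mark
        return Decimal(f"{whole}.{fraction}")
    # Otherwise treat the separator as thousands grouping
    return Decimal(whole + fraction)


for sample in ["$49.99", "£39.99", "₹1,299", "39,99 €", "1.299,00 €"]:
    print(sample, "->", parse_price(sample))
```

`Decimal` is used rather than `float` so that stored prices compare exactly across snapshots.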
Run Summary (OUTPUT key in default KV store)
    {
      "ok": true,
      "keyword": "wireless earbuds",
      "maxResults": 100,
      "returnedResults": 100,
      "fulfilledCompletely": true,
      "marketplace": "https://www.amazon.com",
      "marketplaceCode": "US",
      "location": {
        "note": "Search results follow the selected Amazon storefront, not a postal address.",
        "city": ""
      },
      "meta": {
        "pagesFetched": 5,
        "stoppedReason": "target_reached",
        "marketplaceCode": "US"
      },
      "scrapedAt": "2026-05-07T04:00:00.000Z",
      "savedTo": {
        "dataset": "Run page → Dataset tab → Export (JSON, CSV, Excel)",
        "keyValueStore": "Default KV store: OUTPUT (summary), RESULTS_JSON, RESULTS_CSV"
      },
      "results": null,
      "note": "Too many rows to embed in OUTPUT; use RESULTS_JSON / RESULTS_CSV or export Dataset."
    }
| Field | Type | Description |
|---|---|---|
| ok | boolean | true if products were returned |
| fulfilledCompletely | boolean | true if returnedResults >= maxResults |
| meta.pagesFetched | integer | SERP pages scraped |
| meta.stoppedReason | string | target_reached, exhausted, or error descriptor |
| results | array/null | Embedded when small (null for large runs; use KV or Dataset) |
Additional KV keys: RESULTS_CSV (full CSV string), RESULTS_JSON (full JSON array).
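For monitoring, the summary fields can be folded into a single status. This is a minimal sketch over a summary dict that has already been loaded from the OUTPUT key; the `classify_run` function and its four status labels are hypothetical, while the field names follow the summary shown above.

```python
def classify_run(summary: dict) -> str:
    """Map an OUTPUT summary to a coarse status label for pipeline checks."""
    meta = summary.get("meta") or {}
    if not summary.get("ok"):
        return "failed"
    if summary.get("fulfilledCompletely"):
        return "complete"
    if meta.get("stoppedReason") == "exhausted":
        # Amazon ran out of results before maxResults was reached
        return "partial_exhausted"
    return "partial_error"


summary = {
    "ok": True,
    "maxResults": 100,
    "returnedResults": 100,
    "fulfilledCompletely": True,
    "meta": {"pagesFetched": 5, "stoppedReason": "target_reached"},
}
print(classify_run(summary))
```

A scheduler could alert only on "failed" and "partial_error", since "partial_exhausted" is expected for niche keywords.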
API Examples
cURL
    curl "https://api.apify.com/v2/acts/scrapeify~amazon-scraper/runs?token=$APIFY_TOKEN" \
      -X POST \
      -H "Content-Type: application/json" \
      -d '{"keyword": "protein powder", "marketplace": "UK", "maxResults": 50}'
Python
    import os

    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])

    # Start the actor and block until the run finishes
    run = client.actor("scrapeify/amazon-scraper").call(
        run_input={
            "keyword": "protein powder",
            "marketplace": "UK",
            "maxResults": 50,
        }
    )

    # Stream rows from the run's default dataset
    for product in client.dataset(run["defaultDatasetId"]).iterate_items():
        print(product["asin"], product["price"], product["isSponsored"])
JavaScript / Node.js
    import { ApifyClient } from "apify-client";

    const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

    // Start the actor and wait for the run to finish
    const run = await client.actor("scrapeify/amazon-scraper").call({
      keyword: "protein powder",
      marketplace: "UK",
      maxResults: 50,
    });

    // Read rows from the run's default dataset
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(`Collected ${items.length} products`);
Integration Examples
ChatGPT / Custom GPT Actions
Expose the Apify REST endpoint as a Custom GPT action. Return the first page of Dataset items as JSON so the model can compare products, summarize pricing, or identify sponsored vs organic patterns in natural language.
Claude & Gemini Tool Use
Register an amazon_search tool that calls the actor via API. The structured JSON response — ASINs, prices, sponsored flags — provides grounded product context for recommendation tasks, comparison tables, and price analysis without hallucination.
LangChain
Wrap actor invocation as a LangChain tool. A reducer step can build comparison tables from the JSON array, trigger follow-up detail-page scrapes, or feed data into a vector store for semantic product search.
    import os

    from apify_client import ApifyClient
    from langchain.tools import tool

    # The client must exist before the tool is invoked
    client = ApifyClient(os.environ["APIFY_TOKEN"])

    @tool
    def amazon_search(keyword: str, marketplace: str = "US", max_results: int = 50) -> list:
        """Search Amazon and return structured product data."""
        run = client.actor("scrapeify/amazon-scraper").call(
            run_input={"keyword": keyword, "marketplace": marketplace, "maxResults": max_results}
        )
        return client.dataset(run["defaultDatasetId"]).list_items().items
CrewAI & AutoGen
Assign an AmazonResearchAgent that calls this tool on behalf of a product strategy crew. Downstream agents receive structured rows — no HTML parsing, no fragile selectors.
n8n / Make.com / Zapier
Use the HTTP module to trigger a run → poll for completion → iterate Dataset rows → push to Google Sheets, Airtable, or a CRM. The OUTPUT.fulfilledCompletely flag makes pipeline health checks trivial.
Vector Databases (Pinecone, Weaviate, Qdrant)
Embed title as the vector. Store asin, price, productUrl, isSponsored, and marketplaceCode as metadata. Supports semantic product retrieval and faceted filtering in RAG pipelines.
RAG Systems
Index Amazon titles and snippets as retrieval chunks. Attach product URLs as citation sources. Fetch detail pages in a second stage for bullet-point content when full text is needed.
Frequently Asked Questions
1. Do I need Amazon API credentials or an affiliate account?
No. The actor extracts data from Amazon's public search pages. No SP-API, MWS, or Associates credentials are required.
2. What marketplaces are supported?
US, UK, IN, DE, JP, AU, CA, FR, IT, ES, AE, and SA. Pass the country code in the marketplace field. Defaults to IN if omitted.
3. What is the maximum number of products per run?
10,000 unique ASINs per run (enforced by input validation).
4. How does pagination work?
The actor continues fetching SERP pages until it reaches maxResults unique ASINs or Amazon returns no further pages.
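The stopping rule described in this answer can be sketched as a loop over pages with ASIN deduplication. This is an illustration under assumptions, not the actor's source: `collect_unique`, the `fetch_page` callback, and the `max_pages` guard are hypothetical, and the fake pages stand in for parsed SERP cards.

```python
def collect_unique(fetch_page, max_results: int, max_pages: int = 400):
    """Fetch pages until max_results unique ASINs are found or pages run out."""
    seen = set()
    results = []
    pages_fetched = 0
    for page in range(1, max_pages + 1):
        cards = fetch_page(page)
        if not cards:  # no further pages available
            return results, pages_fetched, "exhausted"
        pages_fetched += 1
        for card in cards:
            if card["asin"] in seen:
                continue  # the same ASIN can reappear on later pages
            seen.add(card["asin"])
            # position is assigned after dedup, so it stays gap-free
            results.append({**card, "page": page, "position": len(results) + 1})
            if len(results) >= max_results:
                return results, pages_fetched, "target_reached"
    return results, pages_fetched, "exhausted"


# Fake pages: A2 repeats on page 2, page 3 is empty
FAKE_PAGES = {
    1: [{"asin": "A1"}, {"asin": "A2"}],
    2: [{"asin": "A2"}, {"asin": "A3"}],
}
rows, pages, reason = collect_unique(lambda p: FAKE_PAGES.get(p, []), max_results=3)
print([(r["asin"], r["position"]) for r in rows], pages, reason)
```

With `max_results=10` the same fake pages would end with `exhausted`, which is the case where a run returns fewer rows than requested.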
5. How accurate is the sponsored flag?
isSponsored is a heuristic derived from SERP card structure. Spot-check critical SKUs against the live page for high-stakes audits.
6. What does callQueryApi do?
It enables a mobile JSON path for certain marketplaces. It defaults to true for amazon.in but can be overridden.
7. Why might I get fewer results than maxResults?
Amazon may return fewer results than the cap for niche queries or specific marketplaces. fulfilledCompletely: false and stoppedReason: exhausted indicate this.
8. How do I handle CAPTCHA or empty HTML responses?
Use a residential proxy via PROXY_URL. Session cookies can be passed in cookieHeader when needed.
9. Is price returned as a number or string?
Price is a string (e.g. "$49.99") to preserve currency symbols and regional formatting. Parse to a float in your ETL if needed.
10. Can I scrape multiple marketplaces simultaneously?
Yes — run parallel Apify actor instances with different marketplace values. One marketplace per actor run.
11. Are Buy Box, shipping costs, or stock levels included?
No. The actor focuses on SERP card fields. Extend with a product-detail actor for logistics data.
12. How do I run this on a schedule?
Use Apify Schedules in the Console or cron-trigger via the Runs API. Combine with webhooks to notify downstream systems.
13. What does wIndexMainSlot control?
A numeric parameter for SERP slot indexing in experimental marketplace layouts. Only adjust if standard parsing fails for a specific storefront.
14. Why is RESULTS_JSON missing some rows?
Key-Value store item size limits apply. For large runs, export the Dataset directly (JSON, CSV, or Excel from the run page).
15. Can I filter by category or brand within a keyword?
Filtering happens downstream: export all results and apply brand or category filters in your ETL or BI tool.
16. Is the search order preserved in the Dataset?
Yes. position reflects deduped search order. Dataset items are written in scrape order.
17. How do I handle multi-region price comparisons?
Collect runs per marketplace in separate datasets. Join on asin in your warehouse for cross-regional price tables.
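A small in-memory version of that join might look like the sketch below. The `join_prices` helper and the sample rows are made up; in a warehouse the same idea would be a join on the asin column. ASINs that appear in only one marketplace drop out of the result.

```python
def join_prices(rows_by_market: dict) -> dict:
    """Return {asin: {marketplaceCode: price}} for ASINs seen in 2+ marketplaces."""
    table = {}
    for code, rows in rows_by_market.items():
        for row in rows:
            table.setdefault(row["asin"], {})[code] = row["price"]
    return {asin: prices for asin, prices in table.items() if len(prices) > 1}


# Made-up rows standing in for two separate dataset exports
exports = {
    "US": [{"asin": "B0AAAAAAA1", "price": "$49.99"}, {"asin": "B0AAAAAAA2", "price": "$19.99"}],
    "UK": [{"asin": "B0AAAAAAA1", "price": "£39.99"}],
}
print(join_prices(exports))
```

Prices stay as display strings here; currency conversion is a separate step.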
18. How current is the data?
Each run reflects a live snapshot of Amazon search results at the time of execution. Timestamps are in OUTPUT.scrapedAt.
19. What are the Amazon ToS implications?
You are responsible for ensuring your use case complies with Amazon's Terms of Service and applicable data regulations in your jurisdiction.
20. Does the actor handle rate limits automatically?
Yes. The actor includes backoff logic and proxy-aware request flow. Avoid launching many parallel duplicate-keyword runs without spacing.
21. Can I use the output for building an Amazon affiliate site?
Your Amazon Associates obligations are separate from this tool. Ensure compliance with Amazon's affiliate program terms independently.
22. What is the city note in the output?
An optional metadata note from input. Amazon search results follow storefront geography, not a postal address.
23. How should I set up idempotent ETL with this actor?
Key on asin + marketplaceCode + scrape timestamp. Log stoppedReason and pagesFetched in your audit table.
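One way to picture that key is an upsert into a store indexed by the three parts. The sketch uses a plain dict as a stand-in for a warehouse table; `row_key` and `upsert` are hypothetical names, and the timestamp is assumed to be the run's `OUTPUT.scrapedAt` value.

```python
def row_key(row: dict, scraped_at: str) -> tuple:
    """Composite key: one record per ASIN, marketplace, and scrape time."""
    return (row["asin"], row["marketplaceCode"], scraped_at)


def upsert(store: dict, rows: list, scraped_at: str) -> dict:
    """Insert or overwrite rows; replaying the same run adds nothing new."""
    for row in rows:
        store[row_key(row, scraped_at)] = row
    return store


rows = [
    {"asin": "B0AAAAAAA1", "marketplaceCode": "US", "price": "$49.99"},
    {"asin": "B0AAAAAAA2", "marketplaceCode": "US", "price": "$19.99"},
]
store = {}
upsert(store, rows, "2026-05-07T04:00:00.000Z")
upsert(store, rows, "2026-05-07T04:00:00.000Z")  # retry of the same load
print(len(store))
```

A retried load leaves the record count unchanged, while a later run with a new timestamp appends a fresh snapshot.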
24. Does it support Amazon Business (B2B) pages?
No. The actor targets standard consumer search pages.
25. What happens if the actor errors out mid-run?
Partial results already written to the Dataset are preserved. The OUTPUT key will have ok: false with an error message.
Best Practices
- Always specify marketplace: omitting it defaults to IN, which may not match your target region
- Residential proxies for production: datacenter IPs may see sparse or blocked results on high-competition queries
- Smoke test first: set maxResults: 25 to validate keyword + marketplace combinations before large runs
- Chunk downstream processing: stream Dataset pages rather than loading all 10,000 rows into application memory
- Use ethical session data: only pass cookieHeader values from sessions you own
- Schedule strategically: organic vs. sponsored share can shift intraday; treat snapshots as instantaneous
- Monitor parse error rates: Amazon card HTML can shift; alert on runs returning significantly fewer results than expected
- Version your ETL schema: update downstream transformers when new fields appear after actor updates
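The chunked-processing advice above can be implemented with a small batching generator. This is a generic sketch: `batched` is a local helper written for the example, and the batch size of 4 is arbitrary.

```python
from itertools import islice


def batched(iterable, size: int):
    """Yield lists of up to `size` items without materializing the whole input."""
    iterator = iter(iterable)
    while batch := list(islice(iterator, size)):
        yield batch


# Stand-in for a stream of Dataset rows
fake_rows = ({"position": i} for i in range(1, 11))
for batch in batched(fake_rows, 4):
    print(len(batch), "rows, positions", batch[0]["position"], "to", batch[-1]["position"])
```

With the Python client, the same helper could wrap `client.dataset(dataset_id).iterate_items()` so that each batch is written to the warehouse before the next one is read.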
Performance & Scalability
| Factor | Guidance |
|---|---|
| Latency | Pagination is linear in page count. Expect seconds to minutes depending on maxResults. |
| TLS efficiency | impit presents a browser-like TLS fingerprint, which tends to mean fewer blocked requests and retries than a default HTTP client. |
| Write reliability | Batched Dataset pushes (e.g. 1,000 rows per batch) prevent API timeouts on large inserts. |
| Horizontal scaling | Run parallel actors per marketplace or keyword shard. One run per marketplace per keyword. |
| Memory | Batched writes keep memory predictable at 10k-row scale. |
AI & Automation Workflows
Price alert pipeline: Schedule → scrape → compare to yesterday's snapshot → alert on >5% price move via webhook to Slack.
LLM product comparison: Pull 50 ASINs → pass structured rows to Claude/GPT → generate comparison table in natural language.
Competitor sponsored share tracking: Run weekly → compute isSponsored ratio per brand per keyword → chart in Metabase.
RAG product assistant: Embed titles → store in Pinecone → retrieve top-K products → ground LLM answers with live Amazon links.
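The comparison step of the price alert pipeline can be sketched as follows. It assumes prices have already been parsed to numbers and keyed by ASIN; `price_moves` and the sample snapshots are hypothetical, and the 5% threshold comes from the workflow description above.

```python
def price_moves(yesterday: dict, today: dict, threshold: float = 0.05) -> list:
    """Compare {asin: price} snapshots; return moves larger than the threshold."""
    moves = []
    for asin, new_price in today.items():
        old_price = yesterday.get(asin)
        if not old_price:
            continue  # new entrant, or no usable baseline price
        change = (new_price - old_price) / old_price
        if abs(change) > threshold:
            moves.append((asin, old_price, new_price, round(change, 4)))
    return moves


# Made-up snapshots for illustration
yesterday = {"B0AAAAAAA1": 100.0, "B0AAAAAAA2": 50.0}
today = {"B0AAAAAAA1": 94.0, "B0AAAAAAA2": 51.0, "B0AAAAAAA3": 10.0}
for asin, old, new, change in price_moves(yesterday, today):
    print(f"{asin}: {old} -> {new} ({change:+.1%})")
```

Each returned tuple can be turned into one webhook payload; ASINs missing from yesterday's snapshot are skipped rather than reported as moves.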
Error Handling
| Scenario | Behavior |
|---|---|
| Invalid maxResults (e.g. 0 or > 10,000) | Validation error pushed to Dataset; OUTPUT.ok: false |
| CAPTCHA or empty HTML | Error item in Dataset with proxy/cookie suggestion |
| No results for keyword | OUTPUT.ok: false, stoppedReason: exhausted |
| Large run exceeds KV size | RESULTS_JSON / RESULTS_CSV may truncate; use Dataset export |
| Proxy failure | Logged; fallback to direct egress if configured |
Trust & Reliability
Scrapeify builds and maintains this actor for production price intelligence and catalog operations teams. The architecture targets stable SERP extraction with:
- Defensive HTML parsing with marketplace-specific resolution helpers
- Explicit operational metrics (pagesFetched, stoppedReason, fulfilledCompletely) for SLA monitoring
- Multiple export paths (Dataset, RESULTS_JSON, RESULTS_CSV) for BI and engineering handoff
- Clear storage documentation in run OUTPUT for autonomous pipeline operation
Related Scrapeify Actors
Explore the full Scrapeify suite — chain these actors together for end-to-end automation pipelines:
| Actor | What it does |
|---|---|
| Instagram Ad Library Scraper | Instagram-only ads from Meta Ad Library |
| Meta Ad Library Scraper | Facebook & Instagram ads with sort options |
| WhatsApp Ad Scraper | Click-to-WhatsApp ad creatives |
| YouTube Video Downloader | Videos & audio to Apify Key-Value Store |
| Meta Brand & Page ID Finder | Resolve brand names to numeric Page IDs |
| Google Maps Scraper | Local business leads, reviews, emails, contacts |
| Google News Scraper | Headlines, sources, article URLs (up to 2K) |
Amazon is a trademark of Amazon.com, Inc. This actor is not affiliated with or endorsed by Amazon.