# Amazon Scraper — ASINs, Prices, Rankings & Sponsored Flags (`scrapeify/amazon-scraper`) Actor

Extract Amazon search results across 23 marketplaces (US, UK, IN, DE, JP, AU, CA, FR, IT, ES, AE, SA): ASINs, titles, prices, sponsored flags, and search rank positions. Up to 10K products per run with auto-pagination. Export CSV/JSON/Excel. No SP-API or affiliate credentials needed.

- **URL**: https://apify.com/scrapeify/amazon-scraper.md
- **Developed by:** [Scrapeify](https://apify.com/scrapeify) (community)
- **Categories:** Lead generation, Agents, AI
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded, 1 bookmark
- **User rating**: No ratings yet

## Pricing

from $20.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage, only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Amazon Search Scraper — Extract ASINs, Prices, Rankings & Sponsored Flags at Scale

Extract structured product data from **Amazon search results** across any global marketplace. The Scrapeify Amazon Scraper retrieves **ASINs, titles, prices, product URLs, image URLs, sponsored flags, page numbers, and search positions** for any keyword — no Amazon SP-API credentials required. Supports automatic pagination up to **10,000 products per run**, multi-region storefronts, and proxy-aware HTTP with browser-grade TLS fingerprinting.

Built for price intelligence, catalog expansion, competitive analysis, and LLM-ready product data feeds.

---

### Features

| Capability | Detail |
|---|---|
| **Multi-marketplace** | US, UK, IN, DE, JP, AU, CA, FR, IT, ES, AE, SA — resolved from short country codes |
| **Scale** | Up to 10,000 unique ASINs per run with automatic pagination |
| **Sponsored detection** | Heuristic flag from SERP card structure (`isSponsored`) |
| **Stable ordering** | `position` reflects search rank after deduplication |
| **Browser-grade TLS** | `impit` library when available; `aiohttp` as fallback |
| **Proxy support** | Configure `PROXY_URL` environment variable for residential routing |
| **Advanced hooks** | `cookieHeader`, `callQueryApi` (tri-state), `queryExtraParams`, `wIndexMainSlot`, `city` note |
| **Chunked writes** | Dataset pushes in configurable batches for large-run reliability |
| **Multiple exports** | Dataset (primary), `OUTPUT` summary, `RESULTS_JSON`, `RESULTS_CSV` in KV store |
| **Input flexibility** | Aliases: `k` for keyword; `numberOfResults` / `resultsRequired` for maxResults; `country` / `region` for marketplace |

---

### Use Cases

#### Price Intelligence & Market Monitoring
Track ASIN prices and search positions daily across marketplaces. Detect competitor price drops, shifts in sponsored share, and new entrants in the top-20 slots. Combine with scheduling to build price history tables without managing Amazon API quotas.

#### E-Commerce Catalog Expansion
Identify high-ranking ASINs for target categories and keywords. Build sourcing lists, gap analyses, and assortment studies across international storefronts by running parallel jobs per country code.

#### Competitive Intelligence
Map organic vs. sponsored share for brand and category queries. Track which sellers consistently occupy sponsored slots and which organic rankings shift after promotions. Exports feed directly into BI dashboards for weekly competitive reviews.
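
As a rough illustration of the organic vs. sponsored split described above, the sketch below computes the sponsored share from exported Dataset rows. It relies only on the `isSponsored` and `position` fields documented in the Output Schema; the function name `sponsored_share`, the `top_n` parameter, and the sample rows are hypothetical and not part of the Actor.

```python
from typing import Iterable, Mapping, Optional


def sponsored_share(rows: Iterable[Mapping], top_n: Optional[int] = None) -> float:
    """Return the fraction of rows flagged `isSponsored`.

    If `top_n` is given, only rows with `position` <= top_n are counted.
    """
    selected = [
        row for row in rows
        if top_n is None or row.get("position", 0) <= top_n
    ]
    if not selected:
        return 0.0
    sponsored = sum(1 for row in selected if row.get("isSponsored"))
    return sponsored / len(selected)


# Made-up sample rows, shaped like the Dataset rows in the Output Schema.
rows = [
    {"position": 1, "asin": "A1", "isSponsored": True},
    {"position": 2, "asin": "A2", "isSponsored": False},
    {"position": 3, "asin": "A3", "isSponsored": False},
    {"position": 4, "asin": "A4", "isSponsored": True},
]
print(sponsored_share(rows))           # share across all rows
print(sponsored_share(rows, top_n=2))  # share within the top two positions
```

Because `isSponsored` is a heuristic flag, treat the resulting ratio as an estimate rather than an exact count.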

#### Lead & Vendor Discovery
Extract product listings, sellers, and brand patterns from niche keyword sweeps. Build procurement leads or outreach targets by scraping supplier categories at scale.

#### AI & LLM Pipelines
Feed structured ASIN + title + price rows into LLM workflows for product comparison, recommendation generation, and catalog enrichment. Structured JSON eliminates brittle HTML parsing inside agent loops.

#### RAG & Semantic Search Systems
Index titles and ASINs in vector databases. Attach `productUrl` and `imageUrl` as metadata for citation and visual grounding. Scrape detail pages separately for bullet-point content retrieval.

#### Automation Workflows
Schedule Apify runs on a cron → push results to a data warehouse → trigger price-move alerts via webhooks. The structured `OUTPUT` summary with `fulfilledCompletely` and `stoppedReason` supports idempotent ETL pipelines.

#### Market Research
Study assortment breadth, brand presence, and keyword density across countries by keyword. Run the same query across US, UK, DE, JP, and IN simultaneously with parallel actor instances.

---

### Why Choose This Actor

- **No API keys** — extracts public search page data without Amazon SP-API or affiliate credentials
- **Marketplace flexibility** — single actor covers all major Amazon storefronts via country code resolution
- **Production-grade outputs** — `fulfilledCompletely`, `pagesFetched`, and `stoppedReason` flags for operational monitoring
- **Chunked Dataset writes** — handles 10k-row jobs without API timeout or memory overflow
- **Schema consistency** — identical column structure across all marketplaces enables multi-region joins
- **Cloud-native** — runs on Apify's serverless infrastructure with standard storage, logging, and API access

---

### Quick Start

1. Open the Scrapeify **Amazon Scraper** on Apify Console.
2. Enter a **`keyword`** (e.g. `wireless earbuds`), select a **`marketplace`** (e.g. `US`), and set **`maxResults`** (e.g. `100`).
3. Click **Start** and wait for the run to complete.
4. Export the **Dataset** as JSON or CSV from the run page.
5. For large-scale runs, configure `PROXY_URL` in the Actor environment for residential routing.

> **Tip:** Always pass `marketplace` explicitly. Omitting it defaults to `IN` (Amazon India), which may not match your target region.

---

### Input Schema

```json
{
  "keyword": "wireless earbuds",
  "marketplace": "US",
  "maxResults": 100
}
````

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `keyword` | string | Yes | — | Search query. Alias: `k`. Supports long-tail phrases, brand names, categories. Max 2048 chars. |
| `marketplace` | string | Recommended | `IN` | Country code: `US`, `UK`, `IN`, `DE`, `JP`, `AU`, `CA`, `FR`, `IT`, `ES`, `AE`, `SA`. Aliases: `country`, `region`. |
| `maxResults` | integer | Yes | 25 | Unique ASINs to collect (1–10,000). Aliases: `numberOfResults`, `resultsRequired`. |

**Advanced inputs** (via environment / hook parameters): `PROXY_URL`, `cookieHeader`, `callQueryApi`, `queryExtraParams`, `wIndexMainSlot`, `city`.
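
The aliases and limits in the table can be checked client-side before a run is started. The following is a minimal sketch under stated assumptions: `build_input` and `_first` are hypothetical helpers, not the Actor's own validation, and rejecting a missing `marketplace` is a deliberate choice here to avoid the silent `IN` default.

```python
from typing import Any, Dict, Optional

# Country codes listed in the Input Schema table.
SUPPORTED = {"US", "UK", "IN", "DE", "JP", "AU", "CA", "FR", "IT", "ES", "AE", "SA"}


def _first(raw: Dict[str, Any], *names: str) -> Optional[Any]:
    """Return the first value present under any of the given names."""
    for name in names:
        if raw.get(name) is not None:
            return raw[name]
    return None


def build_input(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Resolve documented aliases and apply the documented limits."""
    keyword = _first(raw, "keyword", "k")
    if not isinstance(keyword, str) or not keyword.strip():
        raise ValueError("keyword is required")
    if len(keyword) > 2048:
        raise ValueError("keyword exceeds 2048 characters")

    marketplace = _first(raw, "marketplace", "country", "region")
    if marketplace is None:
        # Assumption of this sketch: fail early instead of defaulting to IN.
        raise ValueError("marketplace must be passed explicitly")
    marketplace = str(marketplace).upper()
    if marketplace not in SUPPORTED:
        raise ValueError(f"unsupported marketplace: {marketplace}")

    max_results = _first(raw, "maxResults", "numberOfResults", "resultsRequired")
    max_results = 25 if max_results is None else int(max_results)
    if not 1 <= max_results <= 10_000:
        raise ValueError("maxResults must be between 1 and 10,000")

    return {"keyword": keyword.strip(), "marketplace": marketplace, "maxResults": max_results}


print(build_input({"k": "wireless earbuds", "country": "us", "numberOfResults": 100}))
```

The returned dictionary uses only the canonical field names, so it can be passed as `run_input` unchanged.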

***

### Output Schema

#### Dataset Row (one row per product)

```json
{
  "position": 1,
  "asin": "B0CXXXXXYZ",
  "title": "SoundCore Liberty 4 NC Wireless Earbuds — Active Noise Cancellation",
  "price": "$49.99",
  "productUrl": "https://www.amazon.com/dp/B0CXXXXXYZ",
  "imageUrl": "https://m.media-amazon.com/images/I/61XXXXXXX.jpg",
  "isSponsored": false,
  "page": 1,
  "searchKeyword": "wireless earbuds",
  "marketplace": "https://www.amazon.com",
  "marketplaceCode": "US"
}
```

| Field | Type | Description |
|---|---|---|
| `position` | integer | Search rank (1-based, post-dedup) |
| `asin` | string | Amazon Standard Identification Number |
| `title` | string | Product title as shown in SERP card |
| `price` | string | Price with currency symbol (e.g. `$49.99`, `£39.99`) |
| `productUrl` | string | Direct Amazon product URL |
| `imageUrl` | string | SERP card image URL (CDN) |
| `isSponsored` | boolean | `true` if detected as sponsored placement |
| `page` | integer | SERP page number |
| `searchKeyword` | string | Input keyword echoed on every row |
| `marketplace` | string | Full storefront base URL |
| `marketplaceCode` | string | Short country code (e.g. `US`, `UK`, `DE`) |
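
Since `price` arrives as a display string, downstream code usually needs a numeric value. The sketch below is one possible best-effort parser; `parse_price` is a hypothetical helper, and its separator rules (for example, treating `1.299` as a thousands-grouped integer) are assumptions that may not fit every storefront's formatting.

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_price(raw: Optional[str]) -> Optional[Decimal]:
    """Best-effort conversion of a display price such as "$49.99" to Decimal."""
    # Drop currency symbols, letters and spaces; keep digits and separators.
    cleaned = re.sub(r"[^\d.,]", "", raw or "").strip(".,")
    if not cleaned:
        return None
    last_dot, last_comma = cleaned.rfind("."), cleaned.rfind(",")
    if last_dot != -1 and last_comma != -1:
        # Both present: the right-most one is taken as the decimal separator.
        if last_comma > last_dot:
            cleaned = cleaned.replace(".", "").replace(",", ".")
        else:
            cleaned = cleaned.replace(",", "")
    elif last_comma != -1:
        # Only commas: a single comma followed by two digits is a decimal comma.
        if cleaned.count(",") == 1 and len(cleaned) - last_comma - 1 == 2:
            cleaned = cleaned.replace(",", ".")
        else:
            cleaned = cleaned.replace(",", "")
    elif cleaned.count(".") > 1 or (last_dot != -1 and len(cleaned) - last_dot - 1 == 3):
        # Only dots used as thousands separators, e.g. "1.299".
        cleaned = cleaned.replace(".", "")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


for sample in ["$49.99", "£39.99", "₹1,299.00", "1.299,00 €", "¥1,299", ""]:
    print(repr(sample), "->", parse_price(sample))
```

`Decimal` is used instead of `float` so that prices compare exactly; keep the original string alongside the parsed value if the currency symbol matters downstream.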

#### Run Summary (`OUTPUT` key in default KV store)

```json
{
  "ok": true,
  "keyword": "wireless earbuds",
  "maxResults": 100,
  "returnedResults": 100,
  "fulfilledCompletely": true,
  "marketplace": "https://www.amazon.com",
  "marketplaceCode": "US",
  "location": {
    "note": "Search results follow the selected Amazon storefront, not a postal address.",
    "city": ""
  },
  "meta": {
    "pagesFetched": 5,
    "stoppedReason": "target_reached",
    "marketplaceCode": "US"
  },
  "scrapedAt": "2026-05-07T04:00:00.000Z",
  "savedTo": {
    "dataset": "Run page → Dataset tab → Export (JSON, CSV, Excel)",
    "keyValueStore": "Default KV store: OUTPUT (summary), RESULTS_JSON, RESULTS_CSV"
  },
  "results": null,
  "note": "Too many rows to embed in OUTPUT; use RESULTS_JSON / RESULTS_CSV or export Dataset."
}
```

| Field | Type | Description |
|---|---|---|
| `ok` | boolean | `true` if products were returned |
| `fulfilledCompletely` | boolean | `true` if `returnedResults >= maxResults` |
| `meta.pagesFetched` | integer | SERP pages scraped |
| `meta.stoppedReason` | string | `target_reached`, `exhausted`, or error descriptor |
| `results` | array/null | Embedded when small (`null` for large runs — use KV or Dataset) |

**Additional KV keys:** `RESULTS_CSV` (full CSV string), `RESULTS_JSON` (full JSON array).
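
The summary fields above can drive a simple health check in a pipeline. The sketch below classifies an `OUTPUT` summary using only `ok`, `fulfilledCompletely`, and `meta.stoppedReason`; the function `classify_run` and the status labels it returns are hypothetical names chosen for illustration.

```python
from typing import Mapping


def classify_run(summary: Mapping) -> str:
    """Map an OUTPUT summary to a coarse pipeline status."""
    if not summary.get("ok"):
        return "failed"
    if summary.get("fulfilledCompletely"):
        return "complete"
    reason = (summary.get("meta") or {}).get("stoppedReason")
    if reason == "exhausted":
        # Amazon ran out of results before maxResults was reached.
        return "partial"
    return "needs_review"


summary = {
    "ok": True,
    "maxResults": 100,
    "returnedResults": 62,
    "fulfilledCompletely": False,
    "meta": {"pagesFetched": 4, "stoppedReason": "exhausted"},
}
print(classify_run(summary))
```

With the Python client, the summary can typically be read from the run's default key-value store via `client.key_value_store(run["defaultKeyValueStoreId"]).get_record("OUTPUT")`, whose result carries the stored object under `"value"`; check this against the client version you use.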

***

### API Examples

#### cURL

```bash
curl "https://api.apify.com/v2/acts/scrapeify~amazon-scraper/runs?token=$APIFY_TOKEN" \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "keyword": "protein powder",
    "marketplace": "UK",
    "maxResults": 50
  }'
```

#### Python

```python
import os
from apify_client import ApifyClient

client = ApifyClient(os.environ["APIFY_TOKEN"])

run = client.actor("scrapeify/amazon-scraper").call(
    run_input={
        "keyword": "protein powder",
        "marketplace": "UK",
        "maxResults": 50,
    }
)

for product in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(product["asin"], product["price"], product["isSponsored"])
```

#### JavaScript / Node.js

```javascript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

const run = await client.actor("scrapeify/amazon-scraper").call({
  keyword: "protein powder",
  marketplace: "UK",
  maxResults: 50,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Collected ${items.length} products`);
```

***

### Integration Examples

#### ChatGPT / Custom GPT Actions

Expose the Apify REST endpoint as a Custom GPT action. Return the first page of Dataset items as JSON so the model can compare products, summarize pricing, or identify sponsored vs organic patterns in natural language.

#### Claude & Gemini Tool Use

Register an `amazon_search` tool that calls the actor via API. The structured JSON response — ASINs, prices, sponsored flags — provides grounded product context for recommendation tasks, comparison tables, and price analysis without hallucination.

#### LangChain

Wrap actor invocation as a LangChain tool. A reducer step can build comparison tables from the JSON array, trigger follow-up detail-page scrapes, or feed data into a vector store for semantic product search.

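```python
import os

from apify_client import ApifyClient
from langchain.tools import tool

# Shared client; reads the API token from the environment.
client = ApifyClient(os.environ["APIFY_TOKEN"])

@tool
def amazon_search(keyword: str, marketplace: str = "US", max_results: int = 50) -> list:
    """Search Amazon and return structured product data."""
    run = client.actor("scrapeify/amazon-scraper").call(
        run_input={"keyword": keyword, "marketplace": marketplace, "maxResults": max_results}
    )
    # Return the dataset rows as a plain list of dicts.
    return client.dataset(run["defaultDatasetId"]).list_items().items
```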

#### CrewAI & AutoGen

Assign an `AmazonResearchAgent` that calls this tool on behalf of a product strategy crew. Downstream agents receive structured rows — no HTML parsing, no fragile selectors.

#### n8n / Make.com / Zapier

Use the HTTP module to trigger a run → poll for completion → iterate Dataset rows → push to Google Sheets, Airtable, or a CRM. The `OUTPUT.fulfilledCompletely` flag makes pipeline health checks trivial.

#### Vector Databases (Pinecone, Weaviate, Qdrant)

Embed `title` as the vector. Store `asin`, `price`, `productUrl`, `isSponsored`, and `marketplaceCode` as metadata. Supports semantic product retrieval and faceted filtering in RAG pipelines.
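
The mapping described above might look like the following sketch. The record shape (`id`, `text`, `metadata`) and the helper `to_vector_record` are hypothetical and not tied to any particular vector database client; the embedding step itself is left out.

```python
from typing import Any, Dict, Mapping

METADATA_FIELDS = ("asin", "price", "productUrl", "isSponsored", "marketplaceCode")


def to_vector_record(row: Mapping[str, Any]) -> Dict[str, Any]:
    """Split a Dataset row into text to embed and metadata to store."""
    return {
        # Marketplace-qualified ID so the same ASIN can exist once per region.
        "id": f'{row["marketplaceCode"]}:{row["asin"]}',
        "text": row["title"],
        "metadata": {field: row.get(field) for field in METADATA_FIELDS},
    }


row = {
    "asin": "B0CXXXXXYZ",
    "title": "SoundCore Liberty 4 NC Wireless Earbuds",
    "price": "$49.99",
    "productUrl": "https://www.amazon.com/dp/B0CXXXXXYZ",
    "isSponsored": False,
    "marketplaceCode": "US",
}
print(to_vector_record(row))
```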

#### RAG Systems

Index Amazon titles and snippets as retrieval chunks. Attach product URLs as citation sources. Fetch detail pages in a second stage for bullet-point content when full text is needed.

***

### Frequently Asked Questions

**1. Do I need Amazon API credentials or an affiliate account?**
No. The actor extracts data from Amazon's public search pages. No SP-API, MWS, or Associates credentials are required.

**2. What marketplaces are supported?**
US, UK, IN, DE, JP, AU, CA, FR, IT, ES, AE, and SA. Pass the country code in the `marketplace` field. Defaults to `IN` if omitted.

**3. What is the maximum number of products per run?**
10,000 unique ASINs per run (enforced by input validation).

**4. How does pagination work?**
The actor continues fetching SERP pages until it reaches `maxResults` unique ASINs or Amazon returns no further pages.

**5. How accurate is the sponsored flag?**
`isSponsored` is a heuristic derived from SERP card structure. Spot-check critical SKUs against the live page for high-stakes audits.

**6. What does `callQueryApi` do?**
It enables a mobile JSON path for certain marketplaces. It defaults to `true` for `amazon.in` but can be overridden.

**7. Why might I get fewer results than `maxResults`?**
Amazon may return fewer results than the cap for niche queries or specific marketplaces. `fulfilledCompletely: false` and `stoppedReason: exhausted` indicate this.

**8. How do I handle CAPTCHA or empty HTML responses?**
Use a residential proxy via `PROXY_URL`. Session cookies can be passed in `cookieHeader` when needed.

**9. Is price returned as a number or string?**
Price is a **string** (e.g. `"$49.99"`) to preserve currency symbols and regional formatting. Parse to a float in your ETL if needed.

**10. Can I scrape multiple marketplaces simultaneously?**
Yes — run parallel Apify actor instances with different `marketplace` values. One marketplace per actor run.

**11. Are Buy Box, shipping costs, or stock levels included?**
No. The actor focuses on SERP card fields. Extend with a product-detail actor for logistics data.

**12. How do I run this on a schedule?**
Use Apify Schedules in the Console or cron-trigger via the Runs API. Combine with webhooks to notify downstream systems.

**13. What does `wIndexMainSlot` control?**
A numeric parameter for SERP slot indexing in experimental marketplace layouts. Only adjust if standard parsing fails for a specific storefront.

**14. Why is `RESULTS_JSON` missing some rows?**
Key-Value store item size limits apply. For large runs, export the Dataset directly (JSON, CSV, or Excel from the run page).

**15. Can I filter by category or brand within a keyword?**
Filtering happens downstream: export all results and apply brand/category filters in your ETL or BI tool.

**16. Is the search order preserved in the Dataset?**
Yes. `position` reflects deduped search order. Dataset items are written in scrape order.

**17. How do I handle multi-region price comparisons?**
Collect runs per marketplace in separate datasets. Join on `asin` in your warehouse for cross-regional price tables.

**18. How current is the data?**
Each run reflects a live snapshot of Amazon search results at the time of execution. Timestamps are in `OUTPUT.scrapedAt`.

**19. What are the Amazon ToS implications?**
You are responsible for ensuring your use case complies with Amazon's Terms of Service and applicable data regulations in your jurisdiction.

**20. Does the actor handle rate limits automatically?**
Yes — the actor includes backoff logic and proxy-aware request flow. Avoid launching many parallel duplicate-keyword runs without spacing.

**21. Can I use the output for building an Amazon affiliate site?**
Your Amazon Associates obligations are separate from this tool. Ensure compliance with Amazon's affiliate program terms independently.

**22. What is the `city` note in the output?**
An optional metadata note from input. Amazon search results follow storefront geography, not a postal address.

**23. How should I set up idempotent ETL with this actor?**
Key on `asin` + `marketplaceCode` + scrape timestamp. Log `stoppedReason` and `pagesFetched` in your audit table.

**24. Does it support Amazon Business (B2B) pages?**
No — targets standard consumer search pages.

**25. What happens if the actor errors out mid-run?**
Partial results already written to the Dataset are preserved. The `OUTPUT` key will have `ok: false` with an error message.
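
The answers on multi-region comparison and idempotent ETL (questions 17 and 23) can be sketched together. The helpers `etl_key` and `pivot_prices` below are hypothetical; they assume rows shaped like the Dataset rows in the Output Schema and a `scraped_at` value taken from `OUTPUT.scrapedAt`. The sample rows are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Tuple


def etl_key(row: Mapping, scraped_at: str) -> Tuple[str, str, str]:
    """Composite key: ASIN + marketplace code + scrape timestamp."""
    return (row["asin"], row["marketplaceCode"], scraped_at)


def pivot_prices(rows: Iterable[Mapping]) -> Dict[str, Dict[str, str]]:
    """Group price strings by ASIN, then by marketplace code."""
    table: Dict[str, Dict[str, str]] = defaultdict(dict)
    for row in rows:
        table[row["asin"]][row["marketplaceCode"]] = row["price"]
    return dict(table)


us_rows = [{"asin": "B0AAA", "marketplaceCode": "US", "price": "$49.99"}]
uk_rows = [
    {"asin": "B0AAA", "marketplaceCode": "UK", "price": "£39.99"},
    {"asin": "B0BBB", "marketplaceCode": "UK", "price": "£12.00"},
]

print(pivot_prices(us_rows + uk_rows))
print(etl_key(us_rows[0], "2026-05-07T04:00:00.000Z"))
```

Re-inserting a row with the same composite key can then be treated as an upsert, which keeps repeated loads of the same run from creating duplicates.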

***

### Best Practices

- **Always specify `marketplace`** — omitting it defaults to `IN`, which may not match your target region
- **Residential proxies for production** — datacenter IPs may see sparse or blocked results on high-competition queries
- **Smoke test first** — set `maxResults: 25` to validate keyword + marketplace combinations before large runs
- **Chunk downstream processing** — stream Dataset pages rather than loading all 10,000 rows into application memory
- **Use ethical session data** — only pass `cookieHeader` values from sessions you own
- **Schedule strategically** — organic vs. sponsored share can shift intraday; treat snapshots as instantaneous
- **Monitor parse error rates** — Amazon card HTML can shift; alert on runs returning significantly fewer results than expected
- **Version your ETL schema** — update downstream transformers when new fields appear after actor updates
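
To illustrate the advice on chunked downstream processing, here is a minimal paging sketch. `iter_pages` is a hypothetical helper that takes any `fetch_page(offset, limit)` callable; the in-memory `fake_fetch` below stands in for a real Dataset call so the example runs without network access.

```python
from typing import Callable, Iterator, List


def iter_pages(
    fetch_page: Callable[[int, int], List[dict]],
    page_size: int = 1000,
) -> Iterator[List[dict]]:
    """Yield Dataset rows one page at a time instead of loading them all."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            break
        yield page
        if len(page) < page_size:
            # A short page means the Dataset has been read to the end.
            break
        offset += len(page)


# Stand-in for a Dataset holding 2,500 rows.
data = [{"position": i + 1} for i in range(2500)]


def fake_fetch(offset: int, limit: int) -> List[dict]:
    return data[offset:offset + limit]


for page in iter_pages(fake_fetch, page_size=1000):
    print(len(page))
```

With the Python client, `fetch_page` could wrap `client.dataset(dataset_id).list_items(offset=offset, limit=limit).items`. The client's `iterate_items()`, used in the Python example above, is a simpler option when rows can be handled one at a time.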

***

### Performance & Scalability

| Factor | Guidance |
|---|---|
| **Latency** | Pagination is linear in page count. Expect seconds to minutes depending on `maxResults`. |
| **TLS efficiency** | `impit` presents a browser-grade TLS fingerprint, which can reduce blocking compared with plain HTTP clients. |
| **Write reliability** | Batched Dataset pushes (e.g. 1,000 rows per batch) prevent API timeouts on large inserts. |
| **Horizontal scaling** | Run parallel actors per marketplace or keyword shard. One run per marketplace per keyword. |
| **Memory** | Batched writes keep memory predictable at 10k-row scale. |

***

### AI & Automation Workflows

**Price alert pipeline:** Schedule → scrape → compare to yesterday's snapshot → alert on >5% price move via webhook to Slack.

**LLM product comparison:** Pull 50 ASINs → pass structured rows to Claude/GPT → generate comparison table in natural language.

**Competitor sponsored share tracking:** Run weekly → compute `isSponsored` ratio per brand per keyword → chart in Metabase.

**RAG product assistant:** Embed titles → store in Pinecone → retrieve top-K products → ground LLM answers with live Amazon links.
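
The comparison step of the price alert pipeline could be sketched as follows. `price_moves` is a hypothetical helper that assumes prices have already been parsed to numbers and keyed by ASIN; the 5% threshold mirrors the example above, and the sample ASINs and prices are made up.

```python
from typing import Dict, List, Tuple


def price_moves(
    previous: Dict[str, float],
    current: Dict[str, float],
    threshold: float = 0.05,
) -> List[Tuple[str, float, float, float]]:
    """Return (asin, old, new, relative_change) for moves above the threshold."""
    moves = []
    for asin, new_price in current.items():
        old_price = previous.get(asin)
        if not old_price:
            # ASIN is new today (or had no usable price yesterday): skip it.
            continue
        change = (new_price - old_price) / old_price
        if abs(change) > threshold:
            moves.append((asin, old_price, new_price, change))
    return moves


yesterday = {"B0AAA": 50.0, "B0BBB": 20.0}
today = {"B0AAA": 45.0, "B0BBB": 20.5, "B0CCC": 9.99}

for asin, old, new, change in price_moves(yesterday, today):
    print(f"{asin}: {old} -> {new} ({change:+.1%})")
```

Each returned tuple can be serialized into a webhook payload; ASINs that appear only in today's snapshot are skipped here and could be reported separately as new entrants.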

***

### Error Handling

| Scenario | Behavior |
|---|---|
| Invalid `maxResults` (e.g. 0 or > 10,000) | Validation error pushed to Dataset; `OUTPUT.ok: false` |
| CAPTCHA or empty HTML | Error item in Dataset with proxy/cookie suggestion |
| No results for keyword | `OUTPUT.ok: false`, `stoppedReason: exhausted` |
| Large run exceeds KV size | `RESULTS_JSON` / `RESULTS_CSV` may truncate; use Dataset export |
| Proxy failure | Logged; fallback to direct egress if configured |

***

### Trust & Reliability

Scrapeify builds and maintains this actor for production price intelligence and catalog operations teams. The architecture targets **stable SERP extraction** with:

- Defensive HTML parsing with marketplace-specific resolution helpers
- Explicit operational metrics (`pagesFetched`, `stoppedReason`, `fulfilledCompletely`) for SLA monitoring
- Multiple export paths (Dataset, RESULTS\_JSON, RESULTS\_CSV) for BI and engineering handoff
- Clear storage documentation in run `OUTPUT` for autonomous pipeline operation

***

### Related Scrapeify Actors

Explore the full Scrapeify suite — chain these actors together for end-to-end automation pipelines:

| Actor | What it does |
|---|---|
| [Instagram Ad Library Scraper](https://apify.com/scrapeify/instagram-ad-library-scraper) | Instagram-only ads from Meta Ad Library |
| [Meta Ad Library Scraper](https://apify.com/scrapeify/meta-ad-library-scraper) | Facebook & Instagram ads with sort options |
| [WhatsApp Ad Scraper](https://apify.com/scrapeify/whatsapp-ad-scraper) | Click-to-WhatsApp ad creatives |
| [YouTube Video Downloader](https://apify.com/scrapeify/youtube-video-downloader) | Videos & audio to Apify Key-Value Store |
| [Meta Brand & Page ID Finder](https://apify.com/scrapeify/facebook-page-id-finder) | Resolve brand names to numeric Page IDs |
| [Google Maps Scraper](https://apify.com/scrapeify/google-maps-scraper) | Local business leads, reviews, emails, contacts |
| [Google News Scraper](https://apify.com/scrapeify/google-news-scraper) | Headlines, sources, article URLs (up to 2K) |

***

*Amazon is a trademark of Amazon.com, Inc. This actor is not affiliated with or endorsed by Amazon.*


# Actor input Schema

## `keyword` (type: `string`):

Product, brand, category, or long-tail Amazon search query, e.g. wireless earbuds, protein powder, running shoes for women.

## `marketplace` (type: `string`):

Amazon country / region code, e.g. US, UK, IN, DE, JP, AU, CA, FR, IT, ES, AE, SA. Defaults to IN if omitted.

## `maxResults` (type: `integer`):

Number of unique Amazon product rows / ASINs to collect. Paginates automatically until this count is reached or listings end.

## `proxyConfiguration` (type: `object`):

Apify Proxy configuration. RESIDENTIAL group is strongly recommended for Amazon — datacenter IPs are blocked by Amazon's anti-bot system. The actor automatically routes through the country matching your marketplace selection.

## `cookieHeader` (type: `string`):

Optional Cookie header value from a logged-in Amazon session for pages that require it. Use only ethically sourced session strings.

## Actor input object example

```json
{
  "keyword": "wireless earbuds",
  "marketplace": "IN",
  "maxResults": 25,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "keyword": "wireless earbuds",
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ]
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("scrapeify/amazon-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "keyword": "wireless earbuds",
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}

# Run the Actor and wait for it to finish
run = client.actor("scrapeify/amazon-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "keyword": "wireless earbuds",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}' |
apify call scrapeify/amazon-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=scrapeify/amazon-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Amazon Scraper — ASINs, Prices, Rankings & Sponsored Flags",
        "description": "Extract Amazon search results across 23 marketplaces (US, UK, IN, DE, JP, AU, CA, FR, IT, ES, AE, SA): ASINs, titles, prices, sponsored flags, and search rank positions. Up to 10K products per run with auto-pagination. Export CSV/JSON/Excel. No SP-API or affiliate credentials needed.",
        "version": "2.8",
        "x-build-id": "UVLRaHrJiHP4Xf921"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/scrapeify~amazon-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-scrapeify-amazon-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/scrapeify~amazon-scraper/runs": {
            "post": {
                "operationId": "runs-sync-scrapeify-amazon-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/scrapeify~amazon-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-scrapeify-amazon-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns the OUTPUT record from the key-value store in the response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "keyword": {
                        "title": "Amazon search keyword",
                        "type": "string",
                        "description": "Product, brand, category, or long-tail Amazon search query, e.g. wireless earbuds, protein powder, running shoes for women."
                    },
                    "marketplace": {
                        "title": "Amazon marketplace (country code)",
                        "type": "string",
                        "description": "Amazon country / region code, e.g. US, UK, IN, DE, JP, AU, CA, FR, IT, ES, AE, SA. Defaults to IN if omitted.",
                        "default": "IN"
                    },
                    "maxResults": {
                        "title": "Max products to collect",
                        "minimum": 1,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Number of unique Amazon product rows / ASINs to collect. Paginates automatically until this count is reached or listings end.",
                        "default": 25
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Apify Proxy configuration. The RESIDENTIAL group is strongly recommended for Amazon, because datacenter IPs are blocked by Amazon's anti-bot system. The Actor automatically routes through the country matching your marketplace selection.",
                        "default": {
                            "useApifyProxy": true,
                            "apifyProxyGroups": [
                                "RESIDENTIAL"
                            ]
                        }
                    },
                    "cookieHeader": {
                        "title": "Amazon cookie header (optional)",
                        "type": "string",
                        "description": "Optional Cookie header value from a logged-in Amazon session for pages that require it. Use only ethically sourced session strings."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
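
The definition above can also be exercised without a client library. The following is a minimal sketch, not an official integration: it builds an input object that respects the `inputSchema` constraints (`maxResults` between 1 and 10000, `marketplace` defaulting to `IN`, the `RESIDENTIAL` proxy default) and prepares a `POST` request to the `run-sync-get-dataset-items` path with the `token` query parameter. The helper names `build_input`, `build_request`, and `run_actor` are hypothetical, and the sketch assumes the endpoint responds with a JSON array of dataset items.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Server URL and path copied from the OpenAPI definition above.
API_BASE = "https://api.apify.com/v2"
RUN_SYNC_ITEMS_PATH = "/acts/scrapeify~amazon-scraper/run-sync-get-dataset-items"


def build_input(keyword, marketplace="IN", max_results=25):
    """Build an Actor input dict (hypothetical helper) following inputSchema."""
    # inputSchema declares minimum 1 and maximum 10000 for maxResults.
    if not 1 <= max_results <= 10000:
        raise ValueError("maxResults must be between 1 and 10000")
    return {
        "keyword": keyword,
        "marketplace": marketplace,
        "maxResults": max_results,
        # Same value as the schema's default proxy configuration.
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }


def build_request(token, actor_input):
    """Prepare the POST request; the token goes in the query string."""
    url = f"{API_BASE}{RUN_SYNC_ITEMS_PATH}?{urlencode({'token': token})}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token, actor_input, timeout=300):
    """Send the request and parse the body.

    Assumption: the response body is a JSON array of dataset items.
    The timeout value is illustrative, not taken from the definition.
    """
    with urlopen(build_request(token, actor_input), timeout=timeout) as resp:
        return json.load(resp)
```

A call such as `run_actor(token, build_input("wireless earbuds", "US", 50))` would start a run and block until it finishes, so for large `maxResults` values the asynchronous `/runs` path may be the better fit. Keeping request construction separate from the network call makes the input easy to inspect before any paid run is triggered.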
