# Amazon Scraper (`dtrungtin/amazon-scraper`) Actor

Extract structured product data from [Amazon.com](https://www.amazon.com) at scale. Provide one or more Amazon search or category URLs and this Actor will crawl through all result pages, visit each product listing, and return a clean dataset with prices, images, reviews, dimensions, and more.

- **URL**: https://apify.com/dtrungtin/amazon-scraper.md
- **Developed by:** [Tin](https://apify.com/dtrungtin) (community)
- **Categories:** E-commerce
- **Stats:** 3 total users, 2 monthly users, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

$50.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Amazon Product Scraper

**Extract structured product data from [Amazon.com](https://www.amazon.com) at scale.** Provide one or more Amazon search or category URLs and this Actor will crawl through all result pages, visit each product listing, and return a clean dataset with prices, images, reviews, dimensions, and more — ready to download as JSON, CSV, or Excel.

### What does Amazon Product Scraper do?

This Actor uses a headless Chrome browser (Puppeteer) to navigate Amazon search and category pages, scroll through results, and extract detailed product information from each listing page. It handles pagination automatically and respects a configurable item limit to control cost and run time.

### Why use Amazon Product Scraper?

- **Price monitoring** — track price changes across product categories over time
- **Competitor research** — benchmark products, features, and reviews against your own catalog
- **Market analysis** — discover top-selling products, ratings distributions, and category trends
- **Catalog enrichment** — bulk-fetch images, dimensions, descriptions, and ASINs for your own database
- **Retail analytics** — build datasets for machine learning or BI dashboards

### How to use Amazon Product Scraper

1. Click **Try for free** on the Actor page in Apify Store.
2. In the **Input** tab, paste one or more Amazon search or category page URLs into **Start URLs**.
3. Set **Max items** to limit how many product pages are scraped (default: 10).
4. Click **Start** and wait for the run to finish.
5. Open the **Output** tab or click **Export** to download your dataset as JSON, CSV, or Excel.

### Input

Configure the Actor from the **Input** tab or via the Apify API.

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| `startUrls` | array | Amazon search or category URLs to start crawling from | — |
| `maxItems` | integer | Maximum number of product detail pages to scrape (1 to 500) | `10` |

**Example input:**

```json
{
  "startUrls": [
    { "url": "https://www.amazon.com/s?k=mechanical+keyboard" }
  ],
  "maxItems": 100
}
````
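
The `startUrls` array can also be assembled in code. The sketch below builds search URLs from keywords, assuming the `https://www.amazon.com/s?k=...` pattern seen in the example input. `build_start_urls` is a hypothetical helper, not part of the Actor or the Apify client.

```python
from urllib.parse import urlencode


def build_start_urls(keywords, domain="www.amazon.com"):
    """Turn search keywords into the startUrls structure used in the input.

    Assumption: search pages follow the https://<domain>/s?k=<query>
    pattern seen in the example input.
    """
    start_urls = []
    for keyword in keywords:
        query = urlencode({"k": keyword})  # spaces become '+'
        start_urls.append({"url": f"https://{domain}/s?{query}"})
    return start_urls


actor_input = {
    "startUrls": build_start_urls(["mechanical keyboard"]),
    "maxItems": 100,
}
print(actor_input["startUrls"][0]["url"])
```

The resulting dictionary can be passed as the Actor input in the same way as the hand-written JSON.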

### Output

Each item in the dataset corresponds to one Amazon product page.

**Example output item:**

```json
{
  "url": "https://www.amazon.com/dp/B0C836NB5X",
  "title": "Calibrite Display Pro HL",
  "categories": ["Electronics", "Accessories"],
  "itemNumber": "B0C836NB5X",
  "asin": "B0C836NB5X",
  "price": 229.99,
  "priceWithCurrency": "$229.99",
  "brandName": "Calibrite",
  "colorName": "White",
  "sizeName": null,
  "material": null,
  "productDimensions": "4.9 x 1.2 x 6.5 inches",
  "modelName": "Display Pro HL",
  "modelNumber": "CCDIS3HL",
  "upc": "850050285032",
  "sizes": [],
  "images": ["https://m.media-amazon.com/images/I/..."],
  "soldText": "200+ bought in past month",
  "aboutThisItem": ["High-accuracy display calibration", "..."],
  "features": [{ "name": "Brand", "value": "Calibrite" }],
  "productDetails": [{ "name": "Item Weight", "value": "3.35 ounces" }],
  "description": "Professional display calibration tool...",
  "reviewRating": 4.6,
  "reviewCount": 312
}
```

You can download the dataset in various formats such as **JSON, HTML, CSV, or Excel** from the Output tab after the run completes.

### Data fields

| Field | Format | Description |
|-------|--------|-------------|
| `url` | text | Product page URL |
| `title` | text | Full product title |
| `categories` | array | Breadcrumb category path |
| `itemNumber` / `asin` | text | Amazon item identifier (ASIN) |
| `price` | number | Numeric price (e.g. `229.99`) |
| `priceWithCurrency` | text | Price with currency symbol (e.g. `"$229.99"`) |
| `brandName` | text | Brand extracted from product overview |
| `colorName` | text | Color variant |
| `sizeName` | text | Size variant |
| `productDimensions` | text | Physical dimensions |
| `material` | text | Material |
| `modelName` / `modelNumber` | text | Manufacturer model info |
| `upc` | text | Universal Product Code |
| `sizes` | array | Available size options |
| `images` | array | High-resolution image URLs |
| `soldText` | text | "X+ bought in past month" social proof text |
| `aboutThisItem` | array | Bullet-point feature list |
| `features` | array | Product overview table (name/value pairs) |
| `productDetails` | array | Technical details table (name/value pairs) |
| `description` | text | Product description text |
| `reviewRating` | number | Average star rating |
| `reviewCount` | integer | Total number of ratings |
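
The `features` and `productDetails` fields are lists of name/value pairs, which are often easier to work with as dictionaries. The following is a minimal sketch of one way to flatten them. `pairs_to_dict` and `summarize` are hypothetical helper names, and the sample item is a trimmed copy of the example output.

```python
def pairs_to_dict(pairs):
    """Convert [{"name": ..., "value": ...}, ...] into {name: value}."""
    return {pair["name"]: pair["value"] for pair in pairs or []}


def summarize(item):
    """Pick a few fields from one dataset item, tolerating null values."""
    details = pairs_to_dict(item.get("productDetails"))
    return {
        "asin": item.get("asin"),
        "title": item.get("title"),
        "price": item.get("price"),
        "brand": item.get("brandName"),
        "weight": details.get("Item Weight"),
    }


item = {  # trimmed copy of the example output item
    "asin": "B0C836NB5X",
    "title": "Calibrite Display Pro HL",
    "price": 229.99,
    "brandName": "Calibrite",
    "material": None,
    "productDetails": [{"name": "Item Weight", "value": "3.35 ounces"}],
}
print(summarize(item))
```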

### Pricing / Cost estimation

This Actor is billed per event at the rate listed under Pricing ($50.00 per 1,000 results), so cost scales with the number of results rather than with compute units. A typical run scraping **100 products** takes around 5–10 minutes, depending on page complexity and proxy usage.

Apify offers a **free tier** with $5 of monthly platform credits; at the rate above, that corresponds to roughly 100 product extractions per month.
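
For budgeting, the pay-per-event rate from the Pricing section ($50.00 per 1,000 results) can be turned into a rough estimate. This sketch assumes that every dataset item counts as one billed result and that no other events are charged. `estimate_cost_usd` is a hypothetical helper.

```python
from decimal import Decimal

# Rate taken from the Pricing section of this page.
PRICE_PER_1000_RESULTS_USD = Decimal("50.00")


def estimate_cost_usd(result_count):
    """Estimate the charge for a run, assuming one billed event per result."""
    return PRICE_PER_1000_RESULTS_USD * result_count / 1000


for count in (10, 100, 500):
    print(f"{count} results: about ${estimate_cost_usd(count):.2f}")
```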

### Tips

- **Use category pages** rather than search pages for more consistent pagination and results.
- **Lower `maxItems`** to do a quick test run before committing to a large crawl.
- **Enable Apify Proxy** (Residential, US) in the proxy configuration for best reliability against Amazon's bot detection.
- Image requests are blocked during crawling to speed up runs; the image URLs are still extracted from the page's embedded JSON data.

### FAQ and support

**Is scraping Amazon legal?**
Scraping publicly available product data for personal research, price monitoring, or analysis is generally accepted. Do not scrape personal user data, bypass paywalls, or violate [Amazon's Conditions of Use](https://www.amazon.com/gp/help/customer/display.html?nodeId=508088). Always comply with applicable laws in your jurisdiction.

**Some fields are empty — why?**
Not every product has every field (e.g. UPC, material, sizes). The Actor returns `null` for missing values rather than omitting the field.
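
To see which fields are most often empty in a run, the `null` values can be tallied per field. This is a small sketch over already-downloaded items. `count_missing` is a hypothetical helper and the sample items are made up for illustration.

```python
from collections import Counter


def count_missing(items):
    """Count, per field, how many items carry a null (None) value."""
    missing = Counter()
    for item in items:
        for field, value in item.items():
            if value is None:
                missing[field] += 1
    return missing


sample_items = [  # illustrative data, not real Actor output
    {"asin": "A1", "upc": None, "material": None},
    {"asin": "A2", "upc": "850050285032", "material": None},
]
for field, count in count_missing(sample_items).most_common():
    print(f"{field}: missing in {count} of {len(sample_items)} items")
```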

**The Actor returned fewer items than `maxItems`.**
The search results page may have fewer products than the configured limit, or some pages may have been blocked. Try enabling a residential proxy configuration.

For bugs or feature requests, open an issue in the **Issues** tab on the Actor page. Custom scraping solutions are also available — reach out via the Apify contact form.

- [Documentation](https://crawlee.dev/api/playwright-crawler/class/PlaywrightCrawler) and [examples](https://crawlee.dev/docs/examples/playwright-crawler)
- [Node.js tutorials](https://docs.apify.com/academy/node-js) in Academy
- [How to scale Puppeteer and Playwright](https://blog.apify.com/how-to-scale-puppeteer-and-playwright/)
- [Video guide on getting data using Apify API](https://www.youtube.com/watch?v=ViYYDHSBAKM)
- [Integration with Make](https://apify.com/integrations), GitHub, Zapier, Google Drive, and other apps
- A short guide on how to create Actors using code templates: [web scraper template](https://www.youtube.com/watch?v=u-i-Korzf8w)

### Supported countries

The Actor automatically detects the Amazon domain from the start URL and scrapes product data from that marketplace. Simply use the appropriate Amazon domain in your start URLs:

| Country | Domain |
|---------|--------|
| 🇺🇸 United States | [amazon.com](https://www.amazon.com) |
| 🇬🇧 United Kingdom | [amazon.co.uk](https://www.amazon.co.uk) |
| 🇩🇪 Germany | [amazon.de](https://www.amazon.de) |
| 🇫🇷 France | [amazon.fr](https://www.amazon.fr) |
| 🇮🇹 Italy | [amazon.it](https://www.amazon.it) |
| 🇪🇸 Spain | [amazon.es](https://www.amazon.es) |
| 🇨🇦 Canada | [amazon.ca](https://www.amazon.ca) |
| 🇯🇵 Japan | [amazon.co.jp](https://www.amazon.co.jp) |
| 🇦🇺 Australia | [amazon.com.au](https://www.amazon.com.au) |
| 🇮🇳 India | [amazon.in](https://www.amazon.in) |
| 🇧🇷 Brazil | [amazon.com.br](https://www.amazon.com.br) |
| 🇲🇽 Mexico | [amazon.com.mx](https://www.amazon.com.mx) |
| 🇳🇱 Netherlands | [amazon.nl](https://www.amazon.nl) |
| 🇸🇦 Saudi Arabia | [amazon.sa](https://www.amazon.sa) |
| 🇦🇪 UAE | [amazon.ae](https://www.amazon.ae) |
| 🇸🇬 Singapore | [amazon.sg](https://www.amazon.sg) |
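
Before starting a run, it can be useful to check that each start URL points at one of the domains in the table. The sketch below is a client-side pre-check only; it does not reproduce the Actor's own detection logic, and `marketplace_of` is a hypothetical helper.

```python
from urllib.parse import urlsplit

# Domains copied from the table above.
SUPPORTED_DOMAINS = {
    "amazon.com", "amazon.co.uk", "amazon.de", "amazon.fr",
    "amazon.it", "amazon.es", "amazon.ca", "amazon.co.jp",
    "amazon.com.au", "amazon.in", "amazon.com.br", "amazon.com.mx",
    "amazon.nl", "amazon.sa", "amazon.ae", "amazon.sg",
}


def marketplace_of(url):
    """Return the supported Amazon domain for a URL, or None if unsupported."""
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    return host if host in SUPPORTED_DOMAINS else None


print(marketplace_of("https://www.amazon.co.uk/s?k=tea"))
print(marketplace_of("https://example.com/s?k=tea"))
```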

#### Compute units consumption

Keep in mind that, because of the startup time, it is much more efficient to run one longer scrape (at least one minute) than several shorter ones.

The average consumption is **1 compute unit per 1,000 Actor pages** scraped.

#### Epilogue

Thank you for trying my Actor. I would be very glad to receive feedback, which you can send to my email `dtrungtin@gmail.com`. If you find a bug, please create an issue on the [GitHub page](https://github.com/dtrungtin/actor-allrecipes-scraper).

# Actor input Schema

## `startUrls` (type: `array`):

List of URLs that will be scraped or crawled instead of search text.

## `maxItems` (type: `integer`):

Limit of detail/product pages to be scraped

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://www.amazon.com/s?k=pc+gaming+keyboards&crid=B4NA86GVEE9M&sprefix=%2Caps%2C320&ref=nb_sb_ss_recent_1_0_recent"
    }
  ],
  "maxItems": 10
}
```
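
Checking an input object locally before starting a run can catch simple mistakes early. The sketch below uses the bounds published in the OpenAPI specification on this page (`maxItems` between 1 and 500, default 10, and a required string `url` in each `startUrls` entry). Requiring a non-empty `startUrls` array is an extra assumption, and `validate_input` is a hypothetical helper.

```python
def validate_input(actor_input):
    """Return a list of problems found in an Actor input dict (empty if none)."""
    problems = []

    start_urls = actor_input.get("startUrls")
    # Assumption: an empty startUrls array is treated as a mistake.
    if not isinstance(start_urls, list) or not start_urls:
        problems.append("startUrls must be a non-empty array")
    else:
        for index, entry in enumerate(start_urls):
            if not isinstance(entry, dict) or not isinstance(entry.get("url"), str):
                problems.append(f"startUrls[{index}] needs a string 'url'")

    max_items = actor_input.get("maxItems", 10)  # schema default
    if isinstance(max_items, bool) or not isinstance(max_items, int):
        problems.append("maxItems must be an integer")
    elif not 1 <= max_items <= 500:
        problems.append("maxItems must be between 1 and 500")

    return problems


print(validate_input({"startUrls": [], "maxItems": 501}))
```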

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://www.amazon.com/s?k=pc+gaming+keyboards&crid=B4NA86GVEE9M&sprefix=%2Caps%2C320&ref=nb_sb_ss_recent_1_0_recent"
        }
    ],
    "maxItems": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("dtrungtin/amazon-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://www.amazon.com/s?k=pc+gaming+keyboards&crid=B4NA86GVEE9M&sprefix=%2Caps%2C320&ref=nb_sb_ss_recent_1_0_recent" }],
    "maxItems": 10,
}

# Run the Actor and wait for it to finish
run = client.actor("dtrungtin/amazon-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://www.amazon.com/s?k=pc+gaming+keyboards&crid=B4NA86GVEE9M&sprefix=%2Caps%2C320&ref=nb_sb_ss_recent_1_0_recent"
    }
  ],
  "maxItems": 10
}' |
apify call dtrungtin/amazon-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=dtrungtin/amazon-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Amazon Scraper",
        "description": "Extract structured product data from [Amazon.com](https://www.amazon.com) at scale. Provide one or more Amazon search or category URLs and this Actor will crawl through all result pages, visit each product listing, and return a clean dataset with prices, images, reviews, dimensions, and more.",
        "version": "0.0",
        "x-build-id": "2VJJzHaMPlcpkLY7E"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/dtrungtin~amazon-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-dtrungtin-amazon-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/dtrungtin~amazon-scraper/runs": {
            "post": {
                "operationId": "runs-sync-dtrungtin-amazon-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/dtrungtin~amazon-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-dtrungtin-amazon-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "List of URLs that will be scraped or crawled instead of search text.",
                        "default": [
                            {
                                "url": "https://www.amazon.com/s?k=pc+gaming+keyboards&crid=B4NA86GVEE9M&sprefix=%2Caps%2C320&ref=nb_sb_ss_recent_1_0_recent"
                            }
                        ],
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "maximum": 500,
                        "type": "integer",
                        "description": "Limit of detail/product pages to be scraped",
                        "default": 10
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
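
In environments where the Apify client is not available, the `run-sync-get-dataset-items` endpoint from the specification above can be called with the Python standard library. This is a minimal sketch: `build_request` and `run_actor` are hypothetical helpers, the 600-second timeout is an arbitrary assumption, and the response is assumed to be a JSON array of dataset items.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/dtrungtin~amazon-scraper/run-sync-get-dataset-items"


def build_request(token, actor_input):
    """Build (but do not send) the POST request described by the spec."""
    query = urlencode({"token": token})  # the spec passes the token as a query parameter
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token, actor_input, timeout=600):
    """Send the request and return the parsed JSON response body."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_request(
    "<YOUR_API_TOKEN>",
    {"startUrls": [{"url": "https://www.amazon.com/s?k=mechanical+keyboard"}], "maxItems": 10},
)
print(request.get_method(), request.full_url.split("?")[0])
```

Calling `run_actor` with a real token starts a run and blocks until it finishes, so long crawls may be better served by the asynchronous `/runs` endpoint.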
