# AWS Marketplace Scraper (`crawlerbros/aws-marketplace-scraper`) Actor

Scrape AWS Marketplace product listings with search by keyword, browse by category, fetch specific product URLs, or explore free products. Extracts vendor name, pricing model, delivery methods, categories, ratings, reviews, descriptions, and logos.

- **URL**: https://apify.com/crawlerbros/aws-marketplace-scraper.md
- **Developed by:** [Crawler Bros](https://apify.com/crawlerbros) (community)
- **Categories:** E-commerce, Developer tools, Automation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded, 4 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $3.00 / 1,000 results

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.
Because this Actor supports Apify Store discounts, the price decreases with higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
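
As a rough worked example of the per-result rate above, the sketch below estimates the event charge for a given number of results. It assumes the listed starting rate of $3.00 per 1,000 results applies unchanged; subscription discounts and Apify platform usage are not modelled, and `estimate_result_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# Assumption: the listed starting rate of $3.00 per 1,000 results applies as-is.
# Store discounts and platform usage charges are deliberately left out.
PRICE_PER_1000_RESULTS = Decimal("3.00")


def estimate_result_cost(
    result_count: int, price_per_1000: Decimal = PRICE_PER_1000_RESULTS
) -> Decimal:
    """Return the estimated per-result event charge in USD."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    return (price_per_1000 * result_count / 1000).quantize(Decimal("0.0001"))


print(estimate_result_cost(50))   # 0.1500
print(estimate_result_cost(500))  # 1.5000
```

With the default `maxItems` of 50, the estimate is about $0.15 per run before platform usage; the actual charge depends on your plan.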

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
The word "Actor" is always written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## AWS Marketplace Scraper

Extract detailed product listings from the [AWS Marketplace](https://aws.amazon.com/marketplace/) — the world's largest cloud software marketplace with 14,000+ products from 3,500+ vendors. Search by keyword, browse by category, fetch specific product URLs, or explore free products.

### What does AWS Marketplace Scraper do?

AWS Marketplace Scraper fetches server-rendered product pages and extracts structured data including product title, vendor, pricing model, delivery methods, categories, ratings, reviews, descriptions, and logo URLs — all without requiring any AWS credentials or API keys.

### Features

- **Search** products by keyword (matches title, description, vendor)
- **Browse by category** (Security, ML, Databases, DevOps, and more)
- **Fetch by URL** for specific product pages
- **Free products** browse mode
- Filter by **delivery method** (AMI, Container, SaaS, Professional Services, Data Product)
- Filter by **free products only**
- Extracts **pricing model**, **delivery methods**, **categories**, **ratings**, **review counts**
- No AWS credentials, proxies, or cookies required

### Input

| Field | Type | Description |
|-------|------|-------------|
| `mode` | String | One of: `search`, `byCategory`, `byUrl`, `freeProducts` |
| `searchQuery` | String | Keyword(s) to search (mode=search) |
| `category` | String | Category filter (security, machine-learning, databases, etc.) |
| `deliveryMethod` | String | Filter by delivery method (SaaS, Container, AmazonMachineImage, etc.) |
| `freeOnly` | Boolean | Include only free or free-trial products |
| `productUrls` | Array | Product page URLs for mode=byUrl |
| `maxItems` | Integer | Maximum number of products to return (1–500, default 50) |

#### Example Input

```json
{
  "mode": "search",
  "searchQuery": "security",
  "maxItems": 50
}
````

### Output

Each item in the dataset represents one AWS Marketplace product:

| Field | Type | Description |
|-------|------|-------------|
| `productId` | String | AWS Marketplace product ID (prodview-...) |
| `productTitle` | String | Product name |
| `vendorName` | String | Vendor/publisher name |
| `vendorId` | String | Vendor identifier |
| `shortDescription` | String | Brief product description |
| `description` | String | Full product description (Markdown) |
| `categories` | Array | Product categories (e.g. Security, Machine Learning) |
| `deliveryMethods` | Array | Delivery methods (AMI, SaaS, Container, etc.) |
| `productType` | String | Primary product type |
| `pricingModel` | String | Free / Paid / Free Trial / Contract / Byol |
| `isFreeTrialAvailable` | Boolean | Whether a free trial is offered |
| `hasFreeVersion` | Boolean | Whether a free tier exists |
| `rating` | Number | Average customer rating (0–5) |
| `reviewCount` | Integer | Total number of customer reviews |
| `logoUrl` | String | Product logo image URL |
| `productUrl` | String | Full AWS Marketplace product page URL |
| `scrapedAt` | String | ISO 8601 timestamp of when the data was scraped |

#### Example Output

```json
{
  "productId": "prodview-ys5ewcmbgp6s6",
  "productTitle": "Sentieon secondary analysis tools for NGS short / long read genomic data",
  "vendorName": "Sentieon Inc.",
  "shortDescription": "Award winning Sentieon software for NGS data processing...",
  "categories": ["Healthcare & Life Sciences", "Data Analytics"],
  "deliveryMethods": ["SaaS"],
  "productType": "SaaS",
  "pricingModel": "Contract",
  "logoUrl": "https://d7umqicpi7263.cloudfront.net/img/product/...",
  "productUrl": "https://aws.amazon.com/marketplace/pp/prodview-ys5ewcmbgp6s6",
  "scrapedAt": "2026-05-23T10:00:00+00:00"
}
```
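
Because several output fields may be absent from an item (the example above has no `rating` or `reviewCount`), downstream code benefits from treating them as optional. The following is a minimal sketch of one way to do that; the `Product` class and `parse_product` helper are hypothetical names, only a subset of fields is mapped, and it assumes `productId`, `productTitle`, and `productUrl` are always present.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Product:
    """Subset of the documented output fields; optional ones default to None."""
    product_id: str
    product_title: str
    product_url: str
    vendor_name: Optional[str] = None
    pricing_model: Optional[str] = None
    rating: Optional[float] = None
    review_count: Optional[int] = None
    categories: list[str] = field(default_factory=list)
    delivery_methods: list[str] = field(default_factory=list)


def parse_product(item: dict[str, Any]) -> Product:
    """Map one dataset item to a Product, tolerating omitted optional fields."""
    return Product(
        # Assumption: these three keys are present on every item.
        product_id=item["productId"],
        product_title=item["productTitle"],
        product_url=item["productUrl"],
        vendor_name=item.get("vendorName"),
        pricing_model=item.get("pricingModel"),
        rating=item.get("rating"),
        review_count=item.get("reviewCount"),
        categories=list(item.get("categories") or []),
        delivery_methods=list(item.get("deliveryMethods") or []),
    )


# Trimmed copy of the example output shown above.
sample = {
    "productId": "prodview-ys5ewcmbgp6s6",
    "productTitle": "Sentieon secondary analysis tools for NGS short / long read genomic data",
    "vendorName": "Sentieon Inc.",
    "categories": ["Healthcare & Life Sciences", "Data Analytics"],
    "deliveryMethods": ["SaaS"],
    "pricingModel": "Contract",
    "productUrl": "https://aws.amazon.com/marketplace/pp/prodview-ys5ewcmbgp6s6",
}

product = parse_product(sample)
print(product.vendor_name, product.rating)  # Sentieon Inc. None
```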

### Use Cases

- **Vendor intelligence** — Track competitor products and pricing in the AWS ecosystem
- **Market research** — Analyze which categories and delivery methods are most common
- **Lead generation** — Find vendors in specific technology categories
- **Pricing analysis** — Compare pricing models across similar products
- **Content enrichment** — Enrich product databases with category and description data

### Modes Explained

- **Search** (`search`): Scans AWS Marketplace product pages, filtering by keyword. Works by downloading the public sitemap and fetching matching product pages.
- **Browse by Category** (`byCategory`): Finds products matching a specific AWS Marketplace category.
- **Fetch by URL** (`byUrl`): Directly fetches one or more product pages you provide.
- **Free Products** (`freeProducts`): Finds products with free or free-trial pricing.
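
To make the four modes concrete, here is one illustrative input per mode. The field names come from the Input table; the specific values (query, category, item limits) are examples rather than recommendations, and the `EXAMPLE_INPUTS` name is hypothetical.

```python
import json

# One example input per documented mode. Values are illustrative only.
EXAMPLE_INPUTS = {
    "search": {"mode": "search", "searchQuery": "security", "maxItems": 50},
    "byCategory": {"mode": "byCategory", "category": "machine-learning", "maxItems": 50},
    "byUrl": {
        "mode": "byUrl",
        # Product URL taken from the example output in this README.
        "productUrls": ["https://aws.amazon.com/marketplace/pp/prodview-ys5ewcmbgp6s6"],
    },
    "freeProducts": {"mode": "freeProducts", "maxItems": 25},
}

for mode, actor_input in EXAMPLE_INPUTS.items():
    print(mode, json.dumps(actor_input))
```

Any of these dictionaries can be passed as the Actor input in place of the search example used elsewhere in this document.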

### FAQs

**Does this require AWS credentials?**
No. The Actor scrapes publicly accessible product pages.

**How many products can I scrape?**
AWS Marketplace has 39,000+ product pages in the public sitemap. You can scrape up to 500 per run.

**Is this Actor reliable?**
The Actor uses the public, server-rendered product pages, which return full data without JavaScript. Retry logic handles transient errors.

**Are ratings always available?**
Ratings are only shown for products that have received customer reviews. Many newer products have no ratings.

**How accurate is the pricing model?**
Pricing information is extracted from the dehydrated state JSON embedded in each product page. Not all products include pricing details; in those cases the field is omitted.
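
Since `rating` and `pricingModel` can be missing, any aggregation over the dataset should skip absent values instead of assuming them. The sketch below shows one way to do this; `summarize` is a hypothetical helper, the `"unknown"` bucket is a choice made here rather than a value the Actor emits, and the sample items are invented for illustration.

```python
from collections import Counter
from statistics import mean
from typing import Any, Optional


def summarize(items: list[dict[str, Any]]) -> tuple[Counter, Optional[float]]:
    """Count pricing models and average only the ratings that are present."""
    pricing_counts = Counter(item.get("pricingModel", "unknown") for item in items)
    ratings = [item["rating"] for item in items if item.get("rating") is not None]
    average_rating = round(mean(ratings), 2) if ratings else None
    return pricing_counts, average_rating


# Hypothetical items, trimmed to the two fields used here.
items = [
    {"pricingModel": "Contract", "rating": 4.5},
    {"pricingModel": "Free"},
    {"rating": 3.5},
]

counts, avg = summarize(items)
print(dict(counts))  # {'Contract': 1, 'Free': 1, 'unknown': 1}
print(avg)           # 4.0
```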

### Legal

This Actor extracts publicly available information from AWS Marketplace product listing pages. Always comply with [AWS's Terms of Service](https://aws.amazon.com/aup/).

# Actor input Schema

## `mode` (type: `string`):

What to fetch.

## `searchQuery` (type: `string`):

Keyword(s) to search for in product title and description (mode=search).

## `category` (type: `string`):

AWS Marketplace product category (modes: byCategory, search).

## `deliveryMethod` (type: `string`):

Filter by product delivery method.

## `freeOnly` (type: `boolean`):

Include only free or free-tier products.

## `productUrls` (type: `array`):

AWS Marketplace product page URLs (e.g. https://aws.amazon.com/marketplace/pp/prodview-xxxxxxx).

## `maxItems` (type: `integer`):

Maximum number of products to scrape.

## Actor input object example

```json
{
  "mode": "search",
  "searchQuery": "security",
  "category": "",
  "deliveryMethod": "",
  "freeOnly": false,
  "maxItems": 50
}
```
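
Checking an input locally before starting a run can catch typos early. The following sketch checks only constraints stated in this document: the `mode` and `deliveryMethod` values listed in the OpenAPI specification, and the 1 to 500 range for `maxItems`. The `validate_input` helper is hypothetical and is not part of any Apify client.

```python
from typing import Any

# Allowed values copied from the OpenAPI specification in this document.
MODES = ("search", "byCategory", "byUrl", "freeProducts")
DELIVERY_METHODS = (
    "", "AmazonMachineImage", "Container", "SaaS", "ProfessionalServices", "DataProduct",
)


def validate_input(actor_input: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    if actor_input.get("mode") not in MODES:
        problems.append("mode must be one of: " + ", ".join(MODES))
    if actor_input.get("deliveryMethod", "") not in DELIVERY_METHODS:
        problems.append("deliveryMethod is not a documented value")
    max_items = actor_input.get("maxItems", 50)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int) or not 1 <= max_items <= 500:
        problems.append("maxItems must be an integer between 1 and 500")
    return problems


print(validate_input({"mode": "search", "searchQuery": "security", "maxItems": 50}))  # []
print(validate_input({"mode": "browse", "maxItems": 0}))
```

The second call reports two problems: an unknown mode and an out-of-range `maxItems`.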

# Actor output Schema

## `products` (type: `string`):

Dataset containing all scraped AWS Marketplace product listings.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "mode": "search",
    "searchQuery": "security",
    "category": "",
    "deliveryMethod": "",
    "freeOnly": false,
    "maxItems": 50
};

// Run the Actor and wait for it to finish
const run = await client.actor("crawlerbros/aws-marketplace-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "mode": "search",
    "searchQuery": "security",
    "category": "",
    "deliveryMethod": "",
    "freeOnly": False,
    "maxItems": 50,
}

# Run the Actor and wait for it to finish
run = client.actor("crawlerbros/aws-marketplace-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "mode": "search",
  "searchQuery": "security",
  "category": "",
  "deliveryMethod": "",
  "freeOnly": false,
  "maxItems": 50
}' |
apify call crawlerbros/aws-marketplace-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=crawlerbros/aws-marketplace-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "AWS Marketplace Scraper",
        "description": "Scrape AWS Marketplace product listings with search by keyword, browse by category, fetch specific product URLs, or explore free products. Extracts vendor name, pricing model, delivery methods, categories, ratings, reviews, descriptions, and logos.",
        "version": "1.0",
        "x-build-id": "ijzmwc5KOqNfUwY96"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/crawlerbros~aws-marketplace-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-crawlerbros-aws-marketplace-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/crawlerbros~aws-marketplace-scraper/runs": {
            "post": {
                "operationId": "runs-sync-crawlerbros-aws-marketplace-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/crawlerbros~aws-marketplace-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-crawlerbros-aws-marketplace-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "mode"
                ],
                "properties": {
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "search",
                            "byCategory",
                            "byUrl",
                            "freeProducts"
                        ],
                        "type": "string",
                        "description": "What to fetch.",
                        "default": "search"
                    },
                    "searchQuery": {
                        "title": "Search query",
                        "type": "string",
                        "description": "Keyword(s) to search for in product title and description (mode=search)."
                    },
                    "category": {
                        "title": "Category",
                        "enum": [
                            "",
                            "security",
                            "networking-content-delivery",
                            "storage",
                            "databases",
                            "developer-tools",
                            "business-intelligence",
                            "machine-learning",
                            "iot",
                            "media",
                            "monitoring",
                            "infrastructure-software",
                            "devops",
                            "data-products",
                            "professional-services"
                        ],
                        "type": "string",
                        "description": "AWS Marketplace product category (modes: byCategory, search).",
                        "default": ""
                    },
                    "deliveryMethod": {
                        "title": "Delivery Method",
                        "enum": [
                            "",
                            "AmazonMachineImage",
                            "Container",
                            "SaaS",
                            "ProfessionalServices",
                            "DataProduct"
                        ],
                        "type": "string",
                        "description": "Filter by product delivery method.",
                        "default": ""
                    },
                    "freeOnly": {
                        "title": "Free products only",
                        "type": "boolean",
                        "description": "Include only free or free-tier products.",
                        "default": false
                    },
                    "productUrls": {
                        "title": "Product URLs (mode=byUrl)",
                        "type": "array",
                        "description": "AWS Marketplace product page URLs (e.g. https://aws.amazon.com/marketplace/pp/prodview-xxxxxxx).",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "maximum": 500,
                        "type": "integer",
                        "description": "Maximum number of products to scrape.",
                        "default": 50
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
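
For projects that call the REST API directly, the `run-sync-get-dataset-items` path from the specification above can be targeted with the Python standard library alone. This is a minimal sketch: `build_sync_request` is a hypothetical helper that only constructs the request, and the commented-out lines show how it might be sent, assuming the response body is a JSON array of dataset items.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "crawlerbros~aws-marketplace-scraper"


def build_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST to the run-sync-get-dataset-items endpoint."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request(
    "<YOUR_API_TOKEN>",
    {"mode": "search", "searchQuery": "security", "maxItems": 5},
)
print(request.get_method(), request.full_url.split("?")[0])

# To actually run the Actor (requires a valid token and network access):
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

Building the request separately from sending it keeps the example testable offline and makes it easy to add timeouts or retries around the `urlopen` call.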
