# Puma Product Scraper (`shahidirfan/puma-product-scraper`) Actor

Extract Puma product data at scale. Scrape prices, descriptions, images, and reviews from Puma.com. Real-time monitoring, zero blocks. Perfect for price tracking, competitive analysis, and inventory management.

- **URL**: https://apify.com/shahidirfan/puma-product-scraper.md
- **Developed by:** [Shahid Irfan](https://apify.com/shahidirfan) (community)
- **Categories:** E-commerce, Automation, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Puma Product Scraper

Extract Puma product listing data from category pages such as men's shoes.  
Collect pricing, ratings, color variants, size options, badges, and product links in a clean dataset ready for analysis and monitoring.

### Features

- **Fast category extraction** - Collect products directly from Puma category pages.
- **Rich product coverage** - Includes pricing, ratings, color data, badges, and size groups.
- **Pagination support** - Automatically continues through result pages until the target count is reached.
- **Clean dataset output** - Omits empty fields for compact, analysis-ready records.
- **Flexible controls** - Set the result limit, page size, and target URL.

### Use Cases

#### Competitive Pricing Tracking
Monitor shoe pricing and promotional deltas across Puma categories over time. Build snapshots for daily, weekly, or seasonal comparisons.
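
As a rough illustration of the promotional deltas mentioned above, the sketch below computes the discount between `price_regular` and `price_sale` for a single dataset item. The helper name `discount_percent` is hypothetical, and the code assumes both fields are numeric when present, as listed in the output table.

```python
from typing import Optional


def discount_percent(item: dict) -> Optional[float]:
    """Return the sale discount as a percentage, or None if it cannot be computed."""
    regular = item.get("price_regular")
    sale = item.get("price_sale")
    # Empty fields are omitted from records, so either price may be absent.
    if regular is None or sale is None or regular <= 0:
        return None
    return round((regular - sale) / regular * 100, 2)


# The prices below are made up for illustration.
print(discount_percent({"price_regular": 100, "price_sale": 75}))   # 25.0
print(discount_percent({"price_regular": 103, "price_sale": 103}))  # 0.0
print(discount_percent({"price_regular": 103}))                     # None
```

Storing this value alongside `scraped_at` for each run gives a simple time series per product.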

#### Merchandising Analysis
Measure assortment depth by category, color coverage, and orderable size availability. Identify which product lines have broad or narrow variant coverage.
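
One way to approximate color coverage is to count distinct `color_code` values per `master_id`. The sketch below assumes each dataset item represents one color variant of a master product, which matches the sample output but is not guaranteed by this README; `colors_per_master` is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict, List, Set


def colors_per_master(items: List[dict]) -> Dict[str, int]:
    """Count distinct color codes seen for each master product ID."""
    colors: Dict[str, Set[str]] = defaultdict(set)
    for item in items:
        master_id = item.get("master_id")
        color_code = item.get("color_code")
        if master_id is None or color_code is None:
            continue  # skip records where either field was omitted
        colors[master_id].add(color_code)
    return {master_id: len(codes) for master_id, codes in colors.items()}


# Illustrative records only; the second color code is invented.
items = [
    {"master_id": "308762", "color_code": "07"},
    {"master_id": "308762", "color_code": "01"},
    {"master_id": "308762", "color_code": "07"},
]
print(colors_per_master(items))  # {'308762': 2}
```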

#### Product Intelligence
Track review counts, average ratings, and badge presence to understand which products are most visible and marketable.

#### Catalog Monitoring
Detect newly listed products, removed variants, and changes in product metadata by comparing recurring runs.
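
Comparing recurring runs can be sketched as a keyed diff of two datasets. The example below assumes `product_id` is stable across runs and uses `price_sale` as the field to watch; both choices are assumptions, and `diff_runs` is a hypothetical helper.

```python
from typing import Dict, List


def diff_runs(previous: List[dict], current: List[dict]) -> Dict[str, List[str]]:
    """Report product IDs that were added, removed, or changed price between two runs."""
    prev = {item["product_id"]: item for item in previous if "product_id" in item}
    curr = {item["product_id"]: item for item in current if "product_id" in item}
    shared = prev.keys() & curr.keys()
    return {
        "added": sorted(curr.keys() - prev.keys()),
        "removed": sorted(prev.keys() - curr.keys()),
        "price_changed": sorted(
            pid for pid in shared
            if prev[pid].get("price_sale") != curr[pid].get("price_sale")
        ),
    }


# Invented IDs and prices, for illustration only.
yesterday = [{"product_id": "1", "price_sale": 10}, {"product_id": "2", "price_sale": 20}]
today = [{"product_id": "2", "price_sale": 15}, {"product_id": "3", "price_sale": 30}]
print(diff_runs(yesterday, today))
# {'added': ['3'], 'removed': ['1'], 'price_changed': ['2']}
```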

---

### Input Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `start_url` | String | No | `https://us.puma.com/us/en/men/shoes` | Puma listing URL (category/search/tag/country). |
| `results_wanted` | Integer | No | `20` | Maximum number of products to save (minimum 1). |
| `page_size` | Integer | No | `24` | Number of products requested per page (1 to 48). |
| `proxyConfiguration` | Object | No | `{ "useApifyProxy": false }` | Optional Apify Proxy configuration. |
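
Before starting a run, it can help to validate input locally. The sketch below builds an input object from the parameters above and checks the limits declared in the OpenAPI specification later in this document (`results_wanted` of at least 1, `page_size` between 1 and 48). The function `build_input` and its `use_apify_proxy` argument are hypothetical names, not part of the Actor.

```python
DEFAULT_START_URL = "https://us.puma.com/us/en/men/shoes"


def build_input(
    start_url: str = DEFAULT_START_URL,
    results_wanted: int = 20,
    page_size: int = 24,
    use_apify_proxy: bool = False,
) -> dict:
    """Build an Actor input dict, rejecting values outside the documented limits."""
    if results_wanted < 1:
        raise ValueError("results_wanted must be at least 1")
    if not 1 <= page_size <= 48:
        raise ValueError("page_size must be between 1 and 48")
    return {
        "start_url": start_url,
        "results_wanted": results_wanted,
        "page_size": page_size,
        "proxyConfiguration": {"useApifyProxy": use_apify_proxy},
    }


print(build_input(results_wanted=120))
```

The returned dict can be passed as the run input in the Python client example in the API section.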

---

### Output Data

Each dataset item can include:

| Field | Type | Description |
|-------|------|-------------|
| `product_id` | String | Product variant identifier. |
| `variant_id` | String | Variant ID. |
| `master_id` | String | Master product ID. |
| `name` | String | Product name. |
| `brand` | String | Brand label. |
| `color_name` | String | Color name. |
| `color_code` | String | Color code. |
| `price_regular` | Number | Regular price. |
| `price_sale` | Number | Sale price. |
| `price_promotion` | Number | Promotion price when available. |
| `price_best` | Number | Best available price. |
| `rating` | Number | Average rating. |
| `review_count` | Number | Number of reviews. |
| `product_url` | String | Product detail URL. |
| `image_url` | String | Primary image URL. |
| `all_image_urls` | Array | All available variant image URLs. |
| `sizes` | Array | Size group and size-level availability details. |
| `badge_labels` | Array | Product badge labels. |
| `variant_promotion_messages` | Array | Variant-level promotion messages. |
| `master_promotion_messages` | Array | Master-level promotion messages. |
| `category_name` | String | Category title. |
| `category_path` | String | Category path extracted from URL. |
| `scraped_at` | String | ISO timestamp of extraction. |

---

### Usage Examples

#### Basic Category Run

```json
{
  "start_url": "https://us.puma.com/us/en/men/shoes",
  "results_wanted": 20
}
````

#### Larger Pull

```json
{
  "start_url": "https://us.puma.com/us/en/men/shoes",
  "results_wanted": 120,
  "page_size": 24
}
```

#### Search URL Run

```json
{
  "start_url": "https://us.puma.com/us/en/search?q=running",
  "results_wanted": 60
}
```
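
If search terms come from user input, the `start_url` for a search run can be assembled with URL encoding rather than string concatenation. This is a minimal sketch that assumes the `search?q=` pattern shown above; `search_start_url` is a hypothetical helper, and how the site treats multi-word queries has not been verified here.

```python
from urllib.parse import urlencode


def search_start_url(query: str, base: str = "https://us.puma.com/us/en/search") -> str:
    """Build a search listing URL with a safely encoded query string."""
    return f"{base}?{urlencode({'q': query})}"


print(search_start_url("running"))        # https://us.puma.com/us/en/search?q=running
print(search_start_url("trail running"))  # https://us.puma.com/us/en/search?q=trail+running
```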

***

### Sample Output

```json
{
  "source": "puma",
  "category_url": "https://us.puma.com/us/en/men/shoes",
  "category_path": "/men/shoes",
  "category_id": "mens-shoes",
  "category_name": "Men's Shoes and Sneakers",
  "product_id": "198556306179",
  "variant_id": "198556306179",
  "master_id": "308762",
  "sku": "308762_07",
  "product_url": "https://us.puma.com/us/en/pd/scuderia-ferrari-trinity-2-mens-sneakers/308762?swatch=07",
  "name": "Scuderia Ferrari Trinity 2 Men's Sneakers",
  "brand": "Ferrari",
  "color_name": "PUMA Black-Speed Yellow",
  "color_code": "07",
  "orderable": true,
  "price_regular": 103,
  "price_sale": 103,
  "rating": 0,
  "review_count": 0,
  "badge_labels": ["New"],
  "image_url": "https://images.puma.com/image/upload/f_auto,q_auto,b_rgb:fafafa,w_2000,h_2000/global/308762/07/sv01/fnd/PNA/fmt/png/Scuderia-Ferrari-Trinity-2-Men's-Sneakers",
  "scraped_at": "2026-04-01T10:55:22.102Z"
}
```
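
In the sample record above, the last path segment of `product_url` matches `master_id` and the `swatch` query parameter matches `color_code`. The sketch below extracts both from a URL. That relationship is an observation from this single record rather than a documented guarantee, and `ids_from_product_url` is a hypothetical helper.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlparse


def ids_from_product_url(product_url: str) -> Tuple[str, Optional[str]]:
    """Return (last path segment, swatch query value) from a product URL."""
    parsed = urlparse(product_url)
    last_segment = parsed.path.rstrip("/").rsplit("/", 1)[-1]
    swatch = parse_qs(parsed.query).get("swatch", [None])[0]
    return last_segment, swatch


url = "https://us.puma.com/us/en/pd/scuderia-ferrari-trinity-2-mens-sneakers/308762?swatch=07"
print(ids_from_product_url(url))  # ('308762', '07')
```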

***

### Tips for Best Results

#### Start with QA-Friendly Limits

- Use `results_wanted: 20` for quick checks.
- Increase gradually for larger production snapshots.

#### Keep URL Category-Specific

- Use direct category URLs instead of home pages.
- Validate that the URL resolves to a listing page.

#### Handle Scale with Proxy

- Enable proxy when running frequent or high-volume schedules.
- Keep page size moderate for stable runs.
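
To get a feel for how `page_size` affects run volume, the sketch below estimates the number of listing requests for a given target. It assumes one request per page of results, which is a simplification and not a statement about the Actor's internals; `estimated_page_requests` is a hypothetical helper.

```python
import math


def estimated_page_requests(results_wanted: int, page_size: int) -> int:
    """Estimate listing-page requests, assuming one request per page of results."""
    if results_wanted < 1 or page_size < 1:
        raise ValueError("results_wanted and page_size must be at least 1")
    return math.ceil(results_wanted / page_size)


# Compare a few input combinations from the usage examples.
for wanted, size in [(20, 24), (60, 24), (120, 24), (120, 48)]:
    print(f"results_wanted={wanted}, page_size={size}: "
          f"{estimated_page_requests(wanted, size)} request(s)")
```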

***

### Integrations

Connect scraped data with:

- **Google Sheets** - Track price and assortment changes over time.
- **Airtable** - Build searchable merchandising datasets.
- **Make** - Trigger downstream workflows after each run.
- **Zapier** - Send updates to CRMs, dashboards, or alerts.
- **Webhooks** - Push run results to your own services.

#### Export Formats

- **JSON** - For APIs and automation pipelines.
- **CSV** - For spreadsheet workflows.
- **Excel** - For reporting and business review.
- **XML** - For legacy integrations.
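
These formats can also be requested directly from the Apify dataset items endpoint through its `format` query parameter. The sketch below only builds the download URL; `dataset_export_url` is a hypothetical helper, `<DATASET_ID>` is a placeholder, and depending on your dataset's access settings an API token may also be required.

```python
from urllib.parse import urlencode

# Format values matching the export formats listed above (Excel is requested as "xlsx").
EXPORT_FORMATS = {"json", "csv", "xlsx", "xml"}


def dataset_export_url(dataset_id: str, fmt: str = "json") -> str:
    """Build a dataset items URL for the given export format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


print(dataset_export_url("<DATASET_ID>", "csv"))
```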

***

### Frequently Asked Questions

#### How many products can I scrape?

You can request any count, but practical limits depend on page availability and run time constraints.

#### Can I scrape categories other than men's shoes?

Yes. Provide a different Puma category URL in `start_url`.

#### Why are some optional fields missing in output?

The Actor excludes empty values by design, so records remain compact and clean.
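
Because keys can be absent, downstream code that expects fixed columns should fill the gaps itself. The sketch below writes items to CSV with a fixed column list, leaving missing fields blank. The column selection and the helper name `to_csv` are illustrative assumptions.

```python
import csv
import io
from typing import List

COLUMNS = ["product_id", "name", "price_regular", "price_sale", "price_promotion"]


def to_csv(items: List[dict], columns: List[str] = COLUMNS) -> str:
    """Serialize items to CSV text, writing an empty cell for any omitted field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        restval="",             # value used when a key is missing from an item
        extrasaction="ignore",  # drop keys that are not in the column list
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


# Invented record with no sale or promotion price.
print(to_csv([{"product_id": "1", "name": "Example Shoe", "price_regular": 10}]))
```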

#### Does the Actor support pagination automatically?

Yes. It paginates until your target count is reached or no more products are available.
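
The stopping rule described here can be sketched as a simple loop. This is a generic illustration of "stop at the target count or when a page comes back empty", not the Actor's actual code; `collect` and the `fetch_page(offset, limit)` callable are hypothetical.

```python
from typing import Callable, List


def collect(
    fetch_page: Callable[[int, int], List[dict]],
    results_wanted: int,
    page_size: int,
) -> List[dict]:
    """Fetch pages until results_wanted items are collected or a page is empty."""
    collected: List[dict] = []
    offset = 0
    while len(collected) < results_wanted:
        page = fetch_page(offset, page_size)
        if not page:
            break  # no more products available
        collected.extend(page)
        offset += len(page)
    return collected[:results_wanted]


# Stand-in for a listing source with 50 products.
catalog = [{"product_id": str(i)} for i in range(50)]


def fake_fetch(offset: int, limit: int) -> List[dict]:
    return catalog[offset:offset + limit]


print(len(collect(fake_fetch, 30, 24)))   # 30
print(len(collect(fake_fetch, 100, 24)))  # 50
```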

#### Can I run this on a schedule?

Yes. Use Apify schedules for recurring data snapshots.

#### What if the target page structure changes?

Run a small test and then adjust input/filter settings as needed.

***

### Support

For issues or feature requests, open an issue for this Actor in Apify Console.

#### Resources

- [Apify Documentation](https://docs.apify.com/)
- [Apify API Reference](https://docs.apify.com/api/v2)
- [Apify Schedules](https://docs.apify.com/platform/schedules)

***

### Legal Notice

This Actor is intended for lawful data collection and analysis. You are responsible for complying with website terms, local regulations, and internal data governance policies.

# Actor input Schema

## `start_url` (type: `string`):

Any Puma listing URL. Examples: category, search, tag, or country-level URL.

## `results_wanted` (type: `integer`):

Maximum number of products to collect.

## `page_size` (type: `integer`):

Number of products requested per page.

## `proxyConfiguration` (type: `object`):

Use Apify Proxy if your IP gets rate-limited.

## Actor input object example

```json
{
  "start_url": "https://us.puma.com/us/en/men/shoes",
  "results_wanted": 20,
  "page_size": 24,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "start_url": "https://us.puma.com/us/en/men/shoes",
    "results_wanted": 20,
    "page_size": 24
};

// Run the Actor and wait for it to finish
const run = await client.actor("shahidirfan/puma-product-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "start_url": "https://us.puma.com/us/en/men/shoes",
    "results_wanted": 20,
    "page_size": 24,
}

# Run the Actor and wait for it to finish
run = client.actor("shahidirfan/puma-product-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "start_url": "https://us.puma.com/us/en/men/shoes",
  "results_wanted": 20,
  "page_size": 24
}' |
apify call shahidirfan/puma-product-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=shahidirfan/puma-product-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Puma Product Scraper",
        "description": "Extract Puma product data at scale. Scrape prices, descriptions, images, and reviews from Puma.com. Real-time monitoring, zero blocks. Perfect for price tracking, competitive analysis, and inventory management.",
        "version": "1.0",
        "x-build-id": "Nd1tk4sod9eZc5ebZ"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/shahidirfan~puma-product-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-shahidirfan-puma-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/shahidirfan~puma-product-scraper/runs": {
            "post": {
                "operationId": "runs-sync-shahidirfan-puma-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/shahidirfan~puma-product-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-shahidirfan-puma-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "start_url": {
                        "title": "Puma URL",
                        "type": "string",
                        "description": "Any Puma listing URL. Examples: category, search, tag, or country-level URL."
                    },
                    "results_wanted": {
                        "title": "Results wanted",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of products to collect.",
                        "default": 20
                    },
                    "page_size": {
                        "title": "Page size",
                        "minimum": 1,
                        "maximum": 48,
                        "type": "integer",
                        "description": "Number of products requested per page.",
                        "default": 24
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Use Apify Proxy if your IP gets rate-limited.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
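
For projects that call the REST API directly, the `run-sync-get-dataset-items` path from the specification above can be exercised with the Python standard library alone. The sketch below assumes the response body is a JSON array of dataset items; the function names and the 300-second timeout are illustrative choices, and `<YOUR_API_TOKEN>` is a placeholder.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/shahidirfan~puma-product-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request carrying the Actor input as a JSON body."""
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_get_items(token: str, actor_input: dict, timeout: float = 300.0) -> list:
    """Send the request and decode the response body as JSON."""
    request = build_run_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Replace the placeholder with a real token before running.
    items = run_and_get_items("<YOUR_API_TOKEN>", {"results_wanted": 20})
    print(len(items))
```

For long runs, the asynchronous `/runs` path in the same specification, followed by a separate dataset read, may be a better fit than a single blocking request.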
