# Shopify Scraper (`crawlerbros/shopify-scraper`) Actor

Scrape products from any Shopify store. Extract product titles, prices, variants, SKUs, images, descriptions, inventory availability, and more using Shopify's public products.json API.

- **URL**: https://apify.com/crawlerbros/shopify-scraper.md
- **Developed by:** [Crawler Bros](https://apify.com/crawlerbros) (community)
- **Categories:** E-commerce, Lead generation, Other
- **Stats:** 3 total users, 2 monthly users, 100.0% runs succeeded, 9 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $1.00 / 1,000 results

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.
Since this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
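
As a rough illustration of the per-result part of the bill, the sketch below multiplies a result count by the listed "from" rate of $1.00 per 1,000 results. The constant and the function name are hypothetical; the actual rate depends on your subscription plan, and Apify platform usage is charged on top.

```python
from decimal import Decimal

# Assumption: the listed "from" rate; your plan's rate may differ.
PRICE_PER_1000_RESULTS_USD = Decimal("1.00")


def estimate_result_charge(num_results: int) -> Decimal:
    """Estimate the per-result charge in USD, excluding platform usage."""
    if num_results < 0:
        raise ValueError("num_results must be non-negative")
    charge = PRICE_PER_1000_RESULTS_USD * num_results / 1000
    return charge.quantize(Decimal("0.0001"))


# Example: a run that stores 250 products.
print(estimate_result_charge(250))
```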

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Shopify Scraper

Scrape product data from any Shopify store. Extract product titles, descriptions, prices, variants, SKUs, images, tags, inventory availability, and more.

### What can this scraper do?

- **Scrape any Shopify store** — Works with all Shopify-powered online stores (custom domains and .myshopify.com)
- **Product details** — Extract titles, descriptions, vendors, product types, and tags
- **Pricing data** — Get current prices and compare-at prices for sale detection
- **Variants and SKUs** — Full variant information including SKU, price, and stock availability per variant
- **Product images** — All image URLs for each product
- **Inventory status** — Check which products and variants are currently in stock
- **Fast and lightweight** — Uses Shopify's public products.json API with pure HTTP requests (no browser needed)
- **Bulk extraction** — Scrape hundreds or thousands of products with automatic pagination

### Input

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `storeUrl` | string | Yes | — | Shopify store URL (e.g., `https://www.allbirds.com` or `https://shop-name.myshopify.com`) |
| `maxProducts` | integer | No | 50 | Maximum number of products to scrape (minimum 1) |
| `proxy` | object | No | — | Proxy configuration. Use residential proxy if you get 403 errors |

#### Example input

```json
{
    "storeUrl": "https://www.allbirds.com",
    "maxProducts": 100
}
````

### Output

Each product in the dataset contains the following fields:

| Field | Type | Description |
|-------|------|-------------|
| `id` | number | Shopify product ID |
| `title` | string | Product title |
| `handle` | string | URL-friendly product slug |
| `url` | string | Full product page URL |
| `description` | string | Product description (HTML stripped) |
| `vendor` | string | Product vendor or brand name |
| `productType` | string | Product category or type |
| `tags` | array | List of product tags |
| `price` | string | Current price (from first variant) |
| `compareAtPrice` | string | Original price before discount |
| `available` | boolean | Whether any variant is in stock |
| `variants` | array | All product variants with id, title, sku, price, compareAtPrice, available, option1/2/3 |
| `images` | array | All product image URLs |
| `createdAt` | string | When the product was created in the store |
| `updatedAt` | string | When the product was last updated |
| `storeUrl` | string | The store URL that was scraped |
| `scrapedAt` | string | ISO 8601 timestamp of when the data was collected |

#### Sample output

```json
{
    "id": 6707741982800,
    "title": "Men's Tree Runners",
    "handle": "mens-tree-runners",
    "url": "https://www.allbirds.com/products/mens-tree-runners",
    "description": "Our breathable, silky-smooth sneaker made with responsibly sourced eucalyptus tree fiber.",
    "vendor": "Allbirds",
    "productType": "Shoes",
    "tags": ["men", "runners", "tree"],
    "price": "98.00",
    "compareAtPrice": "",
    "available": true,
    "variants": [
        {
            "id": 39804040413264,
            "title": "8 / Kauri Marine Blue (Medium Grey Sole)",
            "sku": "TR-MBG-08",
            "price": "98.00",
            "compareAtPrice": "",
            "available": true,
            "option1": "8",
            "option2": "Kauri Marine Blue (Medium Grey Sole)",
            "option3": ""
        }
    ],
    "images": [
        "https://cdn.shopify.com/s/files/1/0018/6832/2368/products/image.jpg"
    ],
    "createdAt": "2022-07-08T14:11:57-07:00",
    "updatedAt": "2026-04-02T03:26:57-07:00",
    "storeUrl": "https://www.allbirds.com",
    "scrapedAt": "2026-04-02T12:00:00.000000+00:00"
}
```
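
Because `price` and `compareAtPrice` are strings, and `compareAtPrice` is empty when there is no original price, sale detection needs a little parsing. The following is a minimal sketch of one way to do it; `discount_percent` is a hypothetical helper, not part of the Actor.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def discount_percent(item: dict) -> Optional[Decimal]:
    """Return the discount in percent, or None when the item is not on sale."""
    try:
        price = Decimal(item.get("price") or "")
        compare_at = Decimal(item.get("compareAtPrice") or "")
    except InvalidOperation:
        return None  # empty or non-numeric price strings
    if compare_at <= 0 or price >= compare_at:
        return None
    return ((compare_at - price) / compare_at * 100).quantize(Decimal("0.1"))


# Hypothetical records shaped like the dataset items above.
full_price = {"price": "98.00", "compareAtPrice": ""}
on_sale = {"price": "75.00", "compareAtPrice": "100.00"}
print(discount_percent(full_price), discount_percent(on_sale))
```

The same function can be applied to each entry in `variants`, since variants carry their own `price` and `compareAtPrice`.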

### How it works

1. **URL normalization** — Extracts the store domain from any input URL
2. **API pagination** — Fetches products from the Shopify `/products.json` endpoint in batches of up to 250
3. **Data parsing** — Converts raw Shopify JSON into a clean, consistent output format
4. **Rate limiting** — Adds a polite 1-second delay between page requests to avoid overwhelming the store
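
The steps above can be sketched roughly as follows. This is not the Actor's source code: the function names are hypothetical, and the `limit` and `page` query parameters are an assumption about how `/products.json` is paged. The page fetcher is injectable so the loop can be exercised without network access.

```python
import json
import time
import urllib.request
from typing import Callable, Iterator, List
from urllib.parse import urlsplit

PAGE_SIZE = 250  # batch size mentioned above


def normalize_store_url(url: str) -> str:
    """Reduce any URL on the store to its scheme and host."""
    parts = urlsplit(url if "://" in url else "https://" + url)
    return f"{parts.scheme}://{parts.netloc}"


def fetch_page(base_url: str, page: int, limit: int) -> List[dict]:
    """Fetch one page of raw products over plain HTTP (hypothetical helper)."""
    url = f"{base_url}/products.json?limit={limit}&page={page}"
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response).get("products", [])


def iter_products(
    store_url: str,
    max_products: int,
    fetch: Callable[[str, int, int], List[dict]] = fetch_page,
    delay_seconds: float = 1.0,
) -> Iterator[dict]:
    """Yield up to max_products raw products, one page at a time."""
    base_url = normalize_store_url(store_url)
    yielded, page = 0, 1
    while yielded < max_products:
        products = fetch(base_url, page, PAGE_SIZE)
        if not products:
            return  # past the last page
        for product in products:
            yield product
            yielded += 1
            if yielded >= max_products:
                return
        if len(products) < PAGE_SIZE:
            return  # a short page means the catalog is exhausted
        page += 1
        time.sleep(delay_seconds)  # polite pause between page requests
```

The page size stays fixed across requests so that page offsets remain consistent; the `maxProducts` cap is applied by stopping early rather than by shrinking the last request.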

### Tips for best results

- Start with a small `maxProducts` value (5-10) to verify the store works before running larger jobs
- Most Shopify stores support the `/products.json` endpoint, but some may have it disabled or behind a WAF
- If you get 403 errors, try enabling a residential proxy in the input configuration
- The scraper fetches up to 250 products per API request, so even large catalogs are scraped efficiently
- Custom Shopify domains (e.g., `www.allbirds.com`) and default `.myshopify.com` domains both work

### Limitations

- Only works with Shopify-powered stores — other e-commerce platforms are not supported
- Some stores disable the `/products.json` API endpoint or protect it with a firewall
- Product descriptions are stripped of HTML formatting (plain text only)
- The `price` and `compareAtPrice` fields reflect the first variant's pricing
- Stores with password-protected storefronts cannot be scraped

### Frequently Asked Questions

**Do I need an API key or Shopify account?**
No. This scraper uses the public `/products.json` endpoint that is available on most Shopify stores without authentication.

**How do I know if a store is built with Shopify?**
Try adding `/products.json` to the store URL. If it returns a JSON response with a `products` array, the store is Shopify-powered and this scraper will work.
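
If you want to automate that check, a minimal sketch could look like this. It only inspects a response body you have already downloaded; `looks_like_shopify_products` is a hypothetical helper name.

```python
import json


def looks_like_shopify_products(body: str) -> bool:
    """Return True when a /products.json response body holds a `products` array."""
    try:
        data = json.loads(body)
    except ValueError:
        return False  # HTML error pages, password pages, and so on
    return isinstance(data, dict) and isinstance(data.get("products"), list)


print(looks_like_shopify_products('{"products": []}'))
print(looks_like_shopify_products("<html>Not found</html>"))
```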

**Can I scrape a password-protected Shopify store?**
No. Password-protected Shopify storefronts require authentication that this scraper does not support.

**How many products can I scrape?**
There is no hard limit. Set `maxProducts` to any value. The scraper paginates automatically through the store's entire catalog. Shopify returns up to 250 products per page.

**Why am I getting 403 errors?**
Some Shopify stores use a Web Application Firewall (WAF) like Cloudflare that blocks automated requests. Enable the proxy option with a residential proxy to bypass this.
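
A run input with a residential proxy might look like the sketch below. The input schema only declares `proxy` as an object, so the keys shown here are an assumption based on the standard Apify proxy configuration; check the Actor's input form for the exact shape it expects.

```python
run_input = {
    "storeUrl": "https://www.allbirds.com",
    "maxProducts": 5,
    # Assumed keys: the standard Apify proxy configuration object.
    "proxy": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}
```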

**What data is available per variant?**
Each variant includes its ID, title, SKU, price, compare-at price, stock availability, and up to three option values (e.g., size, color, material).
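
For spreadsheet-style exports it can help to flatten each product into one row per variant. The sketch below shows one possible shape; `variant_rows` and the row keys are hypothetical and only rely on the variant fields listed in the output table.

```python
from typing import Iterator


def variant_rows(item: dict) -> Iterator[dict]:
    """Yield one flat row per variant, carrying the parent product's id and title."""
    for variant in item.get("variants", []):
        yield {
            "productId": item.get("id"),
            "productTitle": item.get("title"),
            "variantId": variant.get("id"),
            "sku": variant.get("sku"),
            "price": variant.get("price"),
            "available": variant.get("available"),
            # Keep only the option values that are actually set.
            "options": [
                variant[key]
                for key in ("option1", "option2", "option3")
                if variant.get(key)
            ],
        }


# A trimmed record shaped like the sample output.
product = {
    "id": 6707741982800,
    "title": "Men's Tree Runners",
    "variants": [
        {
            "id": 39804040413264,
            "sku": "TR-MBG-08",
            "price": "98.00",
            "available": True,
            "option1": "8",
            "option2": "Kauri Marine Blue (Medium Grey Sole)",
            "option3": "",
        }
    ],
}
in_stock = [row for row in variant_rows(product) if row["available"]]
print(in_stock)
```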

**Can I get the full HTML product description?**
The scraper strips HTML tags for clean text output. The raw HTML is not included in the output.

**Does this work with Shopify Plus stores?**
Yes. Shopify Plus stores use the same `/products.json` endpoint as standard Shopify stores.

**How fast is this scraper?**
It uses direct HTTP requests (no browser), and each request returns up to 250 products. A 1-second delay between page requests keeps the scraper polite and helps avoid rate limiting, so throughput stays below 250 products per second.
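
To get a feel for run time, the arithmetic can be sketched as below. It assumes pages of 250 products and a 1-second pause between consecutive pages only (none after the last); the function names are hypothetical, and request latency comes on top of the pause.

```python
import math

PAGE_SIZE = 250       # products per request, as stated above
DELAY_SECONDS = 1.0   # pause between page requests, as stated above


def page_count(max_products: int) -> int:
    """Number of page requests needed to collect max_products items."""
    return math.ceil(max_products / PAGE_SIZE)


def total_pause_seconds(max_products: int) -> float:
    """Time spent waiting between pages, excluding the requests themselves."""
    return max(page_count(max_products) - 1, 0) * DELAY_SECONDS


for n in (50, 1000, 5000):
    print(n, page_count(n), total_pause_seconds(n))
```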

# Actor input Schema

## `storeUrl` (type: `string`):

URL of the Shopify store to scrape. Can be the homepage or any page on the store (e.g., https://www.allbirds.com or https://shop-name.myshopify.com).

## `maxProducts` (type: `integer`):

Maximum number of products to scrape from the store.

## `proxy` (type: `object`):

Optional proxy settings. Some Shopify stores may block datacenter IPs — use residential proxy if you get 403 errors.

## Actor input object example

```json
{
  "storeUrl": "https://www.allbirds.com",
  "maxProducts": 5
}
```

# Actor output Schema

## `products` (type: `string`):

Dataset containing all scraped Shopify products

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "storeUrl": "https://www.allbirds.com",
    "maxProducts": 5
};

// Run the Actor and wait for it to finish
const run = await client.actor("crawlerbros/shopify-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "storeUrl": "https://www.allbirds.com",
    "maxProducts": 5,
}

# Run the Actor and wait for it to finish
run = client.actor("crawlerbros/shopify-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "storeUrl": "https://www.allbirds.com",
  "maxProducts": 5
}' |
apify call crawlerbros/shopify-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=crawlerbros/shopify-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Shopify Scraper",
        "description": "Scrape products from any Shopify store. Extract product titles, prices, variants, SKUs, images, descriptions, inventory availability, and more using Shopify's public products.json API.",
        "version": "1.0",
        "x-build-id": "4yJQhdnjxq9xSuN3U"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/crawlerbros~shopify-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-crawlerbros-shopify-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/crawlerbros~shopify-scraper/runs": {
            "post": {
                "operationId": "runs-sync-crawlerbros-shopify-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/crawlerbros~shopify-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-crawlerbros-shopify-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "storeUrl"
                ],
                "properties": {
                    "storeUrl": {
                        "title": "Shopify Store URL",
                        "type": "string",
                        "description": "URL of the Shopify store to scrape. Can be the homepage or any page on the store (e.g., https://www.allbirds.com or https://shop-name.myshopify.com)."
                    },
                    "maxProducts": {
                        "title": "Max Products",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of products to scrape from the store.",
                        "default": 50
                    },
                    "proxy": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Optional proxy settings. Some Shopify stores may block datacenter IPs — use residential proxy if you get 403 errors."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
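
For projects without a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The following is a minimal sketch using only the Python standard library; `build_request` and `run_actor` are hypothetical helper names, and the timeout value is an arbitrary choice for a call that blocks until the run finishes.

```python
import json
import urllib.request
from urllib.parse import quote

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/crawlerbros~shopify-scraper/run-sync-get-dataset-items"


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build the POST request: token in the query string, input as a JSON body."""
    url = f"{API_BASE}{ACTOR_PATH}?token={quote(token, safe='')}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token: str, run_input: dict) -> list:
    """Send the request and return the dataset items (performs a network call)."""
    request = build_request(token, run_input)
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)


# Usage (requires a valid token and network access):
# items = run_actor("<YOUR_API_TOKEN>", {"storeUrl": "https://www.allbirds.com", "maxProducts": 5})
```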
