# Shopify Product Scraper (`glassventures/shopify-product-scraper`) Actor

Extract products, prices, variants, and images from any Shopify store. Uses JSON API for fastest extraction. No browser needed.

- **URL**: https://apify.com/glassventures/shopify-product-scraper.md
- **Developed by:** [Glass Ventures](https://apify.com/glassventures) (community)
- **Categories:** E-commerce
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you only pay for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Shopify Product Scraper

Extract products, prices, variants, and images from any Shopify store. Uses Shopify's built-in JSON API for the fastest and most reliable extraction -- no browser needed.

### What does Shopify Product Scraper do?

Shopify Product Scraper extracts complete product data from any Shopify-powered online store. It leverages Shopify's native `/products.json` API endpoint, which means it is significantly faster and more reliable than scrapers that parse HTML. No browser rendering, no JavaScript execution -- just clean, structured JSON data.

The Actor supports three kinds of input: a store URL to scrape all products, a collection URL to scrape a specific category, or individual product URLs for targeted extraction. It handles pagination automatically (up to 250 products per page) and deduplicates results.
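The pagination and deduplication described above can be sketched roughly as follows. This is an illustration only, not the Actor's source code: `fetch_page` is a hypothetical callable standing in for an HTTP GET of `/products.json?limit=250&page=N`, and deduplicating on the product `id` field is an assumption.

```python
from typing import Callable, Dict, Iterator, List

PAGE_SIZE = 250  # Shopify's per-page maximum, as stated above


def iter_products(
    fetch_page: Callable[[int], List[Dict]],
    max_items: int = 0,
) -> Iterator[Dict]:
    """Yield unique products page by page.

    fetch_page(page) is a hypothetical helper returning the "products"
    list for a 1-based page number. max_items=0 means unlimited.
    """
    seen_ids = set()
    yielded = 0
    page = 1
    while True:
        products = fetch_page(page)
        for product in products:
            if product["id"] in seen_ids:
                continue  # skip duplicates (assumed key: product id)
            seen_ids.add(product["id"])
            yield product
            yielded += 1
            if max_items and yielded >= max_items:
                return
        if len(products) < PAGE_SIZE:
            return  # a short or empty page means nothing is left
        page += 1
```

Passing the fetcher in as a parameter keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned pages.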

Every product includes full variant data (sizes, colors, SKUs, individual prices), all product images with dimensions and alt text, tags, vendor information, and availability status -- making it ideal for price monitoring, competitor analysis, and product research.

### Use Cases

- **Competitor price monitoring** -- Track competitor prices and stock levels across Shopify stores. Get compare-at prices to detect discounts and sales.
- **Product research & dropshipping** -- Find trending products, compare vendors, and analyze product catalogs for dropshipping opportunities.
- **Market analysis** -- Analyze product categories, pricing strategies, and inventory across multiple Shopify stores.
- **Data analysts** -- Build product databases, price history datasets, and e-commerce market intelligence reports.
- **Developers** -- Integrate Shopify product data into your apps via Apify API, webhooks, or direct dataset access.
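For the price-monitoring use case, a discount can be derived from the `price` and `compareAtPrice` fields of each dataset item. The helper below is a minimal sketch; the function name and the rounding to one decimal place are illustrative choices, not part of the Actor.

```python
from typing import Dict, Optional


def discount_percent(item: Dict) -> Optional[float]:
    """Return the discount in percent, or None when no discount is visible."""
    price = item.get("price")
    compare_at = item.get("compareAtPrice")
    # No compare-at price, or one that is not above the current price,
    # means there is nothing to report.
    if price is None or not compare_at or compare_at <= price:
        return None
    return round((compare_at - price) / compare_at * 100, 1)
```

An item with `"price": 80` and `"compareAtPrice": 100` yields `20.0`; an item whose `compareAtPrice` is `null` yields `None`.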

### Features

- Scrape ALL products from any Shopify store
- Support for store URLs, collection URLs, and individual product URLs
- Full variant extraction: size, color, SKU, price per variant
- All product images with alt text and dimensions
- Compare-at prices for discount tracking
- Automatic pagination (250 products per page, Shopify max)
- Built-in deduplication
- Uses Shopify JSON API -- fastest possible extraction
- Proxy support for stores that rate-limit
- Exports to JSON, CSV, Excel, or connect via API
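The platform handles dataset exports itself. If a one-row-per-variant layout is needed instead, items can also be flattened locally after download. The sketch below assumes items shaped like the output example in this README; the column selection and the `variantTitle` column name are hypothetical.

```python
import csv
import io
from typing import Dict, Iterable, List

COLUMNS = ["handle", "title", "vendor", "variantTitle", "sku", "price", "available"]


def variant_rows(items: Iterable[Dict]) -> List[Dict]:
    """Produce one flat row per variant, repeating the product-level fields."""
    rows = []
    for item in items:
        for variant in item.get("variants") or []:
            rows.append({
                "handle": item.get("handle"),
                "title": item.get("title"),
                "vendor": item.get("vendor"),
                "variantTitle": variant.get("title"),
                "sku": variant.get("sku"),
                "price": variant.get("price"),
                "available": variant.get("available"),
            })
    return rows


def to_csv(items: Iterable[Dict]) -> str:
    """Serialize the flattened rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(variant_rows(items))
    return buffer.getvalue()
```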

### Pricing

This Actor is **free to use** -- you only pay for Apify platform compute time.

| Products | Estimated Cost | Time   |
|----------|---------------|--------|
| 100      | ~$0.01        | ~10s   |
| 1,000    | ~$0.05        | ~1min  |
| 10,000   | ~$0.25        | ~5min  |

Costs are minimal because the Actor uses HTTP requests only (no browser), and most Shopify stores do not require proxies.

### How to use

1. Go to the Shopify Product Scraper page on Apify Store
2. Click "Start" or "Try for free"
3. Enter one or more Shopify store URLs, collection URLs, or product URLs
4. Set the maximum number of products to scrape
5. Click "Start" and wait for results

### How to find Shopify stores

Shopify stores can be identified in several ways:

- **Direct Shopify domains**: Many stores use `store-name.myshopify.com`
- **Custom domains**: Most stores use custom domains. You can check if a site is Shopify by visiting `example.com/products.json` -- if it returns JSON product data, it is very likely a Shopify store.
- **Built With tools**: Use services like BuiltWith or Wappalyzer to detect Shopify-powered sites.
- **Page source**: Look for `cdn.shopify.com` in the page source code.
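The `/products.json` check from the list above can be scripted. The sketch below only builds the probe URL and inspects a response body; fetching is left to whatever HTTP client you use. The function names are hypothetical, the `limit=1` query parameter is an assumption made to keep the response small, and the check is a heuristic rather than proof.

```python
import json
from urllib.parse import urlsplit, urlunsplit


def products_json_url(site_url: str) -> str:
    """Build the /products.json probe URL from a full URL (scheme included)."""
    parts = urlsplit(site_url)
    return urlunsplit((parts.scheme, parts.netloc, "/products.json", "limit=1", ""))


def looks_like_shopify(body: str) -> bool:
    """Heuristic: the body parses as JSON with a top-level "products" list."""
    try:
        data = json.loads(body)
    except ValueError:
        return False  # HTML or other non-JSON response
    return isinstance(data, dict) and isinstance(data.get("products"), list)
```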

### Input parameters

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| startUrls | array | Shopify store, product, or collection URLs | - |
| maxItems | integer | Maximum products to scrape (0 = unlimited) | 100 |
| includeVariants | boolean | Include size/color/SKU variants | true |
| includeHtml | boolean | Include raw HTML product description | false |
| proxyConfig | object | Proxy settings (usually not needed) | No proxy |
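When calling the Actor from code, the parameters above can be assembled with a small helper. This is a sketch: the function name is hypothetical, and rejecting an empty URL list is an assumption, since the input schema does not mark any field as required.

```python
from typing import Dict, List


def build_run_input(
    urls: List[str],
    max_items: int = 100,
    include_variants: bool = True,
    include_html: bool = False,
) -> Dict:
    """Assemble an input object using the defaults from the table above."""
    if not urls:
        # Assumption: a run without start URLs has nothing to scrape.
        raise ValueError("at least one start URL is required")
    if max_items < 0:
        raise ValueError("maxItems must be 0 (unlimited) or a positive integer")
    return {
        "startUrls": [{"url": url} for url in urls],
        "maxItems": max_items,
        "includeVariants": include_variants,
        "includeHtml": include_html,
    }
```

The returned dictionary can be passed as `run_input` to the Python client's `call()` method.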

### Output

The Actor produces a dataset in which each item has the following shape:

```json
{
    "url": "https://www.allbirds.com/products/wool-runners",
    "title": "Men's Wool Runners",
    "handle": "wool-runners",
    "vendor": "Allbirds",
    "productType": "Shoes",
    "tags": ["mens", "runners", "wool"],
    "price": 110,
    "compareAtPrice": null,
    "currency": null,
    "available": true,
    "description": "Our original shoe, made with superfine merino wool...",
    "images": [
        {
            "src": "https://cdn.shopify.com/s/files/1/image.jpg",
            "alt": "Men's Wool Runners",
            "width": 1600,
            "height": 1600
        }
    ],
    "variants": [
        {
            "title": "8 / Natural Black (Black Sole)",
            "price": 110,
            "compareAtPrice": null,
            "sku": "WR-NNZ-NB-M8",
            "available": true,
            "option1": "8",
            "option2": "Natural Black (Black Sole)",
            "option3": null
        }
    ],
    "createdAt": "2023-05-15T10:30:00-04:00",
    "updatedAt": "2024-12-01T08:15:00-05:00",
    "scrapedAt": "2026-04-23T14:30:00.000Z"
}
````

| Field | Type | Description |
|-------|------|-------------|
| url | string | Product page URL |
| title | string | Product title |
| handle | string | URL-friendly product slug |
| vendor | string | Brand or vendor name |
| productType | string | Product category |
| tags | array | Product tags |
| price | number | Lowest variant price |
| compareAtPrice | number or null | Original price (before discount); null when not set |
| currency | string or null | Currency code (when available, otherwise null) |
| available | boolean | Whether any variant is in stock |
| description | string | Plain text product description |
| images | array | Product images with src, alt, dimensions |
| variants | array | Variants with title, price, sku, options |
| createdAt | string | Product creation date |
| updatedAt | string | Last update date |
| scrapedAt | string | ISO 8601 scrape timestamp |
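The table defines the top-level `price` as the lowest variant price and `available` as true when any variant is in stock. A minimal sketch of that aggregation follows, handy for re-checking items downstream; the function name is hypothetical and this is not the Actor's actual implementation.

```python
from typing import Dict, List


def summarize_variants(variants: List[Dict]) -> Dict[str, object]:
    """Derive product-level price and availability from a variants list."""
    prices = [v["price"] for v in variants if v.get("price") is not None]
    return {
        # Lowest variant price, or None when no variant carries a price.
        "price": min(prices) if prices else None,
        # True as soon as one variant is in stock.
        "available": any(v.get("available") for v in variants),
    }
```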

### Integrations

Connect Shopify Product Scraper with other tools:

- **Apify API** -- REST API for programmatic access
- **Webhooks** -- Get notified when a run finishes
- **Zapier / Make** -- Connect to 5,000+ apps
- **Google Sheets** -- Export directly to spreadsheets

#### API Example (Node.js)

```javascript
import { ApifyClient } from 'apify-client';
const client = new ApifyClient({ token: 'YOUR_TOKEN' });
const run = await client.actor('glassventures/shopify-product-scraper').call({
    startUrls: [{ url: 'https://www.allbirds.com' }],
    maxItems: 100,
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);
```

#### API Example (Python)

```python
from apify_client import ApifyClient
client = ApifyClient('YOUR_TOKEN')
run = client.actor('glassventures/shopify-product-scraper').call(run_input={
    'startUrls': [{'url': 'https://www.allbirds.com'}],
    'maxItems': 100,
})
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)
```

#### API Example (cURL)

```bash
curl "https://api.apify.com/v2/acts/glassventures~shopify-product-scraper/runs" \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{"startUrls": [{"url": "https://www.allbirds.com"}], "maxItems": 100}'
```

### Tips and tricks

- Start with a small `maxItems` (10-20) to test before running large scrapes
- Most Shopify stores work without proxies. Enable Apify Proxy only if you hit rate limits.
- Use collection URLs to scrape specific product categories (e.g., `https://store.com/collections/sale`)
- The Actor fetches up to 250 products per API call, which keeps the number of requests low
- Keep `includeVariants: true` (the default) to get per-variant pricing for accurate price comparison
- Use `includeHtml: false` (default) for cleaner data; enable it only if you need rich text formatting

### FAQ

**Q: Does this Actor require login credentials?**
A: No. It only accesses publicly available product data through Shopify's JSON API.

**Q: How fast is the scraping?**
A: Extremely fast. The Actor can scrape 1,000+ products per minute since it uses direct JSON API calls without browser rendering.

**Q: What should I do if I get blocked?**
A: Enable Apify Proxy in the Proxy Configuration settings. Datacenter proxies usually work; switch to residential if needed.

**Q: Does it work with password-protected stores?**
A: No. Password-protected Shopify stores block API access. The Actor will report an access denied error.

**Q: Why is currency sometimes null?**
A: Shopify's `/products.json` endpoint does not include currency information. Prices are in the store's default currency (usually visible on the storefront).

**Q: Can I scrape specific collections?**
A: Yes. Provide a collection URL like `https://store.com/collections/sale` and the Actor will only scrape products from that collection.

### Is it legal to scrape Shopify stores?

Scraping publicly available data is generally considered legal, based on precedents such as hiQ Labs v. LinkedIn. This Actor only accesses publicly available product data through Shopify's built-in JSON API. Always review and respect the target store's Terms of Service. For more information, see [Apify's blog on web scraping legality](https://blog.apify.com/is-web-scraping-legal/).

### Limitations

- Password-protected Shopify stores cannot be scraped
- Currency is not included in the JSON API response (prices are in the store's default currency)
- Some stores may rate-limit requests; use proxy configuration if this happens
- The Shopify JSON API has a maximum of 250 products per page
- Very large stores (50,000+ products) may take longer due to pagination
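To put the pagination limit in numbers: at 250 products per page, the number of list requests grows linearly with catalog size. The helper below is a small illustrative calculation and its name is hypothetical.

```python
import math

PAGE_SIZE = 250  # Shopify's per-page maximum


def pages_needed(product_count: int, page_size: int = PAGE_SIZE) -> int:
    """Return how many list requests are needed to cover product_count items."""
    return math.ceil(product_count / page_size)
```

A 50,000-product store therefore needs at least 200 page requests, compared with 4 for a 1,000-product store.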

### Changelog

- **v0.1** (2026-04-23) -- Initial release

# Actor input Schema

## `startUrls` (type: `array`):

Shopify store URLs, product URLs, or collection URLs. Examples: https://www.allbirds.com, https://www.allbirds.com/collections/mens, https://www.allbirds.com/products/wool-runners
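The three URL kinds can be told apart by their path segments. The sketch below shows one way to do that. It is an assumption about how such URLs might be classified, not the Actor's actual routing logic, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlsplit


def _segment_after(segments: List[str], marker: str) -> Optional[str]:
    """Return the path segment that follows marker, if there is one."""
    if marker in segments:
        index = segments.index(marker)
        if index + 1 < len(segments):
            return segments[index + 1]
    return None


def classify_shopify_url(url: str) -> Tuple[str, Optional[str]]:
    """Return (kind, handle); kind is 'product', 'collection', or 'store'."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    # Check products first: product URLs can be nested under a collection,
    # e.g. /collections/<collection>/products/<handle>.
    product = _segment_after(segments, "products")
    if product:
        return ("product", product)
    collection = _segment_after(segments, "collections")
    if collection:
        return ("collection", collection)
    return ("store", None)
```

With the example URLs above, `https://www.allbirds.com` is classified as a store, `/collections/mens` as a collection with handle `mens`, and `/products/wool-runners` as a product with handle `wool-runners`.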

## `maxItems` (type: `integer`):

Maximum number of products to scrape. Use 0 or leave empty for unlimited.

## `includeVariants` (type: `boolean`):

Include product variants (size, color, SKU) in the output. Each variant has its own price, availability, and options.

## `includeHtml` (type: `boolean`):

Include the raw HTML product description (`body_html`). When disabled, only plain text description is included.

## `proxyConfig` (type: `object`):

Most Shopify stores work without proxies. Enable Apify Proxy if you encounter rate limiting.
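If rate limiting does occur, the proxy object might be built as sketched below. `useApifyProxy` appears in this README's input example; `apifyProxyGroups` is the standard Apify proxy input field, but whether this Actor honors specific groups is an assumption. The helper name is hypothetical.

```python
from typing import Dict, List, Optional


def proxy_config(enabled: bool, groups: Optional[List[str]] = None) -> Dict:
    """Build a proxyConfig object for the Actor input."""
    config: Dict = {"useApifyProxy": enabled}
    if enabled and groups:
        # Example: ["RESIDENTIAL"] when datacenter proxies are not enough.
        config["apifyProxyGroups"] = groups
    return config
```

Starting with `proxy_config(True)` and moving to `proxy_config(True, ["RESIDENTIAL"])` only when blocks persist keeps proxy costs down.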

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://www.allbirds.com"
    }
  ],
  "maxItems": 100,
  "includeVariants": true,
  "includeHtml": false,
  "proxyConfig": {
    "useApifyProxy": false
  }
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://www.allbirds.com"
        }
    ],
    "proxyConfig": {
        "useApifyProxy": false
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("glassventures/shopify-product-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://www.allbirds.com" }],
    "proxyConfig": { "useApifyProxy": False },
}

# Run the Actor and wait for it to finish
run = client.actor("glassventures/shopify-product-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://www.allbirds.com"
    }
  ],
  "proxyConfig": {
    "useApifyProxy": false
  }
}' |
apify call glassventures/shopify-product-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=glassventures/shopify-product-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Shopify Product Scraper",
        "description": "Extract products, prices, variants, and images from any Shopify store. Uses JSON API for fastest extraction. No browser needed.",
        "version": "0.1",
        "x-build-id": "CST6savYDycwe8nwd"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/glassventures~shopify-product-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-glassventures-shopify-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/glassventures~shopify-product-scraper/runs": {
            "post": {
                "operationId": "runs-sync-glassventures-shopify-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/glassventures~shopify-product-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-glassventures-shopify-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "Shopify store URLs, product URLs, or collection URLs. Examples: https://www.allbirds.com, https://www.allbirds.com/collections/mens, https://www.allbirds.com/products/wool-runners",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of products to scrape. Use 0 or leave empty for unlimited.",
                        "default": 100
                    },
                    "includeVariants": {
                        "title": "Include Variants",
                        "type": "boolean",
                        "description": "Include product variants (size, color, SKU) in the output. Each variant has its own price, availability, and options.",
                        "default": true
                    },
                    "includeHtml": {
                        "title": "Include HTML Description",
                        "type": "boolean",
                        "description": "Include the raw HTML product description (body_html). When disabled, only plain text description is included.",
                        "default": false
                    },
                    "proxyConfig": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Most Shopify stores work without proxies. Enable Apify Proxy if you encounter rate limiting."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
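For projects without the Apify client, the `run-sync-get-dataset-items` path from the specification can be called with the Python standard library alone. The sketch below only builds the request. It sends the token as a Bearer header, as the cURL example in the README does, instead of the `token` query parameter listed in the specification. The function name is hypothetical.

```python
import json
import urllib.request
from typing import Dict

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "glassventures~shopify-product-scraper"


def build_sync_request(token: str, run_input: Dict) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns dataset items."""
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Sending it is then a matter of `urllib.request.urlopen(request)` and parsing the response body with `json.load`. Because the call blocks until the run finishes, the asynchronous `/runs` path may be a better fit for large stores.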
