# 🧩 Shopify Product Scraper - Apps Spy + Reviews (`kazkn/shopify-store-scraper`) Actor

Detect which apps any Shopify store has installed (Klaviyo, Recharge, Yotpo, Privy + 30 more). Plus full product catalog & reviews. No login. 5x cheaper.

- **URL**: https://apify.com/kazkn/shopify-store-scraper.md
- **Developed by:** [KazKN](https://apify.com/kazkn) (community)
- **Categories:** E-commerce, Lead generation, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

From $2.00 / 1,000 products extracted

This Actor is paid per event. You are not charged for the Apify platform usage, but only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# MacOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## 🧩 Shopify Scraper — Apps Spy + Products + Reviews

**Paste a [Shopify](https://www.shopify.com) store URL, click Run, and get the full tech stack + product catalog + reviews in seconds.**

🧩 **Shopify Scraper** is the only [Apify Actor](https://apify.com/store) that auto-detects **50+ apps installed on any Shopify store** ([Klaviyo](https://www.klaviyo.com), [Recharge](https://rechargepayments.com), [Yotpo](https://www.yotpo.com), [Judge.me](https://judge.me), [Privy](https://www.privy.com), [Gorgias](https://www.gorgias.com), [Algolia](https://www.algolia.com), [Okendo](https://www.okendo.com), [Iterable](https://iterable.com), [Bloomreach](https://www.bloomreach.com), and many more), combined with the **full product catalog** via the official [`/products.json`](https://shopify.dev/docs/api/ajax/section/product) endpoint and **reviews** from five supported review providers (Yotpo, Judge.me, Stamped, Okendo, Loox).

Built for B2B SaaS lead-gen, DTC competitive intel, and agency tech audits. **No login. No API key. No proxy required for most stores.**

---

### 🔍 What does this Shopify scraper do?

🧩 Shopify Scraper turns any Shopify URL into structured intelligence in **under 30 seconds**.

You simply:

* paste one or more Shopify URLs (custom domain or `*.myshopify.com`)
* pick an extraction level (Basic, Standard, Full, Pro)
* click Run
* export results as JSON, CSV, or Excel

The actor extracts:

* 🛍️ **Full product catalog** — title, handle, vendor, type, tags, description, prices, variants, images, inventory signals
* 🧩 **Detected apps** — [Klaviyo](https://www.klaviyo.com), [Recharge](https://rechargepayments.com), [Yotpo](https://www.yotpo.com), [Judge.me](https://judge.me), [Privy](https://www.privy.com), [Gorgias](https://www.gorgias.com), [Algolia](https://www.algolia.com), [Okendo](https://www.okendo.com), [Iterable](https://iterable.com), [Bloomreach](https://www.bloomreach.com), [Braze](https://www.braze.com), [Smile.io](https://smile.io), [LoyaltyLion](https://loyaltylion.com), [Weglot](https://www.weglot.com), [Rebuy](https://www.rebuyengine.com), [Shogun](https://getshogun.com), [PageFly](https://pagefly.io), [Hotjar](https://www.hotjar.com), [GTM](https://tagmanager.google.com), [Facebook Pixel](https://www.facebook.com/business/help/952192354843755), [TikTok Pixel](https://ads.tiktok.com), and 30+ more
* ⭐ **Reviews** from [Yotpo](https://www.yotpo.com), [Judge.me](https://judge.me), [Stamped](https://stamped.io), [Okendo](https://www.okendo.com), [Loox](https://loox.io) (Full mode)
* 💰 **Estimated revenue** from scarcity signals + review velocity (Pro mode)
* 🏷️ **Store metadata** — name, currency, country, myshopify domain, total products

---

### 🎯 Who is this Shopify scraper for?

* 🤖 **B2B SaaS founders** targeting Shopify merchants — enrich your outbound list with installed-apps data nobody else has
* 📊 **DTC operators and analysts** — snapshot competitor catalogs, watch what apps they install or churn, compare regional storefronts
* 🛍️ **Dropshippers** — find winning products in Pro mode using review-velocity revenue estimates
* 🏪 **Agencies** — deliver clean tech-stack audits to your Shopify clients in minutes instead of hours

---

### ⚡ Why use this Shopify scraper?

* 🧩 **Tech stack detection nobody else has** — the unique value vs the 8 other [Shopify scrapers on Apify Store](https://apify.com/store?search=shopify)
* 🌍 **Multi-region tested** — US, UK, EU, France, Italy, Spain, Germany all validated end-to-end
* 🚀 **Sub-30s per store** — scrapes 250 products + apps + reviews faster than any browser-based competitor
* 💰 **5x cheaper than the leader** — $0.002 per product vs $0.009 at the next paid Apify Shopify scraper
* 🛡️ **Zero browser** — pure HTTP + JSON, sub-300 MB memory tier, never breaks on JS rendering
* ✅ **Sitemap fallback** — if a store disables [`/products.json`](https://shopify.dev/docs/api/ajax/section/product), the actor automatically reads the sitemap
* 📦 **Real-time dataset** — products stream into your dataset as they're extracted, no batch wait
* 🔁 **Batch up to 100 stores** in a single run

> 💡 If you only need products without apps detection, use any commodity [Shopify scraper](https://apify.com/store?search=shopify). If you want the **stack intel**, this is the only actor that gives it to you.

---

### 🚀 How to scrape Shopify products in 3 steps

Setting up 🧩 Shopify Scraper takes **less than a minute**:

1. Open the actor on [Apify Console](https://console.apify.com/actors/kazkn~shopify-store-scraper) and paste one or more **Shopify store URLs** in the input field (custom domain like `https://allbirds.com` or any `*.myshopify.com`).
2. Pick an **Extraction Level** that matches what you need:
   * **Basic** — products only
   * **Standard** (default) — products + tech stack detection
   * **Full** — Standard + reviews from [Yotpo](https://www.yotpo.com), [Judge.me](https://judge.me), [Stamped](https://stamped.io), [Okendo](https://www.okendo.com), [Loox](https://loox.io)
   * **Pro** — Full + revenue estimation
3. Click **Run** — then download your dataset from the **Storage** tab as JSON, CSV, or Excel.

> 💡 **Pro tip:** filter by collection handle to scrape only `/collections/sneakers` or any specific catalog slice instead of the whole store.

#### Input example

```json
{
  "store_urls": [
    "https://allbirds.com",
    "https://glossier.com",
    "https://magicspoon.com"
  ],
  "extract_level": "full",
  "max_products_per_store": 250,
  "max_reviews_per_product": 20,
  "max_concurrent_stores": 3,
  "use_residential_proxy": false
}
````

> 💡 Set `use_residential_proxy: true` only when you scrape Cloudflare-protected stores like Tesla Shop or Manscaped. Datacenter proxy is enough for 98% of Shopify stores. Learn more about [Apify Proxy](https://docs.apify.com/platform/proxy).

***

### 🧩 Detected apps catalog (50+ apps, growing)

The detector identifies a Shopify store's installed tech stack by running a library of regex patterns against the HTML of the home page and one product page.

| Category | Detected apps |
|---|---|
| ⭐ **Reviews** | [Yotpo](https://www.yotpo.com) · [Judge.me](https://judge.me) · [Loox](https://loox.io) · [Stamped](https://stamped.io) · [Okendo](https://www.okendo.com) · [Reviews.io](https://www.reviews.io) · [Trustpilot](https://www.trustpilot.com) · Shopify legacy reviews |
| 📧 **Email marketing** | [Klaviyo](https://www.klaviyo.com) · [Mailchimp](https://mailchimp.com) · [Omnisend](https://www.omnisend.com) · [Drip](https://www.drip.com) · **[Iterable](https://iterable.com)** · **[Bloomreach Engagement](https://www.bloomreach.com/en/products/engagement)** · **[Braze](https://www.braze.com)** |
| 🎯 **Popups / opt-in** | [Privy](https://www.privy.com) · [Justuno](https://www.justuno.com) · [OptiMonk](https://www.optimonk.com) |
| 🔁 **Subscriptions** | [Recharge](https://rechargepayments.com) · [Bold Subscriptions](https://boldcommerce.com/subscriptions) · [Appstle](https://appstle.com) |
| 🎁 **Loyalty / rewards** | [Smile.io](https://smile.io) · [Yotpo Loyalty (Swell)](https://www.yotpo.com/platform/loyalty) · [LoyaltyLion](https://loyaltylion.com) |
| 💬 **Live chat / helpdesk** | [Gorgias](https://www.gorgias.com) · [Tidio](https://www.tidio.com) · [Intercom](https://www.intercom.com) · [Zendesk](https://www.zendesk.com) · [Drift](https://www.drift.com) |
| 🔎 **Search** | [Algolia](https://www.algolia.com) · [Searchanise](https://searchanise.io) · [Klevu](https://www.klevu.com) |
| 🎨 **Page builders** | [Shogun](https://getshogun.com) · [PageFly](https://pagefly.io) · [GemPages](https://gempages.net) · [Zipify](https://zipify.com) |
| 🌐 **Currency / i18n** | Currency Converter Plus · [Weglot](https://www.weglot.com) · [LangShop](https://langshop.app) |
| 🛒 **Upsell / cart** | [ReConvert](https://www.reconvert.io) · [Bold Upsell](https://boldcommerce.com/upsell-cross-sell) · In Cart Upsell · [Rebuy](https://www.rebuyengine.com) |
| 📈 **Analytics / pixels** | [Hotjar](https://www.hotjar.com) · [GTM](https://tagmanager.google.com) · [GA4](https://analytics.google.com) · [Facebook Pixel](https://www.facebook.com/business/help/952192354843755) · [TikTok Pixel](https://ads.tiktok.com) · [Pinterest Tag](https://help.pinterest.com/en/business/article/install-the-pinterest-tag) · [Snapchat Pixel](https://businesshelp.snapchat.com/s/article/pixel-direct-implementation) |

> 💡 Don't see an app you care about? The patterns library is open: open an issue with a link to a store using it and I'll add the regex within 24h.

***

### 💰 Pricing

🧩 Shopify Scraper uses **pay-per-event** pricing. You only pay for what you actually extract:

| Event | When it fires | Price |
|---|---|---|
| **Actor start** | Once per run | $0.05 |
| **Store analyzed** | Once per Shopify store with products | $0.005 |
| **Product extracted** | Per product row pushed | $0.002 |
| **Apps detected** | Per store at Standard or higher | $0.05 |
| **Review extracted** | Per review row pushed | $0.0005 |
| **Revenue estimated** | Per store at Pro level | $0.10 |

- No monthly subscription — pay only for what you use
- [Apify Free plan](https://apify.com/pricing) includes **$5/month of platform credits**, enough to scan roughly 45–50 stores in tech-stack mode at the per-store rate implied by the cost examples below
- Set `Max Total Charge USD` on every run to cap your spend

#### 💸 Real cost examples

- **Scan 100 Shopify stores for their tech stack** (B2B SaaS lead-gen) — about **$10.55**
- **Deep audit 5 competitors with reviews** (DTC competitive intel) — about **$5.95**
- **Pro-mode dropship research on 50 stores** — about **$10.05**

Compare to [PPSPY](https://www.ppspy.com) ($24/month), [Koala Inspector](https://koalainspector.com) ($9.99/month), [Charm.io](https://charm.io) ($299/month), [Shophunter](https://shophunter.io) ($99/month) — and none of them give you bulk API access or installed-apps data.

Learn more about [Apify pricing](https://apify.com/pricing).
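
The event prices in the table above can be turned into a rough cost estimator. The sketch below is an illustration, not the Actor's billing logic: the function name, the lowercase level names, and the rule that each per-store event fires exactly once are assumptions. The README does not state the catalog sizes behind its cost examples; assuming 25 products per store reproduces the $10.55 figure, and assuming 250 products per store with 5 reviews per product reproduces the $5.95 figure.

```python
from decimal import Decimal

# Event prices copied from the pricing table (USD).
PRICES = {
    "actor_start": Decimal("0.05"),
    "store_analyzed": Decimal("0.005"),
    "product_extracted": Decimal("0.002"),
    "apps_detected": Decimal("0.05"),
    "review_extracted": Decimal("0.0005"),
    "revenue_estimated": Decimal("0.10"),
}

# Assumed ordering: each level includes everything from the levels before it.
LEVELS = ("basic", "standard", "full", "pro")


def estimate_run_cost(stores, products_per_store, level="standard", reviews_per_product=0):
    """Estimate one run's cost in USD, assuming every store yields products."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    rank = LEVELS.index(level)
    products = stores * products_per_store

    total = PRICES["actor_start"]
    total += stores * PRICES["store_analyzed"]
    total += products * PRICES["product_extracted"]
    if rank >= 1:  # Standard and above: apps detection, once per store
        total += stores * PRICES["apps_detected"]
    if rank >= 2:  # Full and above: one event per review row
        total += products * reviews_per_product * PRICES["review_extracted"]
    if rank >= 3:  # Pro: revenue estimation, once per store
        total += stores * PRICES["revenue_estimated"]
    return total


print(estimate_run_cost(100, 25, "standard"))   # 100-store tech-stack scan
print(estimate_run_cost(5, 250, "full", 5))     # 5-store audit with reviews
```

`Decimal` is used instead of floats so that sub-cent prices such as $0.0005 add up exactly.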

***

### 📦 Output format

The actor writes to **four datasets** that you can export independently to JSON, CSV, Excel, [Google Sheets](https://apify.com/integrations/google-sheets), [Airtable](https://apify.com/integrations/airtable), [Slack](https://apify.com/integrations/slack), or any [Apify integration](https://apify.com/integrations).

#### Default dataset — products

```json
{
  "store_url": "https://allbirds.com",
  "store_domain": "allbirds.com",
  "store_platform": "shopify",
  "scraped_at": "2026-04-30T22:15:30Z",
  "extract_level": "standard",

  "product_id": 7894123456,
  "product_handle": "wool-runner-mizzles-natural-white",
  "product_title": "Wool Runner Mizzles - Natural White",
  "product_url": "https://allbirds.com/products/wool-runner-mizzles-natural-white",
  "product_type": "Sneakers",
  "vendor": "Allbirds",
  "tags": ["wool", "waterproof"],
  "description_html": "<p>...</p>",
  "description_text": "Plain text description",
  "created_at": "2024-09-15T10:00:00Z",
  "updated_at": "2026-04-28T14:00:00Z",

  "price": 135.0,
  "price_min": 135.0,
  "price_max": 135.0,
  "compare_at_price": null,
  "currency": "USD",
  "available": true,
  "available_variant_count": 8,
  "total_variant_count": 12,

  "images": ["https://cdn.shopify.com/...jpg"],
  "main_image": "https://cdn.shopify.com/...jpg",
  "image_count": 6,

  "store_meta": {
    "name": "Allbirds",
    "currency": "USD",
    "country": "US",
    "myshopify_domain": "allbirds-2.myshopify.com"
  },

  "apps_detected": {
    "reviews": ["yotpo"],
    "email_marketing": ["iterable"],
    "analytics": ["google_tag_manager"],
    "all_apps_raw": ["yotpo", "iterable", "google_tag_manager"],
    "detected_count": 3
  }
}
```

#### Named dataset `apps`

One row per scraped store with the full tech stack and diagnostic metadata:

```json
{
  "store_url": "https://magicspoon.com",
  "store_domain": "magicspoon.com",
  "myshopify_domain": "magicspoon-cereal.myshopify.com",
  "detected_at": "2026-04-30T22:15:30Z",
  "reviews": ["okendo"],
  "email_marketing": ["klaviyo"],
  "subscriptions": ["recharge"],
  "page_builder": ["shogun"],
  "upsell": ["rebuy"],
  "analytics": ["google_tag_manager"],
  "all_apps_raw": ["okendo", "klaviyo", "recharge", "shogun", "rebuy", "google_tag_manager"],
  "detected_count": 6
}
```
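
For lead-gen, rows shaped like the example above can be filtered and counted with a few lines of Python. This is a minimal sketch that assumes the rows have already been exported as a list of dicts; the helper names are hypothetical, and only the `store_domain` and `all_apps_raw` fields from the example are used.

```python
from collections import Counter


def stores_using(rows, app):
    """Return the domains of stores whose detected stack includes `app`."""
    return [row["store_domain"] for row in rows if app in row.get("all_apps_raw", [])]


def app_frequency(rows):
    """Count how many stores each app was detected on."""
    counts = Counter()
    for row in rows:
        # set() guards against an app being listed twice for the same store.
        counts.update(set(row.get("all_apps_raw", [])))
    return counts


rows = [
    {"store_domain": "magicspoon.com", "all_apps_raw": ["okendo", "klaviyo", "recharge"]},
    {"store_domain": "example-store.com", "all_apps_raw": ["klaviyo", "yotpo"]},
]
print(stores_using(rows, "recharge"))
print(app_frequency(rows).most_common(1))
```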

#### Named dataset `reviews`

One row per review with normalized provider-agnostic fields:

```json
{
  "store_domain": "glossier.com",
  "product_id": 9576351138037,
  "product_handle": "spring-pinks",
  "review_id": "yotpo_12345",
  "provider": "yotpo",
  "rating": 5,
  "title": "Best lip balm ever",
  "body": "I wear them every day...",
  "author_name": "Sarah J.",
  "author_verified": true,
  "created_at": "2026-03-12T08:00:00Z",
  "helpful_count": 14,
  "images": ["https://cdn.yotpo.com/..."]
}
```
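
Because the review rows are provider-agnostic, they can be summarized per product without any provider-specific handling. The sketch below assumes the rows are available as a list of dicts with the `product_handle` and `rating` fields shown above; the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def rating_summary(reviews):
    """Group review rows by product handle and report count and mean rating."""
    ratings_by_product = defaultdict(list)
    for review in reviews:
        ratings_by_product[review["product_handle"]].append(review["rating"])
    return {
        handle: {"count": len(ratings), "avg_rating": round(mean(ratings), 2)}
        for handle, ratings in ratings_by_product.items()
    }


reviews = [
    {"product_handle": "spring-pinks", "rating": 5},
    {"product_handle": "spring-pinks", "rating": 4},
]
print(rating_summary(reviews))
```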

#### Named dataset `revenue` (Pro mode placeholder, full impl in v2)

***

### 💡 Tips for scraping Shopify efficiently

- 📈 **Batch multiple stores** in a single run — the actor runs them in parallel
- 🌐 **Multi-region brands** — pass `https://allbirds.com`, `https://allbirds.eu`, `https://allbirds.co.uk` to compare regional storefronts (different stacks, different currencies)
- 🛡️ **Hit a 403?** Toggle `Use Residential Proxy` for Cloudflare-protected stores ([Apify Proxy docs](https://docs.apify.com/platform/proxy))
- 🔄 **Schedule weekly runs** with [Apify Scheduler](https://apify.com/schedules) to track product changes and price drops over time
- 🪝 **Push to Sheets, Slack, Airtable** via [Apify integrations](https://apify.com/integrations) — no glue code needed
- 🤖 **Run from any AI agent** ([Claude](https://claude.ai), [Cursor](https://cursor.com), [Windsurf](https://codeium.com/windsurf)) using the [Apify MCP server](https://apify.com/apify/actors-mcp-server)

***

### 🌍 Supported Shopify markets and stores

🧩 Shopify Scraper works on **any [Shopify](https://www.shopify.com) storefront worldwide** — custom domain, `*.myshopify.com`, regional subdomains, [Shopify Plus](https://www.shopify.com/plus).

✅ Empirically tested on stores from these countries before publishing:

🇺🇸 United States · 🇬🇧 United Kingdom · 🇪🇺 European Union · 🇫🇷 France · 🇩🇪 Germany · 🇮🇹 Italy · 🇪🇸 Spain

The endpoint we use ([`/products.json`](https://shopify.dev/docs/api/ajax/section/product)) is a public, version-less Shopify spec that has been stable since 2014. **Country, currency, and language do not affect detection** — the actor handles every Shopify storefront identically.

***

### 🧭 When to use Shopify Scraper vs alternatives

Use **🧩 Shopify Scraper** if you want:

- The only [Apify Shopify scraper](https://apify.com/store?search=shopify) with **installed-apps detection**
- Catalog + apps + reviews in one run
- Sub-30s per store, batchable up to 100
- Pay-per-event with no subscription

Use **[PPSPY](https://www.ppspy.com)** ($24/month) or **[Charm.io](https://charm.io)** ($299/month) if you prefer:

- A polished web UI without API access
- Flat monthly pricing instead of usage-based

Use **a generic [web scraper](https://apify.com/apify/web-scraper)** if you only need:

- Single-product scraping with custom selectors
- No Shopify-specific data model

> 💡 The actor is **5× cheaper than the next paid Apify Shopify scraper** at $0.002 per product, and the apps detection is unique on the platform. None of the 8 other [Shopify scrapers on Apify Store](https://apify.com/store?search=shopify) offer it.

***

### 🔗 Part of the KazKN ecosystem

This actor is part of the [KazKN](https://apify.com/kazkn) family of scrapers and MCP servers on Apify:

- [🧩 Shopify Scraper — Apps Spy + Products + Reviews](https://apify.com/kazkn/shopify-store-scraper) — this actor
- [Vinted Smart Scraper — Cross-Country Price Comparison](https://apify.com/kazkn/vinted-smart-scraper) — full Vinted intelligence across 26 markets
- [⚡ Vinted Turbo Scraper](https://apify.com/kazkn/vinted-turbo-scraper) — fastest Vinted URL-to-dataset workflow
- [Vinted MCP Server](https://apify.com/kazkn/vinted-mcp-server) — connect any AI agent ([Claude](https://claude.ai), [Cursor](https://cursor.com), [Windsurf](https://codeium.com/windsurf)) to Vinted data
- [App Store Scraper for Localization Gaps](https://apify.com/kazkn/apple-app-store-localization-scraper) — find US apps missing French, German, Spanish localizations
- [GPT Crawler MCP](https://apify.com/kazkn/gpt-crawler-mcp) — turn any website into a clean knowledge file for [ChatGPT](https://chatgpt.com), [Claude](https://claude.ai), or [RAG](https://www.anthropic.com/news/contextual-retrieval) pipelines

***

### ❓ Frequently asked questions

#### Is scraping Shopify legal?

Yes. The [`/products.json`](https://shopify.dev/docs/api/ajax/section/product) endpoint is a public, official [Shopify](https://www.shopify.com) feature documented by Shopify and consumed by [Google Shopping](https://shopping.google.com) for indexing. The actor does not bypass authentication, does not interact with Shopify's Admin API, and does not collect any buyer personal data. Read more on the [Apify legal blog](https://blog.apify.com/is-web-scraping-legal/).

#### Can you scrape Shopify products without an API key?

Yes. The actor uses only the public [storefront `/products.json` endpoint](https://shopify.dev/docs/api/ajax/section/product) — no app install, no [OAuth](https://shopify.dev/docs/apps/auth/oauth), no API key.

#### Does Shopify have an API?

Several. The two relevant here are the [Admin API](https://shopify.dev/docs/api/admin) (requires OAuth and a per-store install; not used here) and the public [storefront `/products.json` endpoint](https://shopify.dev/docs/api/ajax/section/product) (used here).

#### How many products can I extract per store?

Up to 100,000 per store, paginated 250 at a time. Realistic Shopify catalogs have 50–10,000 products.
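
The pagination described above can be sketched as a generator. This is an illustration under stated assumptions, not the Actor's source: it assumes the endpoint accepts `limit` and `page` query parameters and returns a JSON object with a `products` array, and the function names and the injectable `fetch` argument are hypothetical. A cap of `0` means unlimited, mirroring the `max_products_per_store` input.

```python
import json
from urllib.request import Request, urlopen

PAGE_SIZE = 250  # page size stated in the answer above


def fetch_json(url):
    """Default fetcher: GET a URL and decode the JSON body."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        return json.load(response)


def iter_products(store_url, max_products=0, fetch=fetch_json):
    """Yield products page by page until a page is empty or the cap is reached."""
    page, seen = 1, 0
    base = store_url.rstrip("/")
    while True:
        url = f"{base}/products.json?limit={PAGE_SIZE}&page={page}"
        products = fetch(url).get("products", [])
        if not products:
            return  # an empty page marks the end of the catalog
        for product in products:
            yield product
            seen += 1
            if max_products and seen >= max_products:
                return
        page += 1
```

Passing `fetch` as an argument keeps the paging logic testable without network access, for example `list(iter_products("https://allbirds.com", max_products=10))` for a live call or a stub function in tests.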

#### What if a store disables `/products.json`?

The actor automatically falls back to `sitemap_products_1.xml` and pulls each product via `/products/{handle}.json`. About 2% of [Shopify Plus](https://www.shopify.com/plus) stores disable both; they show up as a graceful "no products fetched" record with diagnostic info.
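
The sitemap step can be illustrated with the standard library alone. This is a minimal sketch that assumes the product sitemap follows the standard sitemaps.org schema with one `<loc>` per product page; the function name is hypothetical and the Actor's own parsing may differ.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def product_json_urls(sitemap_xml):
    """Turn the <loc> entries of a product sitemap into /products/{handle}.json URLs."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        parsed = urlparse((loc.text or "").strip())
        parts = [part for part in parsed.path.split("/") if part]
        # Keep only product pages; a locale prefix such as /en-gb/ is dropped.
        if len(parts) >= 2 and parts[-2] == "products":
            urls.append(f"{parsed.scheme}://{parsed.netloc}/products/{parts[-1]}.json")
    return urls
```

Entries that are not product pages, such as the store home page, are skipped rather than treated as errors.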

#### Does the actor work on Shopify Plus stores?

Yes — most of the validation set is [Shopify Plus](https://www.shopify.com/plus) (Allbirds, Magic Spoon, Glossier, Kith, Princess Polly, Pinko, Aspesi, PDPaola).

#### How does the apps detector work?

It fetches the home page HTML and one product page HTML, then runs a library of regex patterns against the combined markup. Detection signals include CDN URLs, JavaScript globals, custom HTML attributes, and inline configuration objects.
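
A regex-based detector of this kind can be sketched in a few lines. The signatures below are hypothetical examples chosen for illustration; the Actor's real pattern library is not shown in this README, and the function name is an assumption. The output mirrors the `apps_detected` shape from the output examples.

```python
import re

# Hypothetical signatures: app id -> (category, pattern). Not the Actor's real library.
PATTERNS = {
    "klaviyo": ("email_marketing", re.compile(r"static\.klaviyo\.com|klaviyo\.js", re.I)),
    "yotpo": ("reviews", re.compile(r"staticw2\.yotpo\.com|yotpo-widget", re.I)),
    "recharge": ("subscriptions", re.compile(r"rechargecdn\.com|ReChargeWidget", re.I)),
    "google_tag_manager": ("analytics", re.compile(r"googletagmanager\.com/gtm\.js", re.I)),
}


def detect_apps(*html_pages):
    """Run every pattern against the combined markup and group hits by category."""
    markup = "\n".join(html_pages)
    result = {"all_apps_raw": []}
    for app, (category, pattern) in PATTERNS.items():
        if pattern.search(markup):
            result.setdefault(category, []).append(app)
            result["all_apps_raw"].append(app)
    result["detected_count"] = len(result["all_apps_raw"])
    return result


home = '<script src="https://static.klaviyo.com/onsite/js/klaviyo.js"></script>'
product = '<div class="yotpo-widget-instance"></div>'
print(detect_apps(home, product))
```

Because the check only sees markup present in the fetched HTML, apps injected later by client-side JavaScript would be missed, which matches the recall caveat for lazy-loading and headless stores.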

#### How accurate is the apps detection?

Recall is highest for apps that ship visible front-end widgets ([Yotpo](https://www.yotpo.com), [Judge.me](https://judge.me), [Recharge](https://rechargepayments.com), [Klaviyo](https://www.klaviyo.com), [Privy](https://www.privy.com), [Gorgias](https://www.gorgias.com), [Algolia](https://www.algolia.com), [Okendo](https://www.okendo.com)). It is lower on stores that lazy-load apps via SPAs or use a heavily customized headless theme (e.g. Gymshark, Bombas). The detector reports only what it finds in the markup and never fabricates apps.

#### What review providers are supported?

[Yotpo](https://www.yotpo.com), [Judge.me](https://judge.me), [Stamped](https://stamped.io), [Okendo](https://www.okendo.com), and [Loox](https://loox.io). The crawler picks the first provider the apps detector flagged for the store, or falls through the list until one returns reviews.
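
One reading of this selection rule is "try the providers the detector flagged first, then the remaining ones in list order, and stop at the first that returns reviews". The sketch below implements that reading only. The function name, the `fetchers` mapping of provider id to a zero-argument callable, and the ids `judge_me`, `stamped`, and `loox` are assumptions (`yotpo` and `okendo` appear in the output examples).

```python
# Provider order as listed above; some ids are assumed spellings.
PROVIDER_ORDER = ["yotpo", "judge_me", "stamped", "okendo", "loox"]


def fetch_reviews(detected, fetchers, order=PROVIDER_ORDER):
    """Try detector-flagged providers first, then the rest, until one returns reviews."""
    flagged = [p for p in order if p in detected]
    remaining = [p for p in order if p not in detected]
    for provider in flagged + remaining:
        fetcher = fetchers.get(provider)
        if fetcher is None:
            continue  # no fetcher registered for this provider
        reviews = fetcher()
        if reviews:
            return provider, reviews
    return None, []


fetchers = {
    "yotpo": lambda: [],
    "okendo": lambda: [{"rating": 5, "body": "Great"}],
}
print(fetch_reviews(["okendo"], fetchers))
```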

#### Can I scrape multi-region brands?

Yes. Pass each storefront URL separately (e.g. `https://allbirds.com`, `https://allbirds.eu`, `https://allbirds.co.uk`) and the actor reports the local currency in `store_meta` and treats each as an independent run.

#### What about Cloudflare-protected stores?

Some stores ([Tesla Shop](https://shop.tesla.com), [Manscaped](https://www.manscaped.com)) front their storefront with [Cloudflare](https://www.cloudflare.com). Toggle **Use Residential Proxy** on those — the actor routes through [Apify's residential pool](https://apify.com/proxy/residential).

#### Can I filter by collection?

Yes — set `Filter by Collection` to a collection handle (e.g. `sneakers`, `new-arrivals`) and the actor restricts the scrape to that collection only.

#### Can I run this on a schedule?

Yes — schedule runs in [Apify Scheduler](https://apify.com/schedules) with any cadence (hourly, daily, weekly). Export to webhook, [Google Sheets](https://apify.com/integrations/google-sheets), [S3](https://apify.com/integrations/aws-s3), [BigQuery](https://apify.com/integrations/google-bigquery), or any [Apify integration](https://apify.com/integrations).

#### Is data validated before being pushed?

Yes — every product, apps record, and review is validated against a [Zod](https://zod.dev) schema before being pushed. Invalid rows are logged and skipped, never stored.

#### What's the typical run time?

Standard level: 5–10 seconds per store when the catalog fits in a single `/products.json` page (up to 250 products), parallelized across stores. In cloud benchmarks, 20 stores at Standard level finish in ~45 seconds.

#### What's the memory footprint?

256–1024 MB tier. Real-world use is sub-300 MB even for 1,000-product stores. See [Apify resource pricing](https://apify.com/pricing).

#### What if the URL isn't a Shopify store?

The actor returns a diagnostic record with `error: "not_a_shopify_store"`, the signals it checked, and the HTTP status code. **You are never charged for non-Shopify URLs**.

#### How is this different from a generic scraper?

Generic scrapers like [Apify Web Scraper](https://apify.com/apify/web-scraper) extract one product at a time, do not understand Shopify's variants/collections/inventory model, and can't detect installed apps. This actor is purpose-built for Shopify with regex patterns tuned for the apps detection use case.

#### Can I use this from an AI agent?

Yes — connect any [MCP-compatible AI agent](https://modelcontextprotocol.io) ([Claude](https://claude.ai), [Cursor](https://cursor.com), [Windsurf](https://codeium.com/windsurf)) via the [Apify MCP server](https://apify.com/apify/actors-mcp-server). The agent can call this actor as a tool.

#### Can you add a specific app to the detector?

Yes — the patterns library is regex-based. Open an issue with a link to a store using the app and I'll add the regex.

***

### ⚖️ Is it legal to scrape Shopify?

The [`/products.json`](https://shopify.dev/docs/api/ajax/section/product) endpoint is public and documented by [Shopify](https://www.shopify.com). The actor does not bypass authentication, respects rate limits, and does not collect buyer personal data.

Note that personal data is protected by [GDPR](https://gdpr.eu) in the EU and other regulations worldwide. Do not scrape personal data unless you have a legitimate reason. If unsure, consult your lawyers. Suggested reading: [Is web scraping legal? — Apify Blog](https://blog.apify.com/is-web-scraping-legal/).

# Actor input Schema

## `store_urls` (type: `array`):

List of Shopify store URLs to scrape (e.g. https://allbirds.com). Works with any custom domain or \*.myshopify.com URL.

## `extract_level` (type: `string`):

Choose what to extract:

- Basic: products only
- Standard: products + installed apps detection
- Full: + reviews from Yotpo / Judge.me / Stamped / Okendo / Loox
- Pro: + estimated revenue (advanced)

## `max_products_per_store` (type: `integer`):

Cap the number of products extracted per store. Set to 0 for unlimited.

## `products_collection_filter` (type: `string`):

Only extract products from a specific collection handle (e.g. 'sneakers', 'new-arrivals'). Leave empty for full catalog.

## `include_variants` (type: `boolean`):

Output one record per variant (size/color/SKU) instead of one per product.

## `max_reviews_per_product` (type: `integer`):

How many reviews to extract per product when extract\_level=full or pro.

## `max_concurrent_stores` (type: `integer`):

Number of stores scraped in parallel. Higher = faster but more memory.

## `max_concurrent_requests` (type: `integer`):

How many parallel HTTP calls per store. Increase for faster runs, decrease if you hit rate-limits.

## `use_residential_proxy` (type: `boolean`):

Default OFF. Datacenter proxy is sufficient for /products.json on most stores. Enable it if Cloudflare-protected stores return 403, or if you scrape >500 stores and hit rate limits.

## `proxyConfiguration` (type: `object`):

Apify proxy configuration. Default: datacenter auto (no group). Override here or use the 'use\_residential\_proxy' switch above.

## `debug_mode` (type: `boolean`):

Enable verbose logging and save raw HTML snapshots to KV store on errors.

## Actor input object example

```json
{
  "store_urls": [
    "https://allbirds.com"
  ],
  "extract_level": "standard",
  "max_products_per_store": 250,
  "products_collection_filter": "",
  "include_variants": false,
  "max_reviews_per_product": 20,
  "max_concurrent_stores": 3,
  "max_concurrent_requests": 5,
  "use_residential_proxy": false,
  "proxyConfiguration": {
    "useApifyProxy": true
  },
  "debug_mode": false
}
```
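
When the input is assembled in code, a small helper can catch obvious mistakes before a run is started. This is a minimal sketch with assumptions: the helper name is hypothetical, the `basic` and `pro` spellings are inferred from the `standard` and `full` values shown in this document, and the 100-store cap comes from the batch limit quoted in the README rather than from the schema.

```python
ALLOWED_LEVELS = {"basic", "standard", "full", "pro"}  # "basic"/"pro" spellings assumed
MAX_STORES_PER_RUN = 100  # batch limit quoted in the README


def build_input(store_urls, extract_level="standard", max_products_per_store=250, **overrides):
    """Build a run input dict with light client-side checks."""
    urls = [url.strip() for url in store_urls]
    if not urls or len(urls) > MAX_STORES_PER_RUN:
        raise ValueError(f"expected 1 to {MAX_STORES_PER_RUN} store URLs, got {len(urls)}")
    bad = [url for url in urls if not url.startswith(("http://", "https://"))]
    if bad:
        raise ValueError(f"URLs must include a scheme: {bad}")
    if extract_level not in ALLOWED_LEVELS:
        raise ValueError(f"unknown extract_level: {extract_level!r}")
    if max_products_per_store < 0:
        raise ValueError("max_products_per_store must be 0 (unlimited) or positive")

    run_input = {
        "store_urls": urls,
        "extract_level": extract_level,
        "max_products_per_store": max_products_per_store,
    }
    run_input.update(overrides)  # e.g. max_reviews_per_product, use_residential_proxy
    return run_input


print(build_input(["https://allbirds.com"], "full", max_reviews_per_product=20))
```

The resulting dict can be passed as `run_input` to the Python client call shown in the API section.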

# Actor output Schema

## `products` (type: `string`):

Default dataset with one row per Shopify product, including apps\_detected stack, store metadata, prices, variants and images.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "store_urls": [
        "https://allbirds.com"
    ],
    "proxyConfiguration": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("kazkn/shopify-store-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "store_urls": ["https://allbirds.com"],
    "proxyConfiguration": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("kazkn/shopify-store-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "store_urls": [
    "https://allbirds.com"
  ],
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}' |
apify call kazkn/shopify-store-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=kazkn/shopify-store-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "🧩 Shopify Product Scraper - Apps Spy + Reviews",
        "description": "Detect which apps any Shopify store has installed (Klaviyo, Recharge, Yotpo, Privy + 30 more). Plus full product catalog & reviews. No login. 5x cheaper.",
        "version": "1.8",
        "x-build-id": "xh7Go6xnifpmlRfb3"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/kazkn~shopify-store-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-kazkn-shopify-store-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/kazkn~shopify-store-scraper/runs": {
            "post": {
                "operationId": "runs-sync-kazkn-shopify-store-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes the Actor and returns information about the initiated run in the response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/kazkn~shopify-store-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-kazkn-shopify-store-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes the Actor, waits for it to finish, and returns the OUTPUT record from the key-value store in the response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "store_urls"
                ],
                "properties": {
                    "store_urls": {
                        "title": "Shopify Store URLs",
                        "minItems": 1,
                        "maxItems": 100,
                        "type": "array",
                        "description": "List of Shopify store URLs to scrape (e.g. https://allbirds.com). Works with any custom domain or *.myshopify.com URL.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "extract_level": {
                        "title": "Extraction Level",
                        "enum": [
                            "basic",
                            "standard",
                            "full",
                            "pro"
                        ],
                        "type": "string",
                        "description": "Choose what to extract:\n- Basic: products only\n- Standard: products + installed apps detection\n- Full: + reviews from Yotpo / Judge.me / Loox / Stamped\n- Pro: + estimated revenue (advanced)",
                        "default": "standard"
                    },
                    "max_products_per_store": {
                        "title": "Max Products Per Store",
                        "minimum": 0,
                        "maximum": 100000,
                        "type": "integer",
                        "description": "Cap the number of products extracted per store. Set to 0 for unlimited.",
                        "default": 250
                    },
                    "products_collection_filter": {
                        "title": "Filter by Collection (optional)",
                        "type": "string",
                        "description": "Only extract products from a specific collection handle (e.g. 'sneakers', 'new-arrivals'). Leave empty for full catalog.",
                        "default": ""
                    },
                    "include_variants": {
                        "title": "Include Product Variants",
                        "type": "boolean",
                        "description": "Output one record per variant (size/color/SKU) instead of one per product.",
                        "default": false
                    },
                    "max_reviews_per_product": {
                        "title": "Max Reviews Per Product (Full/Pro level only)",
                        "minimum": 1,
                        "maximum": 500,
                        "type": "integer",
                        "description": "How many reviews to extract per product when extract_level=full or pro.",
                        "default": 20
                    },
                    "max_concurrent_stores": {
                        "title": "Concurrent Stores (multi-store mode)",
                        "minimum": 1,
                        "maximum": 10,
                        "type": "integer",
                        "description": "Number of stores scraped in parallel. Higher values run faster but use more memory.",
                        "default": 3
                    },
                    "max_concurrent_requests": {
                        "title": "Concurrent Requests Per Store",
                        "minimum": 1,
                        "maximum": 20,
                        "type": "integer",
                        "description": "Number of parallel HTTP requests per store. Increase for faster runs; decrease if you hit rate limits.",
                        "default": 5
                    },
                    "use_residential_proxy": {
                        "title": "Use Residential Proxy (advanced)",
                        "type": "boolean",
                        "description": "Default OFF - datacenter proxy is sufficient for /products.json. Enable only if you scrape >500 stores and hit rate-limits.",
                        "default": false
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Apify proxy configuration. Default: datacenter auto (no group). Override here or use the 'use_residential_proxy' switch above."
                    },
                    "debug_mode": {
                        "title": "Debug Mode",
                        "type": "boolean",
                        "description": "Enable verbose logging and save raw HTML snapshots to KV store on errors.",
                        "default": false
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "number",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
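
As a rough illustration of how the specification above maps onto an HTTP call, here is a minimal Python sketch that targets the `run-sync-get-dataset-items` path using only the standard library. The helper names (`build_input`, `build_request`, `run_actor`) are hypothetical and not part of any Apify library. The validation mirrors only a few of the limits in `inputSchema`, and the response is assumed to be a JSON body, since the specification documents the `200` response only as "OK".

```python
import json
import urllib.parse
import urllib.request

# Server URL and path copied from the OpenAPI specification above.
API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/kazkn~shopify-store-scraper/run-sync-get-dataset-items"
EXTRACT_LEVELS = {"basic", "standard", "full", "pro"}


def build_input(store_urls, extract_level="standard", max_products_per_store=250):
    """Check a few inputSchema limits and return the request body as a dict."""
    if not 1 <= len(store_urls) <= 100:
        raise ValueError("store_urls must contain between 1 and 100 URLs")
    if extract_level not in EXTRACT_LEVELS:
        raise ValueError(f"extract_level must be one of {sorted(EXTRACT_LEVELS)}")
    if not 0 <= max_products_per_store <= 100000:
        raise ValueError("max_products_per_store must be between 0 and 100000")
    return {
        "store_urls": list(store_urls),
        "extract_level": extract_level,
        "max_products_per_store": max_products_per_store,
    }


def build_request(token, actor_input):
    """Build (but do not send) the POST request; the token goes in the query string."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token, actor_input, timeout=300):
    """Send the request and parse the body as JSON (assumed format; network call)."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_actor("<YOUR_API_TOKEN>", build_input(["https://allbirds.com"]))` would start a billable run and block until it finishes, so for long scrapes the asynchronous `/runs` path may be the better fit. Building the request separately from sending it keeps the input checks testable without network access.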
