Shopify Store Intelligence — Catalog + Pricing
Pricing
$1.00 / 1,000 products extracted
Snapshot any Shopify storefront's full public catalog. One row per product with title, vendor, type, tags, all variants (SKU, price, compare_at_price, available), images. Skips non-Shopify domains gracefully. Source: /products.json + /collections.json (public, no auth).
Shopify Store Intelligence — Catalog + Pricing Snapshot
Pull the full public catalog of any Shopify-powered storefront — all products,
variants, prices, SKUs, vendor portfolio, and stock signals — in one run. No
login, no HTML scraping, no headless browser. Just the open /products.json
endpoint that Shopify exposes for every store by design.
What you get
One row per product (default) — or one row per variant (SKU) if you'd rather join in a spreadsheet. Every row carries:
- Identifiers: productId, handle, url, storeDomain
- Catalog metadata: title, vendor, productType, tags, imageCount
- Pricing signal: minPrice, maxPrice, anyOnSale, per-variant price + compareAtPrice
- Stock signal: variantsInStock / variantsTotal, per-variant available
- Timestamps: publishedAt, createdAt, updatedAt
Plus a per-store summary in OUTPUT with vendor distribution (top 10), product
types breakdown, products on sale / in stock, and an optional /collections.json
snapshot.
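As a rough illustration of how the per-product signals listed above could be derived, here is a minimal sketch. It assumes the raw shape that /products.json usually returns (snake_case keys, prices as strings, a nullable compare_at_price); the function name summarize_product and the exact mapping to output fields are hypothetical, not the actor's actual code.

```python
from decimal import Decimal
from typing import Any


def summarize_product(product: dict[str, Any], store_domain: str) -> dict[str, Any]:
    """Collapse one raw /products.json product into a flat summary row (sketch)."""
    variants = product.get("variants", [])
    # Prices arrive as strings such as "20.00"; Decimal avoids float rounding.
    prices = [Decimal(v["price"]) for v in variants if v.get("price")]

    def on_sale(variant: dict[str, Any]) -> bool:
        # Assumption: "on sale" means compare_at_price is set and above price.
        price, compare = variant.get("price"), variant.get("compare_at_price")
        if not price or not compare:
            return False
        return Decimal(compare) > Decimal(price)

    return {
        "productId": product["id"],
        "handle": product["handle"],
        "url": f"https://{store_domain}/products/{product['handle']}",
        "storeDomain": store_domain,
        "title": product.get("title"),
        "vendor": product.get("vendor"),
        "productType": product.get("product_type"),
        "imageCount": len(product.get("images", [])),
        "minPrice": min(prices) if prices else None,
        "maxPrice": max(prices) if prices else None,
        "anyOnSale": any(on_sale(v) for v in variants),
        "variantsInStock": sum(1 for v in variants if v.get("available")),
        "variantsTotal": len(variants),
    }
```

Feeding it a product with one discounted, in-stock variant and one full-price, sold-out variant would give anyOnSale true and variantsInStock 1 of variantsTotal 2.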
Why this matters
Shopify powers >4 million live stores. Each one of them silently publishes its
full product catalog at /products.json, paginated 250 at a time. Most growth
teams don't know this — and the ones that do are stitching together one-off
scripts. This actor turns that into a single, reliable, paginated, billable
data source.
Built for:
- Competitive intelligence — track a competitor's catalog every week: what's new, what dropped, what went on sale.
- DTC / dropshipping research — discover niches by surveying which vendors are inside which collections across a list of stores.
- Pricing surveillance — compare your prices vs. a curated panel of peers.
- M&A / investor due diligence — snapshot a target's full SKU count, vendor mix, and pricing distribution from a single run.
Input
| Field | Required | Default | Description |
|---|---|---|---|
| stores | yes | none | List of domains or URLs. "gymshark.com", "https://allbirds.com", "shop.example.com" are all valid. |
| maxProductsPerStore | no | 5000 | Hard cap per store. Shopify itself enforces a page index ceiling of ~150 (true max ≈ 37 500 products / store). |
| includeCollections | no | true | Also pull /collections.json into the per-store summary. |
| rowMode | no | per_product | per_product (one row, all variants nested) or per_variant (one row per SKU). |
| includeBodyHtml | no | false | Include the long-form HTML product description. Skip unless you need it; it adds 2–10 KB / row. |
| detectStack | no | false | Fetch the store's homepage and detect installed apps + tech stack. Adds installedApps and techStack to the per-store summary. One extra HTTP request per store, no billing impact. |
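Since stores accepts bare domains and full URLs alike, each entry presumably gets reduced to a hostname before any request is made. The sketch below shows one way that could work using only the standard library; normalize_store is a hypothetical helper, and the actor's real normalization rules may differ.

```python
from urllib.parse import urlsplit


def normalize_store(entry: str) -> str:
    """Return the bare lowercase hostname for a domain or URL (sketch)."""
    text = entry.strip()
    if "://" not in text:
        # urlsplit only recognises a host when a scheme is present.
        text = "https://" + text
    host = urlsplit(text).hostname
    if not host:
        raise ValueError(f"not a domain or URL: {entry!r}")
    return host


stores = ["gymshark.com", "https://allbirds.com", "shop.example.com"]
domains = [normalize_store(s) for s in stores]
```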
Output
Dataset — one row per product (or per variant).
Key-value store — OUTPUT with per-store summaries and the billing ledger
summary; BILLING_LOG with the full audit trail of every charge call.
Pricing
PAY_PER_EVENT, $0.001 per product extracted (product_extracted). A
5 000-product store costs about $5. The per_variant mode carries no extra
charge: billing is per product, not per variant.
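To budget a run ahead of time, the arithmetic can be sketched as below. The rate comes from this section; the assumption that products beyond maxProductsPerStore are never extracted, and therefore never billed, is ours, and estimate_cost is a hypothetical helper.

```python
from decimal import Decimal

PRICE_PER_PRODUCT = Decimal("0.001")  # USD per product_extracted event


def estimate_cost(product_counts: list[int], max_products_per_store: int = 5000) -> Decimal:
    """Estimate the charge in USD for a list of per-store product counts."""
    # Assumption: the per-store cap limits extraction, so it also limits billing.
    billable = sum(min(count, max_products_per_store) for count in product_counts)
    return billable * PRICE_PER_PRODUCT


single_store = estimate_cost([5000])    # one store with 5 000 products
panel = estimate_cost([12000, 300])     # first store is capped at 5 000
```

Variant counts do not appear in the calculation because the charge is per product.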
v0.2 — what's new
Stack detection. Set detectStack: true to fetch the store's homepage
once and surface the installed apps + tech stack in the per-store summary:

    "installedApps": ["klaviyo", "recharge", "loox", "gorgias", "meta_pixel"],
    "techStack": {
      "checkout": ["shopify"],
      "subscriptions": ["recharge"],
      "reviews": ["loox"],
      "email": ["klaviyo"],
      "chat": ["gorgias"],
      "pixels": ["meta_pixel", "tiktok_pixel"],
      "analytics": ["ga4"]
    }
Tells covered: Recharge / Ordergroove / Appstle / Loop (subscriptions); Loox / Judge.me / Yotpo / Stamped / Okendo / Reviews.io (reviews); Klaviyo / Privy / Justuno / Omnisend / Mailchimp (email); Gorgias / Intercom / Tidio / Crisp / Zendesk / Drift (chat); Smile.io / Refersion (loyalty); GA4 / Northbeam / TripleWhale (analytics); Meta / TikTok / Pinterest (pixels); Shogun / PageFly / GemPages (page builders); Rebuy (upsell).
Adds one extra GET request per store. No billing impact — billing
still charges per product_extracted.
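Detection of this kind is commonly done by searching the homepage HTML for strings that each app tends to leave behind. The sketch below illustrates the idea only: the SIGNATURES table is a small set of illustrative guesses, not the actor's actual tells, and detect_stack is a hypothetical name.

```python
# Illustrative guesses: app name -> (category, substring to look for in the HTML).
SIGNATURES: dict[str, tuple[str, str]] = {
    "klaviyo": ("email", "klaviyo.com"),
    "recharge": ("subscriptions", "rechargepayments.com"),
    "loox": ("reviews", "loox.io"),
    "gorgias": ("chat", "gorgias.chat"),
}


def detect_stack(html: str) -> dict[str, object]:
    """Group every app whose signature appears in the page by category."""
    lowered = html.lower()
    installed: list[str] = []
    tech_stack: dict[str, list[str]] = {}
    for app, (category, needle) in SIGNATURES.items():
        if needle in lowered:
            installed.append(app)
            tech_stack.setdefault(category, []).append(app)
    return {"installedApps": installed, "techStack": tech_stack}
```

Plain substring matching is cheap and needs only the one homepage request, but it can miss apps that load lazily or through a tag manager.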
Notes & limits
- Non-Shopify URLs are skipped, not failed. If a domain doesn't serve /products.json, the store appears in OUTPUT with a skipped reason (404, non-JSON, network) and no rows or charges are emitted.
- Shopify hard limit. The public storefront API caps the page parameter around 150, a true ceiling of ~37 500 products. The actor stops cleanly when it hits that.
- No personal data. This endpoint is public catalog data designed to be served to anyone who lands on the store. No customer info, no order info, no inventory counts beyond available: true/false.
- Polite by default. Uses a normal browser User-Agent and follows redirects; no Apify Proxy needed because the endpoint is openly served.
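The skip behaviour in the first note can be pictured as a small classifier over the first response from a store. This is a sketch under assumptions: skip_reason is a hypothetical helper, the "http_<status>" label for other status codes is our addition, and a JSON body without a products list is treated as non-JSON for simplicity.

```python
import json
from typing import Optional


def skip_reason(status: Optional[int], body: Optional[str]) -> Optional[str]:
    """Return None when the response looks like a product feed, else a skip reason."""
    if status is None:
        return "network"  # the request never produced a response
    if status == 404:
        return "404"
    if status != 200:
        return f"http_{status}"  # hypothetical label, not from the README
    try:
        payload = json.loads(body or "")
    except json.JSONDecodeError:
        return "non-JSON"
    # A real feed is a JSON object carrying a "products" list.
    if not isinstance(payload, dict) or not isinstance(payload.get("products"), list):
        return "non-JSON"
    return None
```

A store that yields any non-None reason would be recorded in OUTPUT and produce no rows and no charges.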
Source
Each Shopify storefront exposes:
- GET /products.json?limit=250&page=N (paginated product feed)
- GET /collections.json?limit=250 (collection metadata)
These are public endpoints, served the same way to scrapers, plugins, and your browser's view-source.
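A minimal pagination loop over the product feed might look like the sketch below. It uses only the standard library; the page size and the page ceiling of about 150 are taken from this README, while fetch_page, iter_products, and the User-Agent string are illustrative assumptions rather than the actor's implementation. The fetch parameter exists so the loop can be exercised without touching the network.

```python
import json
from typing import Callable, Iterator
from urllib.request import Request, urlopen

PAGE_SIZE = 250  # products per request
MAX_PAGE = 150   # approximate page ceiling quoted in this README


def fetch_page(domain: str, page: int) -> list[dict]:
    """Fetch one page of the public product feed (performs a real HTTP GET)."""
    url = f"https://{domain}/products.json?limit={PAGE_SIZE}&page={page}"
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        return json.load(response).get("products", [])


def iter_products(
    domain: str,
    max_products: int = 5000,
    fetch: Callable[[str, int], list[dict]] = fetch_page,
) -> Iterator[dict]:
    """Yield products page by page until the feed, the cap, or the ceiling ends."""
    yielded = 0
    page = 1
    while page <= MAX_PAGE and yielded < max_products:
        products = fetch(domain, page)
        for product in products[: max_products - yielded]:
            yield product
            yielded += 1
        if len(products) < PAGE_SIZE:
            break  # a short or empty page means the catalog is exhausted
        page += 1
```

Calling list(iter_products("shop.example.com", max_products=500)) would request at most two pages before stopping.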