Shopify Products Scraper
Pricing
from $0.85 / 1,000 products
Scrape every product from any Shopify store: title, vendor, price, compare-at price, variants, stock status, and images. Just enter the store domain. No API keys or category URLs needed. Export data, run via API, schedule and monitor runs, or integrate with other tools.
Rating
5.0
(7)
Developer
Trove Vault
Maintained by Community
Actor stats
9
Bookmarked
148
Total users
44
Monthly active users
14 days ago
Last modified
Shopify Products Scraper: Full Catalogue from Any Store Domain
Extract every product from any Shopify store using only the store domain. The actor calls Shopify's native /products.json endpoint and returns 23 structured fields per product, including prices, stock status, images, variant summaries, tags, and a plain-text description. No collection URLs, browser rendering, or authentication are required.
Why use Shopify Products Scraper?
Standard collection-based scrapers require a separate URL for every collection page and still miss products that belong to no scraped collection. This actor uses /products.json, a public JSON endpoint built into Shopify storefronts, which returns the full catalogue from a single domain input, with automatic pagination for stores with 10,000+ products.
| Feature | Collection scrapers | Browser scrapers | Shopify Products Scraper |
|---|---|---|---|
| Input required | One URL per collection | Product page URLs | Store domain only |
| Full catalogue coverage | Requires all collection URLs | Slow, page by page | Automatic via /products.json |
| Multi-store support | Separate run per store | Separate run per store | One run, multiple domains |
| Speed | Moderate | Slow (browser rendering) | Fast (direct API calls) |
| Compare-at price | Often missing | Unreliable | Native Shopify field |
| API key required | Sometimes | Never | Never |
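The automatic pagination mentioned above can be sketched roughly as follows. This is a minimal illustration rather than the actor's actual code: it assumes the endpoint accepts `limit` and `page` query parameters and responds with `{ products: [...] }`, and the names `fetchAllProducts`, `fetchJson`, and `defaultFetchJson` are hypothetical. The page size of 250 is taken from the Input section.

```javascript
// Hypothetical helper: walk /products.json page by page until a page comes back short.
// Assumption: the endpoint accepts `limit` and `page` query parameters.
async function fetchAllProducts(domain, { maxProducts = 0, pageSize = 250, fetchJson = defaultFetchJson } = {}) {
  const products = [];
  for (let page = 1; ; page += 1) {
    const url = `https://${domain}/products.json?limit=${pageSize}&page=${page}`;
    const body = await fetchJson(url);
    const batch = Array.isArray(body.products) ? body.products : [];
    products.push(...batch);
    // 0 means "no limit", mirroring the maxProducts input.
    if (maxProducts > 0 && products.length >= maxProducts) {
      return products.slice(0, maxProducts);
    }
    // A short (or empty) page means the catalogue is exhausted.
    if (batch.length < pageSize) return products;
  }
}

// Default transport: global fetch (Node 18+). Throws on non-2xx so callers see 403/404.
async function defaultFetchJson(url) {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
  return res.json();
}
```

Passing the transport in as `fetchJson` keeps the loop testable offline and leaves room to swap in a proxied client later.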
What data does Shopify Products Scraper extract?
| Field | Description |
|---|---|
| `store` | Source store domain |
| `title` | Product title |
| `vendor` | Brand or manufacturer name |
| `url` | Full product page URL |
| `featuredImage` | URL of the first product image |
| `imageCount` | Total number of product images |
| `imageAltTexts` | Alt text from every product image (array) |
| `currency` | Store currency code (ISO 4217) |
| `priceMin` | Lowest variant price |
| `priceMax` | Highest variant price |
| `compareAtPrice` | Original price before discount; null if not on sale |
| `onSale` | true when compareAtPrice is set |
| `available` | true when at least one variant is in stock |
| `fullyOutOfStock` | true when every variant is sold out |
| `requiresShipping` | true for physical products |
| `weightAndUnit` | Weight in grams from the first variant |
| `variantCount` | Total number of variants |
| `options` | Variant dimension names (Size, Color, Material…) |
| `productType` | Category label set by the merchant |
| `tags` | Product tags (array) |
| `description` | Plain-text description with HTML stripped |
| `publishedAt` | ISO 8601 publish date |
| `updatedAt` | ISO 8601 last-modified date |
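To make the variant-derived fields concrete, here is a minimal sketch of how they could be computed from a raw product. It is an illustration under assumptions, not the actor's implementation: it assumes each raw variant carries `price` (string), `compare_at_price` (string or null), and `available` (boolean), it picks the highest compare-at price when variants differ, and `summariseVariants` is a hypothetical name.

```javascript
// Hypothetical mapper from a raw product object to some of the summary fields above.
// Assumption: variants expose `price`, `compare_at_price` and `available`.
function summariseVariants(product) {
  const variants = product.variants ?? [];
  const prices = variants.map((v) => Number(v.price)).filter((n) => Number.isFinite(n));
  const compareAt = variants
    .map((v) => (v.compare_at_price == null ? NaN : Number(v.compare_at_price)))
    .filter((n) => Number.isFinite(n) && n > 0);
  const anyAvailable = variants.some((v) => v.available === true);
  return {
    priceMin: prices.length ? Math.min(...prices) : null,
    priceMax: prices.length ? Math.max(...prices) : null,
    // Assumption: when variants disagree, report the highest compare-at price.
    compareAtPrice: compareAt.length ? Math.max(...compareAt) : null,
    onSale: compareAt.length > 0,
    available: anyAvailable,
    fullyOutOfStock: !anyAvailable,
    variantCount: variants.length,
  };
}
```

Under this reading, `available` and `fullyOutOfStock` are always opposites, which matches the field descriptions in the table.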
Use cases
Competitor price monitoring
Track price changes across a rival store's full catalogue. Use compareAtPrice and onSale to identify discount patterns and promotional timing. Schedule daily or weekly runs to build a price history over time.
Inventory and stock tracking
Monitor available and fullyOutOfStock to time your own promotions around a competitor's stockouts, or set up back-in-stock alerts for products you follow.
Product research and catalogue analysis
Audit a competitor's SKU count, product types, pricing strategy across variants, and the tags they assign for Shopify search and filtering. Export to Excel, Google Sheets, or BigQuery for further analysis.
Multi-store price comparison
Add multiple domains in one run. Every output row includes a store field, making side-by-side comparison in a spreadsheet or BI tool immediate — no joins required.
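As a rough illustration of why the `store` field makes comparison straightforward, the sketch below groups output rows by store and computes a few per-store figures. The function name `summariseByStore` and the chosen metrics are hypothetical; only the `store`, `onSale`, and `priceMin` fields come from the output schema.

```javascript
// Hypothetical per-store rollup over the actor's output rows.
function summariseByStore(rows) {
  const summary = {};
  for (const row of rows) {
    // Create the bucket for this store on first sight.
    const bucket = (summary[row.store] ??= { products: 0, onSale: 0, cheapest: null });
    bucket.products += 1;
    if (row.onSale) bucket.onSale += 1;
    if (row.priceMin != null && (bucket.cheapest === null || row.priceMin < bucket.cheapest)) {
      bucket.cheapest = row.priceMin;
    }
  }
  return summary;
}
```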
New product and launch detection
Run weekly and filter by publishedAt to isolate products added since your last run. Diff two consecutive dataset snapshots to detect additions, removals, and price changes in one pass.
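The snapshot diff and the `publishedAt` filter could look something like the sketch below. It assumes the product `url` is a stable key between runs and compares only `priceMin`; `diffSnapshots` and `publishedSince` are hypothetical helper names, not part of the actor.

```javascript
// Hypothetical diff of two dataset snapshots, keyed by product URL.
function diffSnapshots(previous, current) {
  const before = new Map(previous.map((row) => [row.url, row]));
  const after = new Map(current.map((row) => [row.url, row]));
  const added = current.filter((row) => !before.has(row.url));
  const removed = previous.filter((row) => !after.has(row.url));
  const priceChanged = current
    .filter((row) => before.has(row.url))
    .map((row) => ({ url: row.url, from: before.get(row.url).priceMin, to: row.priceMin }))
    .filter((change) => change.from !== change.to);
  return { added, removed, priceChanged };
}

// Rows published after the previous run, using the ISO 8601 publishedAt field.
function publishedSince(rows, lastRunIso) {
  const cutoff = Date.parse(lastRunIso);
  return rows.filter((row) => Date.parse(row.publishedAt) > cutoff);
}
```

Keying on `url` is a design choice for this sketch; if a merchant renames a product handle, the product would show up as one removal plus one addition.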
How to scrape a Shopify store
- Enter one or more store domains, e.g. `gymshark.com` or `deathwishcoffee.com`
- Set Max Products per store; leave at `0` to scrape the full catalogue
- Leave proxy disabled for most stores; enable Residential only if a store returns HTTP 403
- Click Start and download results as JSON, CSV, or Excel when the run completes
```json
{
  "domains": ["gymshark.com", "deathwishcoffee.com"],
  "maxProducts": 500
}
```
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `domains` | Array | required | One or more Shopify store domains. Accepts bare hostnames, https:// URLs, and .myshopify.com subdomains |
| `maxProducts` | Number | 0 | Max products per store; 0 means no limit. Pages of 250 are fetched until the limit is reached |
| `proxyConfiguration` | Object | disabled | Proxy settings. Enable Apify Proxy (Residential group) only when a store returns HTTP 403 |
| `datasetId` | String | — | ID of an existing dataset to append results to, in addition to the default run dataset |
| `runId` | String | — | Parent run ID copied into output rows |
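Because `domains` accepts several input forms, it can help to normalise them before building the input. The sketch below shows one plausible way to reduce each form to a bare hostname and drop duplicates; `normaliseDomain` and `normaliseDomains` are hypothetical helpers, and the actor's own normalisation may differ.

```javascript
// Hypothetical normaliser: reduce any accepted input form to a bare lowercase hostname.
function normaliseDomain(input) {
  const trimmed = String(input).trim();
  // new URL() needs a scheme, so add one when the user typed a bare hostname.
  const withScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed) ? trimmed : `https://${trimmed}`;
  return new URL(withScheme).hostname.toLowerCase();
}

// Normalise a list and drop duplicates while keeping first-seen order.
function normaliseDomains(inputs) {
  return [...new Set(inputs.map(normaliseDomain))];
}
```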
Output
One row per product. All 23 fields are present on every row; nullable fields return null rather than being omitted.
```json
{
  "store": "gymshark.com",
  "title": "Vital Seamless 2.0 Shorts",
  "vendor": "Gymshark",
  "url": "https://gymshark.com/products/vital-seamless-2-0-shorts",
  "featuredImage": "https://cdn.shopify.com/vital-seamless-shorts.jpg",
  "imageCount": 6,
  "imageAltTexts": ["Vital Seamless 2.0 Shorts - Black", "Back view - Black"],
  "currency": "GBP",
  "priceMin": 45.00,
  "priceMax": 45.00,
  "compareAtPrice": null,
  "onSale": false,
  "available": true,
  "fullyOutOfStock": false,
  "requiresShipping": true,
  "weightAndUnit": { "grams": 180 },
  "variantCount": 8,
  "options": ["Size"],
  "productType": "Shorts",
  "tags": ["bottoms", "seamless", "training"],
  "description": "Crafted with seamless construction for a second-skin feel.",
  "publishedAt": "2023-06-14T09:00:00Z",
  "updatedAt": "2024-01-10T11:30:00Z"
}
```
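Since every row is documented as carrying all 23 fields, with `null` for missing values, a downstream consumer can check that guarantee cheaply. The sketch below is one way to do so; `EXPECTED_FIELDS` is copied from the field table and `missingFields` is a hypothetical helper.

```javascript
// The 23 field names from the field table above.
const EXPECTED_FIELDS = [
  'store', 'title', 'vendor', 'url', 'featuredImage', 'imageCount', 'imageAltTexts',
  'currency', 'priceMin', 'priceMax', 'compareAtPrice', 'onSale', 'available',
  'fullyOutOfStock', 'requiresShipping', 'weightAndUnit', 'variantCount', 'options',
  'productType', 'tags', 'description', 'publishedAt', 'updatedAt',
];

// Hypothetical check: list the documented fields that are absent from a row.
// A field whose value is null counts as present, matching the documented behaviour.
function missingFields(row) {
  return EXPECTED_FIELDS.filter((name) => !Object.prototype.hasOwnProperty.call(row, name));
}
```

An empty array from `missingFields(row)` means the row has the full documented shape.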
API
Trigger runs from any script, CI pipeline, or scheduled job using your Apify API token.
```bash
curl -X POST "https://api.apify.com/v2/acts/trovevault~shopify-products-scraper/runs" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"domains":["gymshark.com"],"maxProducts":500}'
```

```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('trovevault~shopify-products-scraper').call({
  domains: ['gymshark.com'],
  maxProducts: 500,
});

// Read the scraped products from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
```
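Where installing `apify-client` is not an option, the same flow can be approximated with plain HTTP calls. The sketch below starts a run, long-polls until it finishes, and reads the dataset. It assumes the Apify REST endpoints for starting a run, reading a run with `waitForFinish`, and listing dataset items behave as described in Apify's API reference; `runAndFetch` and its injectable `fetchImpl` parameter are hypothetical.

```javascript
const API = 'https://api.apify.com/v2';
const ACTOR = 'trovevault~shopify-products-scraper';

// Hypothetical helper: start a run, wait for it to finish, return the dataset items.
async function runAndFetch(input, token, fetchImpl = fetch) {
  const headers = { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' };
  const getJson = async (url, init) => {
    const res = await fetchImpl(url, { headers, ...init });
    if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
    return res.json();
  };

  // Start the run (the same request as the curl example).
  let { data: run } = await getJson(`${API}/acts/${ACTOR}/runs`, {
    method: 'POST',
    body: JSON.stringify(input),
  });

  // Long-poll until the run reaches a terminal status.
  const terminal = new Set(['SUCCEEDED', 'FAILED', 'ABORTED', 'TIMED-OUT']);
  while (!terminal.has(run.status)) {
    ({ data: run } = await getJson(`${API}/actor-runs/${run.id}?waitForFinish=60`));
  }
  if (run.status !== 'SUCCEEDED') {
    throw new Error(`Run ${run.id} ended with status ${run.status}`);
  }

  // Read the scraped products from the run's default dataset.
  return getJson(`${API}/datasets/${run.defaultDatasetId}/items?format=json`);
}
```

A call such as `runAndFetch({ domains: ['gymshark.com'], maxProducts: 500 }, 'YOUR_APIFY_TOKEN')` would resolve to the array of product rows once the run succeeds.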
Troubleshooting
HTTP 403: store blocked the request
Enable Apify Proxy (Residential group) in the Proxy Configuration input. Large Shopify brands including Gymshark, Allbirds, and MVMT block datacenter IP ranges; residential proxies resolve this in most cases.
HTTP 404: /products.json not found
The store may use a headless Shopify setup or have disabled the endpoint on its custom domain. The actor automatically detects the underlying .myshopify.com subdomain and retries. If the 404 persists, confirm the site is a Shopify store — WooCommerce, Magento, and BigCommerce do not expose /products.json.
Fewer products than expected
/products.json exposes only a store's public catalogue. Draft products, hidden listings, password-protected items, and wholesale-only SKUs are not returned by the endpoint.
Run fails or returns a network error
Add a proxy configuration and retry. If the problem persists, open a ticket on the actor's Issues tab and include the run ID and the domain that failed.
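For retries after an HTTP 403 or a network error, the input can be rebuilt with the Residential proxy group enabled. The sketch below assumes `proxyConfiguration` follows Apify's usual proxy input shape (`useApifyProxy` plus `apifyProxyGroups`); that shape is not spelled out in this README, so check it against the actor's input schema. `withResidentialProxy` is a hypothetical helper.

```javascript
// Hypothetical helper: copy an actor input and enable Apify Proxy's Residential group.
// Assumption: the actor reads the standard { useApifyProxy, apifyProxyGroups } shape.
function withResidentialProxy(input) {
  return {
    ...input,
    proxyConfiguration: { useApifyProxy: true, apifyProxyGroups: ['RESIDENTIAL'] },
  };
}
```

Returning a copy rather than mutating the original keeps the first attempt's input intact for logging or comparison.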
FAQ
Does it work on all Shopify stores?
It works on any store using the standard Shopify storefront. Headless stores (Shopify Hydrogen) may have /products.json disabled on their custom domain; the actor discovers and uses the .myshopify.com subdomain automatically.
Does it return individual variant data?
Not as separate rows. The actor returns one row per product: the options field lists variant dimensions (Size, Color, etc.), variantCount gives the total, and priceMin/priceMax span all variant prices.
How do I detect new products between runs?
Schedule the actor weekly. Filter output by publishedAt to find products published since your last run, or diff two consecutive dataset snapshots to detect additions, removals, and price changes.
Can I use this via the API or an AI assistant?
Yes to both. See the API section above for curl and JavaScript examples. Via the Apify MCP server, you can trigger runs directly from Claude, ChatGPT, or any MCP-compatible assistant.
Is scraping Shopify product data legal?
This actor calls only Shopify's public /products.json endpoint, which exposes the same product data a visitor sees on the storefront and requires no login. Accessing public data is generally lawful. Always review the specific store's terms of service for large-scale commercial use.
Limitations
- One row per product, not per variant. `priceMin`, `priceMax`, and `variantCount` summarise variant data; individual SKU-level records require further enrichment.
- Public catalogue only. Draft, hidden, password-protected, and wholesale-only products are not exposed by `/products.json`.
- Shopify stores only. WooCommerce, Magento, and BigCommerce do not use this endpoint.
Related actors
- WooCommerce Products Scraper — same structured output for WooCommerce stores
- E-Commerce Tech Stack Detector — identify whether a store runs Shopify, WooCommerce, or other platforms before scraping
Changelog
v0.1
- Initial release with full catalogue scraping, automatic pagination, multi-store support, and `.myshopify.com` discovery fallback for custom domains