Shopify Products Scraper

Pricing: from $0.85 / 1,000 products

Scrape every product from any Shopify store: title, vendor, price, compare-at price, variants, stock status, and images. Just enter the store domain. No API keys or category URLs needed. Export data, run via API, schedule and monitor runs, or integrate with other tools.

Rating: 5.0 (7)

Developer: Trove Vault

Maintained by Community

Actor stats: 9 bookmarked, 148 total users, 44 monthly active users, last modified 14 days ago

Shopify Products Scraper: Full Catalogue from Any Store Domain

Extract every product from any Shopify store using only the store domain. The actor calls Shopify's native /products.json endpoint and returns 23 structured fields per product, including prices, stock status, images, variant summaries, tags, and a plain-text description. It needs no collection URLs, browser rendering, or authentication.


Why use Shopify Products Scraper?

Standard collection-based scrapers require a separate URL for every collection page and still miss cross-listed products. This actor uses /products.json, a public JSON endpoint built into Shopify storefronts, which returns the full published catalogue from a single domain input, with automatic pagination for stores with 10,000+ products.

| Feature | Collection scrapers | Browser scrapers | Shopify Products Scraper |
| --- | --- | --- | --- |
| Input required | One URL per collection | Product page URLs | Store domain only |
| Full catalogue coverage | Requires all collection URLs | Slow, page by page | Automatic via /products.json |
| Multi-store support | Separate run per store | Separate run per store | One run, multiple domains |
| Speed | Moderate | Slow (browser rendering) | Fast (direct API calls) |
| Compare-at price | Often missing | Unreliable | Native Shopify field |
| API key required | Sometimes | Never | Never |
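
The automatic pagination described above can be pictured roughly as follows. This is a minimal sketch, not the actor's source code: the function name fetchAllProducts, the fetchImpl parameter, and the stop condition are assumptions made for illustration. It relies on the limit and page query parameters of the public /products.json endpoint and on the page size of 250 quoted in the Input section.

```javascript
// Hypothetical helper: walks /products.json page by page until a short page
// comes back or the optional maxProducts limit is reached.
const PAGE_SIZE = 250; // page size quoted in the Input section

async function fetchAllProducts(domain, { maxProducts = 0, fetchImpl = globalThis.fetch } = {}) {
  const products = [];
  for (let page = 1; ; page += 1) {
    const url = `https://${domain}/products.json?limit=${PAGE_SIZE}&page=${page}`;
    const res = await fetchImpl(url);
    if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
    const { products: batch = [] } = await res.json();
    products.push(...batch);
    const limitReached = maxProducts > 0 && products.length >= maxProducts;
    // A page shorter than PAGE_SIZE means there is nothing left to fetch.
    if (batch.length < PAGE_SIZE || limitReached) break;
  }
  // 0 means "no limit", mirroring the maxProducts input.
  return maxProducts > 0 ? products.slice(0, maxProducts) : products;
}
```

Passing fetchImpl explicitly keeps the loop testable without network access; in normal use the global fetch available in recent Node.js versions would be picked up by default.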

What data does Shopify Products Scraper extract?

| Field | Description |
| --- | --- |
| store | Source store domain |
| title | Product title |
| vendor | Brand or manufacturer name |
| url | Full product page URL |
| featuredImage | URL of the first product image |
| imageCount | Total number of product images |
| imageAltTexts | Alt text from every product image (array) |
| currency | Store currency code (ISO 4217) |
| priceMin | Lowest variant price |
| priceMax | Highest variant price |
| compareAtPrice | Original price before discount; null if not on sale |
| onSale | true when compareAtPrice is set |
| available | true when at least one variant is in stock |
| fullyOutOfStock | true when every variant is sold out |
| requiresShipping | true for physical products |
| weightAndUnit | Weight in grams from the first variant |
| variantCount | Total number of variants |
| options | Variant dimension names (Size, Color, Material…) |
| productType | Category label set by the merchant |
| tags | Product tags (array) |
| description | Plain-text description with HTML stripped |
| publishedAt | ISO 8601 publish date |
| updatedAt | ISO 8601 last-modified date |
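
As a rough illustration of how the summary fields relate to Shopify's raw variant data, the sketch below derives a few of them from a product's variants array. It is not the actor's code: the function name summariseVariants is hypothetical, the input field names (price, compare_at_price, available) are those found in /products.json responses, and taking the highest compare-at price across variants is an assumption.

```javascript
// Hypothetical helper: collapses a raw Shopify variants array into the
// per-product summary fields listed in the table above.
// Assumes at least one variant, which Shopify products always have.
function summariseVariants(variants) {
  // /products.json returns prices as strings such as "45.00".
  const prices = variants.map((v) => Number(v.price));
  const compareAts = variants
    .map((v) => v.compare_at_price)
    .filter((p) => p !== null && p !== undefined && p !== '')
    .map(Number);
  // Assumption: report the highest compare-at price found on any variant.
  const compareAtPrice = compareAts.length > 0 ? Math.max(...compareAts) : null;
  const available = variants.some((v) => v.available === true);
  return {
    priceMin: Math.min(...prices),
    priceMax: Math.max(...prices),
    compareAtPrice,
    onSale: compareAtPrice !== null,
    available,
    fullyOutOfStock: !available,
    variantCount: variants.length,
  };
}
```

Under these definitions available and fullyOutOfStock are always opposites, and onSale carries no information beyond compareAtPrice being non-null.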

Use cases

Competitor price monitoring

Track price changes across a rival store's full catalogue. Use compareAtPrice and onSale to identify discount patterns and promotional timing. Schedule daily or weekly runs to build a price history over time.
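
To turn compareAtPrice into a comparable discount figure, something like the following could be applied to each output row. A minimal sketch: the helper name discountPercent is hypothetical, and measuring the discount against priceMin (rather than priceMax) is an assumption.

```javascript
// Percentage discount implied by one output row, or null when the row
// carries no compare-at price.
function discountPercent(row) {
  if (row.compareAtPrice === null || row.compareAtPrice <= 0) return null;
  const pct = ((row.compareAtPrice - row.priceMin) / row.compareAtPrice) * 100;
  return Math.round(pct * 10) / 10; // keep one decimal place
}
```

Storing this value per run alongside the run date gives a simple series from which promotional periods can be read off.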

Inventory and stock tracking

Monitor available and fullyOutOfStock to time your own promotions around a competitor's stockouts, or set up back-in-stock alerts for products you follow.
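
A back-in-stock check can be sketched by comparing two runs of the actor. The helper name backInStock is hypothetical, and matching rows on the url field is an assumption about what stays stable between runs.

```javascript
// Returns rows from the current run that are available now but were
// fully out of stock in the previous run.
function backInStock(previousRows, currentRows) {
  const wasSoldOut = new Set(
    previousRows.filter((r) => r.fullyOutOfStock).map((r) => r.url),
  );
  return currentRows.filter((r) => r.available && wasSoldOut.has(r.url));
}
```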

Product research and catalogue analysis

Audit a competitor's SKU count, product types, pricing strategy across variants, and the tags they assign for Shopify search and filtering. Export to Excel, Google Sheets, or BigQuery for further analysis.

Multi-store price comparison

Add multiple domains in one run. Every output row includes a store field, making side-by-side comparison in a spreadsheet or BI tool immediate — no joins required.
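
For readers who prefer code to a spreadsheet, the store field allows a per-store roll-up in a single pass. This is an illustrative sketch only; the helper name summariseByStore and the choice of metrics are assumptions.

```javascript
// Groups output rows by their store field and computes a few simple metrics.
function summariseByStore(rows) {
  const byStore = new Map();
  for (const row of rows) {
    const summary = byStore.get(row.store)
      ?? { store: row.store, products: 0, onSale: 0, lowestPrice: Infinity };
    summary.products += 1;
    if (row.onSale) summary.onSale += 1;
    summary.lowestPrice = Math.min(summary.lowestPrice, row.priceMin);
    byStore.set(row.store, summary);
  }
  return [...byStore.values()];
}
```

Prices are only directly comparable between stores that report the same currency value, so it is worth checking that field before ranking stores by price.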

New product and launch detection

Run weekly and filter by publishedAt to isolate products added since your last run. Diff two consecutive dataset snapshots to detect additions, removals, and price changes in one pass.
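
The snapshot diff and the publishedAt filter described above might look like this. It is a sketch under assumptions: the helper names diffSnapshots and publishedSince are hypothetical, and rows are matched on url.

```javascript
// Compares two dataset snapshots and reports additions, removals, and
// rows whose price range changed.
function diffSnapshots(previousRows, currentRows) {
  const prev = new Map(previousRows.map((r) => [r.url, r]));
  const curr = new Map(currentRows.map((r) => [r.url, r]));
  const added = currentRows.filter((r) => !prev.has(r.url));
  const removed = previousRows.filter((r) => !curr.has(r.url));
  const priceChanged = currentRows.filter((r) => {
    const old = prev.get(r.url);
    return old !== undefined
      && (old.priceMin !== r.priceMin || old.priceMax !== r.priceMax);
  });
  return { added, removed, priceChanged };
}

// Keeps rows published strictly after the given ISO 8601 timestamp.
function publishedSince(rows, sinceIso) {
  const since = Date.parse(sinceIso);
  return rows.filter((r) => Date.parse(r.publishedAt) > since);
}
```

The two approaches answer slightly different questions: publishedSince relies on the merchant's publish date, while diffSnapshots reports whatever actually changed between the two runs.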


How to scrape a Shopify store

  1. Enter one or more store domains — e.g. gymshark.com or deathwishcoffee.com
  2. Set Max Products per store; leave at 0 to scrape the full catalogue
  3. Leave proxy disabled for most stores; enable Residential only if a store returns HTTP 403
  4. Click Start and download results as JSON, CSV, or Excel when the run completes

Example input:

    {
      "domains": ["gymshark.com", "deathwishcoffee.com"],
      "maxProducts": 500
    }

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| domains | Array | required | One or more Shopify store domains. Accepts bare hostnames, https:// URLs, and .myshopify.com subdomains |
| maxProducts | Number | 0 | Max products per store; 0 means no limit. Pages of 250 are fetched until the limit is reached |
| proxyConfiguration | Object | disabled | Proxy settings. Enable Apify Proxy (Residential group) only when a store returns HTTP 403 |
| datasetId | String | | ID of an existing dataset to append results to, in addition to the default run dataset |
| runId | String | | Parent run ID copied into output rows |
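
Since domains accepts bare hostnames, full URLs, and .myshopify.com subdomains, callers who build the list programmatically may want to reduce everything to a hostname first. The sketch below shows one way to do that; normaliseDomain is a hypothetical helper and does not claim to match the actor's own parsing.

```javascript
// Reduces a bare hostname or a full URL to a lower-case hostname.
function normaliseDomain(input) {
  const trimmed = input.trim();
  // The URL parser needs a scheme, so add one when the input lacks it.
  const withScheme = /^https?:\/\//i.test(trimmed) ? trimmed : `https://${trimmed}`;
  return new URL(withScheme).hostname.toLowerCase();
}
```

Deduplicating the normalised list before starting a run avoids scraping the same store twice.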

Output

One row per product. All 23 fields are present on every row; nullable fields return null rather than being omitted.

    {
      "store": "gymshark.com",
      "title": "Vital Seamless 2.0 Shorts",
      "vendor": "Gymshark",
      "url": "https://gymshark.com/products/vital-seamless-2-0-shorts",
      "featuredImage": "https://cdn.shopify.com/vital-seamless-shorts.jpg",
      "imageCount": 6,
      "imageAltTexts": ["Vital Seamless 2.0 Shorts - Black", "Back view - Black"],
      "currency": "GBP",
      "priceMin": 45.00,
      "priceMax": 45.00,
      "compareAtPrice": null,
      "onSale": false,
      "available": true,
      "fullyOutOfStock": false,
      "requiresShipping": true,
      "weightAndUnit": { "grams": 180 },
      "variantCount": 8,
      "options": ["Size"],
      "productType": "Shorts",
      "tags": ["bottoms", "seamless", "training"],
      "description": "Crafted with seamless construction for a second-skin feel.",
      "publishedAt": "2023-06-14T09:00:00Z",
      "updatedAt": "2024-01-10T11:30:00Z"
    }
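
Because every row is documented to carry all 23 fields, a consumer can verify that cheaply before loading data downstream. A minimal sketch: the helper name missingFields is hypothetical, and the field list is copied from the field table in this document.

```javascript
// The 23 documented output fields.
const EXPECTED_FIELDS = [
  'store', 'title', 'vendor', 'url', 'featuredImage', 'imageCount',
  'imageAltTexts', 'currency', 'priceMin', 'priceMax', 'compareAtPrice',
  'onSale', 'available', 'fullyOutOfStock', 'requiresShipping',
  'weightAndUnit', 'variantCount', 'options', 'productType', 'tags',
  'description', 'publishedAt', 'updatedAt',
];

// Lists documented fields that are absent from a row. A field whose value
// is null still counts as present, matching the behaviour described above.
function missingFields(row) {
  return EXPECTED_FIELDS.filter((name) => !(name in row));
}
```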

API

Trigger runs from any script, CI pipeline, or scheduled job using your Apify API token.

    curl -X POST "https://api.apify.com/v2/acts/trovevault~shopify-products-scraper/runs" \
      -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"domains":["gymshark.com"],"maxProducts":500}'

The same run from Node.js, using the apify-client package:

    import { ApifyClient } from 'apify-client';

    // Authenticate with your Apify API token.
    const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

    // Start the actor and wait for the run to finish.
    const run = await client.actor('trovevault~shopify-products-scraper').call({
      domains: ['gymshark.com'],
      maxProducts: 500,
    });

    // Read the scraped products from the run's default dataset.
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
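
Once a run has finished, its dataset can also be downloaded in another format through the Apify API's dataset items endpoint. The sketch below only builds the URL; datasetItemsUrl is a hypothetical helper, and the format and fields query parameters are the ones documented for that endpoint.

```javascript
// Builds a download URL for a dataset's items in the chosen format.
function datasetItemsUrl(datasetId, { format = 'csv', fields = [] } = {}) {
  const url = new URL(
    `https://api.apify.com/v2/datasets/${encodeURIComponent(datasetId)}/items`,
  );
  url.searchParams.set('format', format);
  // Restrict the export to selected columns when a field list is given.
  if (fields.length > 0) url.searchParams.set('fields', fields.join(','));
  return url.toString();
}
```

The resulting URL can be requested with the same Authorization header as in the curl example, for instance with the dataset ID taken from run.defaultDatasetId.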

Troubleshooting

HTTP 403: store blocked the request

Enable Apify Proxy (Residential group) in the Proxy Configuration input. Large Shopify brands including Gymshark, Allbirds, and MVMT block datacenter IP ranges; residential proxies resolve this in most cases.
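
When runs are started through the API, the proxy can be switched on in the input itself. The sketch below assumes the actor's proxyConfiguration input follows the standard Apify proxy input shape (useApifyProxy, apifyProxyGroups); withResidentialProxy is a hypothetical helper.

```javascript
// Returns a copy of the run input with Apify residential proxies enabled.
// Assumption: proxyConfiguration uses the standard Apify proxy input shape.
function withResidentialProxy(input) {
  return {
    ...input,
    proxyConfiguration: {
      useApifyProxy: true,
      apifyProxyGroups: ['RESIDENTIAL'],
    },
  };
}
```

A reasonable pattern is to run without a proxy first and retry with withResidentialProxy(input) only for the domains that returned HTTP 403.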

HTTP 404: /products.json not found

The store may use a headless Shopify setup or have disabled the endpoint on its custom domain. The actor automatically detects the underlying .myshopify.com subdomain and retries. If the 404 persists, confirm the site is a Shopify store; WooCommerce, Magento, and BigCommerce do not expose /products.json.
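
This document does not say how the .myshopify.com subdomain is discovered. One plausible approach, shown purely as an assumption, is to look for it in the storefront's HTML, where Shopify themes often embed the permanent domain in a script tag. The helper name findMyshopifyDomain is hypothetical.

```javascript
// Returns the first *.myshopify.com hostname found in a page's HTML,
// lower-cased, or null when none is present.
function findMyshopifyDomain(html) {
  const match = html.match(/\b([a-z0-9][a-z0-9-]*\.myshopify\.com)\b/i);
  return match ? match[1].toLowerCase() : null;
}
```

If a hostname is found, it can be passed in the domains input directly, since .myshopify.com subdomains are accepted there.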

Fewer products than expected

The /products.json endpoint returns only a store's public catalogue. Draft products, hidden listings, password-protected items, and wholesale-only SKUs are not returned.

Run fails or returns a network error

Add a proxy configuration and retry. If the problem persists, open a ticket on the actor's Issues tab and include the run ID and the domain that failed.


FAQ

Does it work on all Shopify stores?

It works on any store using the standard Shopify storefront. Headless stores (Shopify Hydrogen) may have /products.json disabled on their custom domain; the actor discovers and uses the .myshopify.com subdomain automatically.

Does it return individual variant data?

No. The actor returns one row per product. The options field lists variant dimensions (Size, Color, etc.), variantCount gives the total, and priceMin/priceMax span all variant prices.

How do I detect new products between runs?

Schedule the actor weekly. Filter output by publishedAt to find products published since your last run, or diff two consecutive dataset snapshots to detect additions, removals, and price changes.

Can I use this via the API or an AI assistant?

Yes to both. See the API section above for curl and JavaScript examples. Via the Apify MCP server, you can trigger runs directly from Claude, ChatGPT, or any MCP-compatible assistant.

Is scraping Shopify product data legal?

This actor calls only Shopify's public /products.json endpoint, which requires no login and exposes the same catalogue data shown on the storefront. Accessing public data is generally lawful. Always review the specific store's terms of service before large-scale commercial use.


Limitations

  • One row per product, not per variant. priceMin, priceMax, and variantCount summarise variant data; individual SKU-level records require further enrichment.
  • Public catalogue only. Draft, hidden, password-protected, and wholesale-only products are not exposed by /products.json.
  • Shopify stores only. WooCommerce, Magento, and BigCommerce do not use this endpoint.


Changelog

v0.1

  • Initial release with full catalogue scraping, automatic pagination, multi-store support, and .myshopify.com discovery fallback for custom domains