Shopify Products Scraper

Pricing

from $0.85 / 1,000 products

Scrape every product from any Shopify store: title, vendor, price, compare-at price, variants, stock status, and images. Just enter the store domain. No API keys or category URLs needed. Export data, run via API, schedule and monitor runs, or integrate with other tools.

Rating

5.0

(7)

Developer

Trove Vault

Maintained by Community

Actor stats

Bookmarked: 9
Total users: 178
Monthly active users: 51
Last modified: a day ago

Shopify Products Scraper: Full Catalogue from Any Store Domain

Shopify Products Scraper extracts a public Shopify store catalogue from one store domain. It calls Shopify's native /products.json storefront endpoint, paginates automatically, and returns one structured row per product with prices, sale status, stock flags, images, variants, tags, descriptions, and publish/update timestamps.

Use it when you need competitor product data, catalogue monitoring, price tracking, or multi-store comparison without collecting collection URLs or running a browser.

Why use this actor

Most Shopify product scrapers require collection URLs or product page URLs. This actor starts from the store domain, uses Shopify's public product feed, and returns a normalized dataset for CSV, Excel, BigQuery, dashboards, or downstream Apify actors.

| Capability | Collection scrapers | Browser scrapers | Shopify Products Scraper |
| --- | --- | --- | --- |
| Input needed | One URL per collection | Product page URLs | Store domain only |
| Catalogue coverage | Depends on supplied collections | Page by page | Automatic /products.json pagination |
| Multi-store runs | Usually separate runs | Usually separate runs | Multiple domains in one run |
| Speed and cost | Medium | Slower and costlier | Direct HTTP requests |
| Compare-at prices | Often missing | Unreliable | Native Shopify field |
| API key | Sometimes required | Not required | Not required |

Workflow

store domain -> Shopify /products.json pages -> normalized product rows -> Apify dataset/API
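The pagination step of this workflow can be sketched as follows. This is an illustrative sketch, not the actor's source code: the function name fetchAllProducts, the injectable fetchImpl parameter, and the use of the limit and page query parameters are assumptions made for the example. It requires a runtime with a global fetch (Node 18+) unless fetchImpl is supplied.

```javascript
// Sketch only: page through a store's public /products.json feed.
// `fetchAllProducts` and `fetchImpl` are hypothetical names for this example.
async function fetchAllProducts(domain, { maxProducts = 0, fetchImpl = fetch } = {}) {
  const pageSize = 250; // batch size mentioned in the Input section
  const products = [];
  for (let page = 1; ; page += 1) {
    const url = `https://${domain}/products.json?limit=${pageSize}&page=${page}`;
    const response = await fetchImpl(url);
    if (!response.ok) throw new Error(`HTTP ${response.status} for ${url}`);
    const body = await response.json();
    const batch = Array.isArray(body.products) ? body.products : [];
    products.push(...batch);
    // Stop on a short page (end of catalogue) or once the cap is reached.
    const capReached = maxProducts > 0 && products.length >= maxProducts;
    if (batch.length < pageSize || capReached) break;
  }
  // maxProducts = 0 means "no limit", matching the input description.
  return maxProducts > 0 ? products.slice(0, maxProducts) : products;
}
```

Passing a custom fetchImpl makes it possible to route requests through a proxy or to test the loop without network access.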

What it extracts

Each output row represents one product. Nullable fields are returned as null; arrays are returned as arrays.

| Field | What it means |
| --- | --- |
| store | Domain that produced the row |
| title | Product title |
| vendor | Brand or vendor set by the merchant |
| url | Product page URL |
| featuredImage | First product image URL |
| imageCount | Number of product images |
| imageAltTexts | Non-empty image alt text |
| currency | ISO 4217 currency when available |
| priceMin, priceMax | Lowest and highest variant price |
| compareAtPrice | Highest compare-at price, or null |
| onSale | true when any variant has a compare-at price |
| available | true when at least one variant is available |
| fullyOutOfStock | true when every variant is unavailable |
| requiresShipping | true when at least one variant requires shipping |
| weightAndUnit | First variant weight in grams |
| variantCount | Number of variants |
| options | Variant option names, such as Size or Color |
| productType | Merchant-defined product type |
| tags | Shopify product tags |
| description | Plain-text product description with HTML stripped |
| publishedAt, updatedAt | Publish and update timestamps |
| runId | Optional parent run ID copied from input |
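To make the variant-derived fields concrete, here is a minimal sketch of how they could be computed from one raw product object in the /products.json feed. The function name summarizeProduct is hypothetical, and the logic follows the field definitions in the table above rather than the actor's actual implementation. It assumes the raw variant fields price, compare_at_price, available, and requires_shipping as they appear in the public feed.

```javascript
// Sketch only: derive the summary fields from a raw /products.json product.
// `summarizeProduct` is a hypothetical helper, not part of the actor's API.
function summarizeProduct(store, product) {
  const variants = product.variants ?? [];
  // Prices arrive as strings such as "45.00"; convert and drop anything unparsable.
  const prices = variants.map((v) => Number(v.price)).filter(Number.isFinite);
  const compareAt = variants
    .filter((v) => v.compare_at_price != null && v.compare_at_price !== '')
    .map((v) => Number(v.compare_at_price))
    .filter(Number.isFinite);
  const anyAvailable = variants.some((v) => v.available === true);
  return {
    store,
    title: product.title ?? null,
    priceMin: prices.length ? Math.min(...prices) : null,
    priceMax: prices.length ? Math.max(...prices) : null,
    compareAtPrice: compareAt.length ? Math.max(...compareAt) : null,
    onSale: compareAt.length > 0, // any variant has a compare-at price
    available: anyAvailable, // at least one variant is available
    fullyOutOfStock: !anyAvailable, // every variant is unavailable
    requiresShipping: variants.some((v) => v.requires_shipping === true),
    variantCount: variants.length,
  };
}
```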

Use cases

Competitor price monitoring

Track a rival Shopify catalogue over time. Schedule daily or weekly runs and compare priceMin, priceMax, compareAtPrice, and onSale to detect discounts, price increases, and promotion patterns.
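One way to compare two scheduled runs is sketched below. The helper diffPrices is hypothetical and assumes both inputs are arrays of output rows from this actor, matched on the url field.

```javascript
// Sketch only: report products whose price range or sale flag changed
// between two runs. `diffPrices` is a hypothetical helper.
function diffPrices(previousRows, currentRows) {
  const previousByUrl = new Map(previousRows.map((row) => [row.url, row]));
  const changes = [];
  for (const row of currentRows) {
    const before = previousByUrl.get(row.url);
    if (!before) continue; // new product: nothing to compare against
    const priceChanged = before.priceMin !== row.priceMin || before.priceMax !== row.priceMax;
    const saleChanged = before.onSale !== row.onSale;
    if (priceChanged || saleChanged) {
      changes.push({
        url: row.url,
        title: row.title,
        priceMinBefore: before.priceMin,
        priceMinAfter: row.priceMin,
        priceMaxBefore: before.priceMax,
        priceMaxAfter: row.priceMax,
        onSaleBefore: before.onSale,
        onSaleAfter: row.onSale,
      });
    }
  }
  return changes;
}
```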

Inventory and stock tracking

Monitor available and fullyOutOfStock across a store. Use the dataset to spot stockouts, restocks, or products that frequently sell out.

Product research and assortment analysis

Audit product counts, brands, product types, tags, variant counts, and price bands for category research, supplier checks, marketplace planning, and SKU benchmarking.
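As a rough illustration of this kind of audit, the sketch below counts rows per product type and per price band. The helper summarizeAssortment and the default band width of 25 are assumptions chosen for the example; the row fields come from the output schema.

```javascript
// Sketch only: count products per productType and per priceMin band.
// `summarizeAssortment` and `bandWidth` are hypothetical.
function summarizeAssortment(rows, bandWidth = 25) {
  const byProductType = {};
  const byPriceBand = {};
  for (const row of rows) {
    const type = row.productType || '(none)';
    byProductType[type] = (byProductType[type] ?? 0) + 1;
    if (typeof row.priceMin === 'number') {
      // Band label is "lower-upper"; lower bound inclusive, upper exclusive.
      const lower = Math.floor(row.priceMin / bandWidth) * bandWidth;
      const band = `${lower}-${lower + bandWidth}`;
      byPriceBand[band] = (byPriceBand[band] ?? 0) + 1;
    }
  }
  return { byProductType, byPriceBand };
}
```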

Multi-store comparison

Add several domains in one run. The store field is included on every row, so comparisons do not need extra joins.

New product detection

Run on a schedule and filter by publishedAt, or diff consecutive datasets, to identify new, removed, and changed listings.
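Both approaches can be sketched in a few lines. The helpers publishedSince and diffCatalogue are hypothetical; they assume rows are keyed by url and that publishedAt and updatedAt are ISO 8601 strings, as in the output example.

```javascript
// Sketch only: two ways to detect catalogue changes between runs.
// Both helper names are hypothetical.

// Keep rows published on or after a cutoff timestamp.
function publishedSince(rows, isoCutoff) {
  const cutoff = Date.parse(isoCutoff);
  return rows.filter((row) => row.publishedAt && Date.parse(row.publishedAt) >= cutoff);
}

// Compare two consecutive datasets by product URL.
function diffCatalogue(previousRows, currentRows) {
  const previousByUrl = new Map(previousRows.map((row) => [row.url, row]));
  const currentUrls = new Set(currentRows.map((row) => row.url));
  return {
    added: currentRows.filter((row) => !previousByUrl.has(row.url)),
    removed: previousRows.filter((row) => !currentUrls.has(row.url)),
    // "Changed" here means updatedAt differs; other definitions are possible.
    changed: currentRows.filter((row) => {
      const before = previousByUrl.get(row.url);
      return before !== undefined && before.updatedAt !== row.updatedAt;
    }),
  };
}
```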

How to scrape a Shopify store

  1. Enter one or more store domains, such as gymshark.com or deathwishcoffee.com.
  2. Set Max Products per store. Use 50 to test, 500 for a sample, or 0 for the full public catalogue.
  3. Leave proxy disabled unless a store returns HTTP 403.
  4. Start the actor and export the dataset as JSON, CSV, Excel, or through the Apify API.

Input

{
  "domains": ["gymshark.com", "deathwishcoffee.com"],
  "maxProducts": 500,
  "proxyConfiguration": { "useApifyProxy": false }
}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| domains | Array | required | Shopify store domains. Accepts bare hostnames, full URLs, and .myshopify.com subdomains. |
| maxProducts | Number | 0 | Maximum products per store. 0 means no limit. Shopify pages are fetched in batches of 250. |
| proxyConfiguration | Object | disabled | Proxy settings. Enable Apify Proxy Residential only when a domain returns HTTP 403. |
| datasetId | String | optional | Existing Apify dataset ID to append rows to alongside the default dataset. |
| runId | String | optional | Parent run ID copied into each output row for pipeline traceability. |
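
Because domains accepts bare hostnames, full URLs, and .myshopify.com subdomains, the input has to be reduced to a hostname at some point. The sketch below shows one plausible way to do that with the standard URL class; normalizeDomain is a hypothetical helper and may differ from what the actor does internally.

```javascript
// Sketch only: reduce a user-supplied store reference to a bare hostname.
// `normalizeDomain` is a hypothetical helper.
function normalizeDomain(input) {
  const trimmed = String(input).trim();
  // Add a scheme when missing so the URL parser accepts bare hostnames.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  const url = new URL(hasScheme ? trimmed : `https://${trimmed}`);
  return url.hostname; // the URL parser lowercases the hostname
}
```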

Output

The actor returns one dataset item per product.

{
  "store": "gymshark.com",
  "vendor": "Gymshark",
  "title": "Vital Seamless 2.0 Shorts",
  "url": "https://gymshark.com/products/vital-seamless-2-0-shorts",
  "featuredImage": "https://cdn.shopify.com/vital-seamless-shorts.jpg",
  "imageCount": 6,
  "imageAltTexts": ["Vital Seamless 2.0 Shorts - Black", "Back view - Black"],
  "currency": "GBP",
  "priceMin": 45,
  "priceMax": 45,
  "compareAtPrice": null,
  "onSale": false,
  "available": true,
  "fullyOutOfStock": false,
  "requiresShipping": true,
  "weightAndUnit": { "grams": 180 },
  "variantCount": 8,
  "options": ["Size"],
  "productType": "Shorts",
  "tags": ["bottoms", "seamless", "training"],
  "description": "Crafted with seamless construction for a second-skin feel.",
  "publishedAt": "2023-06-14T09:00:00Z",
  "updatedAt": "2024-01-10T11:30:00Z"
}

API examples

Trigger a run with curl:

curl -X POST "https://api.apify.com/v2/acts/trovevault~shopify-products-scraper/runs" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"domains":["gymshark.com"],"maxProducts":500}'

Run with the JavaScript client and read the dataset:

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('trovevault~shopify-products-scraper').call({
  domains: ['gymshark.com'],
  maxProducts: 500,
});

// Read the product rows from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items.slice(0, 3));
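
For large catalogues it can be more comfortable to read the dataset in pages rather than in one call. The sketch below uses the offset and limit options of listItems; the helper readAllItems and the page size of 1000 are assumptions for illustration.

```javascript
// Sketch only: read a dataset page by page.
// `readAllItems` is a hypothetical helper; `datasetClient` is the object
// returned by client.dataset(run.defaultDatasetId).
async function readAllItems(datasetClient, pageSize = 1000) {
  const all = [];
  for (let offset = 0; ; offset += pageSize) {
    const { items } = await datasetClient.listItems({ offset, limit: pageSize });
    all.push(...items);
    if (items.length < pageSize) break; // short page: nothing left to read
  }
  return all;
}
```

With the client from the example above, this would be called as readAllItems(client.dataset(run.defaultDatasetId)).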

Troubleshooting and support

HTTP 403 or IP blocked: Enable Apify Proxy with the Residential group in proxyConfiguration. Large brands sometimes block datacenter IP ranges.
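
An input with residential proxy enabled might look like the sketch below. It assumes the actor accepts the standard Apify proxy input shape, with useApifyProxy and apifyProxyGroups; check the actor's input form if the run rejects it.

```javascript
// Sketch only: actor input with the Residential proxy group enabled.
// Assumes the standard Apify proxy input fields.
const inputWithResidentialProxy = {
  domains: ['gymshark.com'],
  maxProducts: 500,
  proxyConfiguration: {
    useApifyProxy: true,
    apifyProxyGroups: ['RESIDENTIAL'],
  },
};
```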

HTTP 404 on /products.json: The actor tries three fallbacks: scanning the homepage for a .myshopify.com domain, trying the parent domain, and guessing the Shopify subdomain from the brand label. If all fail, the site may not be Shopify or may have disabled the endpoint.
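
The three fallbacks can be pictured roughly as follows. This is a naive sketch of the idea, not the actor's code: findMyshopifyDomain and fallbackCandidates are hypothetical names, and the parent-domain and brand-label logic ignores multi-part suffixes such as .co.uk.

```javascript
// Sketch only: build a list of alternative domains to try after a 404.
// Both helpers are hypothetical and deliberately simplistic.

// Fallback 1: look for a *.myshopify.com hostname in the homepage HTML.
function findMyshopifyDomain(html) {
  const match = /[a-z0-9][a-z0-9-]*\.myshopify\.com/i.exec(html);
  return match ? match[0].toLowerCase() : null;
}

function fallbackCandidates(domain, homepageHtml = '') {
  const labels = domain.toLowerCase().split('.');
  const candidates = [];
  const fromHomepage = findMyshopifyDomain(homepageHtml);
  if (fromHomepage) candidates.push(fromHomepage);
  // Fallback 2: parent domain, e.g. shop.example.com -> example.com.
  if (labels.length > 2) candidates.push(labels.slice(1).join('.'));
  // Fallback 3: guess <brand>.myshopify.com from the second-to-last label.
  if (labels.length >= 2) candidates.push(`${labels[labels.length - 2]}.myshopify.com`);
  // Remove duplicates and the domain that already failed.
  return [...new Set(candidates)].filter((candidate) => candidate !== domain.toLowerCase());
}
```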

No rows or fewer products than expected: Shopify returns only the public catalogue exposed by the storefront. Draft, hidden, password-protected, wholesale-only, and feed-excluded products will not appear.

Currency is null: The actor reads currency from /cart.js. If a store blocks or customizes that endpoint, rows are still returned but price fields have no confirmed currency.
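
The currency lookup described here can be sketched like this. The helper readStoreCurrency is hypothetical; it assumes /cart.js returns JSON with a currency field and falls back to null on any failure, mirroring the behaviour described above.

```javascript
// Sketch only: read the store currency from /cart.js, or return null.
// `readStoreCurrency` and `fetchImpl` are hypothetical names.
async function readStoreCurrency(domain, fetchImpl = fetch) {
  try {
    const response = await fetchImpl(`https://${domain}/cart.js`);
    if (!response.ok) return null; // endpoint blocked or disabled
    const cart = await response.json();
    return typeof cart.currency === 'string' ? cart.currency : null;
  } catch {
    return null; // network error or non-JSON (customized) response
  }
}
```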

Network errors or timeouts: Retry with a lower maxProducts value. If the same domain keeps failing, enable Residential proxy and check the run log.

Need help: Open the actor's Issues tab with the run ID, input domain, and log error.

FAQ

Does it work on every Shopify store? It works on stores that expose Shopify's public product feed. Some headless, password-protected, wholesale, or customized stores restrict it.

Does it return variant-level rows? No. It returns one row per product. Variant data is summarized with variantCount, options, prices, availability, shipping, and weight.

Can I scrape several stores at once? Yes. Add multiple domains to domains; each row includes store.

Can I append output to an existing dataset? Yes. Pass datasetId to append rows to an existing dataset. Pass runId to link rows to a parent pipeline run.

Can I use this through an AI assistant? Yes. Use the Apify API, JavaScript client, or Apify MCP server from MCP-compatible assistants.

Is scraping Shopify product data legal? The actor requests public storefront endpoints. Accessing public data is generally lawful, but review the target site's terms and your own use case.

Limitations

  • One row per product, not per variant or SKU.
  • Only public Shopify catalogue data is returned.
  • Some headless, password-protected, wholesale, or restricted stores may not expose /products.json.
  • compareAtPrice is only present when the merchant sets compare-at pricing.
  • Currency may be null when /cart.js is unavailable.
  • The actor does not crawl non-Shopify platforms such as WooCommerce, Magento, BigCommerce, Amazon, or custom storefronts.

Changelog

v0.1

  • Full catalogue scraping from Shopify /products.json
  • Automatic pagination and multi-store input
  • Domain fallback for custom, regional, and .myshopify.com domains
  • Structured product output with prices, stock flags, images, variants, tags, and timestamps