Shopify Products Scraper avatar

Shopify Products Scraper

Pricing

$1.00 / 1,000 product returneds

Go to Apify Store
Shopify Products Scraper

Shopify Products Scraper

Scrape the full product catalog of any Shopify store via the public /products.json endpoint. One clean row per product: title, URL, vendor, type, tags, description, price range, stock, variants and images. No key, no login, no anti-bot.

Pricing

$1.00 / 1,000 product returneds

Rating

5.0

(1)

Developer

Dami's Studio

Dami's Studio

Maintained by Community

Actor stats

0

Bookmarked

3

Total users

1

Monthly active users

3 days ago

Last modified

Share

Scrape the entire product catalog of any Shopify store — no API key, no login, no anti-bot. Every Shopify storefront exposes a public /products.json endpoint; this actor paginates it and returns one clean, flat row per product.

Point it at a competitor, a brand you resell, or your own store, and get structured data you can drop straight into a spreadsheet or feed.

What you get per product

store, productId, title, handle, url (canonical /products/{handle} link), vendor, productType, tags[], description (HTML stripped to plain text), createdAt, publishedAt, priceMin, priceMax, available (any variant in stock), variantCount, variants[] (title, sku, price, compareAtPrice, available), and images[] (URLs).

Input

FieldNotes
storeUrlsOne or more stores. Bare domain (shop.com), full URL, or any store page (/collections/x) — each is normalized to the origin.
maxItemsMax products per store (default 1000). Paged 250 at a time until the cap or the catalog ends.
proxyConfigurationOptional, off by default. The endpoint is public with no anti-bot, so no proxy is needed. Only enable one if you hit IP rate limits on very large catalogs.

Output

One dataset row per product, deduped by productId within each store. Each charged product is ok: true.

If a URL is not a Shopify store (no /products.json, returns HTML or 404), the actor emits a single clean diagnostic row (errorCode: "NOT_SHOPIFY") for that store and moves on — it never crashes, and it never charges for it. Empty catalogs return NO_RESULTS; bad input returns BAD_INPUT.

Example

{ "storeUrls": ["https://www.gymshark.com", "https://www.allbirds.com"], "maxItems": 500 }

Notes

  • Shopify caps /products.json at 250 products per page; the actor handles pagination automatically.
  • Only published products appear in /products.json — unpublished/draft products are not exposed by Shopify.
  • inventory_quantity is intentionally not in the public feed; per-variant stock is reported as the boolean available.
  • Some stores password-protect or disable the JSON endpoint; those come back as NOT_SHOPIFY rather than failing the run.
  • Most fields mirror Shopify's feed directly, so a few can be null when the store omits them: title, handle, url, vendor, productType, and priceMin/priceMax (null when a product has no priced variants). tags, variants, and images are always arrays (possibly empty).
  • Pricing: pay-per-result — you are charged once per product row (ok: true). Diagnostic rows (NOT_SHOPIFY, NO_RESULTS, BAD_INPUT, network/block errors) are ok: false and are never charged.