Shopify Store Scraper Goat

Scrape products from any Shopify storefront without a login or API key. Pull an entire store catalog, a single collection, or one product. Walks pagination up to your chosen limit and returns clean, normalized product data with prices, variants, images, and tags.

Pricing: Pay per usage

Rating: 5.0 (1)

Developer: Goutam Soni (maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user, last modified 3 days ago

Shopify Store Scraper

Extract products from any Shopify store without a login or API key. Pull an entire store catalog, every product in a single collection, or one specific product, and get clean, normalized product data with prices, variants, images, options, and tags. The scraper walks pagination automatically up to the limit you set, so you can collect hundreds or thousands of products from a store in a single run.

What it does

  • Full store catalog - every product in a Shopify store, with prices, variants, images, options, and tags.
  • Collection scrape - all products inside a single Shopify collection.
  • Single product - one product by its page URL or handle.
  • Automatic pagination - store and collection modes walk multiple pages until your maxProducts is reached or the catalog runs out.
  • Normalized output - one stable, importance-ordered shape per product, so every field is always present and easy to map.
  • No login, no API key - point it at a public storefront URL and run.
  • Bulk and concurrent - scrape many stores or collections in parallel in one run.
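
The automatic pagination described in the list above can be pictured roughly as follows. This is a minimal sketch and not the actor's actual implementation: `fetch_page` is a hypothetical callable standing in for whatever request the scraper makes, and the three-products-per-page stand-in store exists only for illustration.

```python
from typing import Callable, Dict, List

Product = Dict[str, object]


def collect_products(
    fetch_page: Callable[[int], List[Product]],
    max_products: int,
) -> List[Product]:
    """Walk pages until max_products is reached or a page comes back empty."""
    collected: List[Product] = []
    page = 1
    while len(collected) < max_products:
        batch = fetch_page(page)  # hypothetical: returns [] past the last page
        if not batch:
            break  # catalog exhausted
        remaining = max_products - len(collected)
        collected.extend(batch[:remaining])  # never exceed the cap
        page += 1
    return collected


# Stand-in for a store with 7 products served 3 per page (illustrative only).
_catalog: List[Product] = [{"id": str(i)} for i in range(7)]


def fake_fetch(page: int) -> List[Product]:
    start = (page - 1) * 3
    return _catalog[start:start + 3]


capped = collect_products(fake_fetch, 5)     # stops at the cap
everything = collect_products(fake_fetch, 100)  # stops when the catalog runs out
```

The loop has two exits, matching the two stopping conditions named above: the cap is hit, or the store has nothing more to return.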

Use cases

  • Competitor price monitoring - track a market's pricing and assortment on a schedule and spot changes over time.
  • Product research and catalog feeds - build a clean product dataset for search, comparison, or analytics.
  • New-arrival and restock alerts - watch a store for fresh products, restocks, and price drops.
  • Lead generation - collect vendor names and product ranges across a niche for outreach.
  • Catalog audits - find products with missing images, empty descriptions, or out-of-stock variants.
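
For the monitoring use cases above, one possible approach is to compare the datasets from two scheduled runs. The sketch below assumes items shaped like the Output section of this README and joins them on `storeDomain` plus `handle`; the function names are illustrative and not part of the actor.

```python
from typing import Dict, Iterable, List, Tuple

Product = Dict[str, object]
Key = Tuple[str, str]


def index_by_key(items: Iterable[Product]) -> Dict[Key, Product]:
    """Key each product by (storeDomain, handle) so two runs can be joined."""
    return {(str(p["storeDomain"]), str(p["handle"])): p for p in items}


def diff_snapshots(old: Iterable[Product], new: Iterable[Product]) -> Dict[str, List]:
    """Report new arrivals, removed products, and priceMin changes between runs."""
    old_idx, new_idx = index_by_key(old), index_by_key(new)
    new_arrivals = sorted(k for k in new_idx if k not in old_idx)
    removed = sorted(k for k in old_idx if k not in new_idx)
    price_changes = []
    for key in sorted(old_idx.keys() & new_idx.keys()):
        before = old_idx[key]["priceMin"]
        after = new_idx[key]["priceMin"]
        # Skip products where either run has no published price (null).
        if before is not None and after is not None and before != after:
            price_changes.append((key, before, after))
    return {
        "new_arrivals": new_arrivals,
        "removed": removed,
        "price_changes": price_changes,
    }


yesterday = [
    {"storeDomain": "example.com", "handle": "a", "priceMin": 37.0},
    {"storeDomain": "example.com", "handle": "b", "priceMin": 10.0},
]
today = [
    {"storeDomain": "example.com", "handle": "a", "priceMin": 30.0},
    {"storeDomain": "example.com", "handle": "c", "priceMin": 5.0},
]
report = diff_snapshots(yesterday, today)
```

Comparing `priceMin` only is a simplification; a stricter monitor might compare per-variant prices or `updatedAt` as well.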

Input

| Field | Type | Description |
| --- | --- | --- |
| storeUrls | array | Store domains or URLs to pull the full product catalog from. Accepts a bare domain or a full URL, with or without https. Example: https://example.com, example.com. |
| collectionUrls | array | Collection page URLs to pull the products of a single collection from. Provide the full collection URL. Example: https://example.com/collections/example-collection. |
| productUrls | array | Product page URLs to fetch one product each. Example: https://example.com/products/example-product. |
| maxProducts | integer | Cap on products returned per store or collection. Default 1000. Pagination is walked until this is reached or the catalog is exhausted. |
| concurrency | integer | How many sources to process in parallel. Default 5. |
| proxyConfig | object | Apify proxy configuration. RESIDENTIAL is the default and recommended option for the most reliable results. |

At least one of storeUrls, collectionUrls, or productUrls is required.
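
If you build the input programmatically, it can help to check it before starting a run. The sketch below is an assumption about sensible client-side checks, not the actor's own validation: it enforces the "at least one source" rule and shows one way a bare domain could be normalized to a full URL (forcing https is a choice made here for illustration).

```python
from typing import Dict
from urllib.parse import urlsplit

SOURCE_FIELDS = ("storeUrls", "collectionUrls", "productUrls")


def normalize_store_url(value: str) -> str:
    """Turn 'example.com' or 'https://Example.com/' into 'https://example.com'."""
    value = value.strip()
    if "://" not in value:
        value = "https://" + value  # bare domain: add a scheme so it parses
    host = urlsplit(value).netloc.lower()
    if not host:
        raise ValueError(f"not a valid store URL: {value!r}")
    return f"https://{host}"


def validate_input(run_input: Dict[str, object]) -> None:
    """Raise ValueError unless at least one source list is non-empty."""
    if not any(run_input.get(field) for field in SOURCE_FIELDS):
        raise ValueError("provide at least one of " + ", ".join(SOURCE_FIELDS))


validate_input({"storeUrls": ["example.com"], "maxProducts": 2000})
normalized = [normalize_store_url(u) for u in ["https://example.com", "another-example.com"]]
```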

Example input

{
  "storeUrls": ["https://example.com", "another-example.com"],
  "collectionUrls": ["https://example.com/collections/example-collection"],
  "maxProducts": 2000,
  "concurrency": 3,
  "proxyConfig": { "useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"] }
}
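
An input like the one above could also be submitted from Python. The following is a sketch that assumes the `apify-client` package and its `ApifyClient(...).actor(...).call(run_input=...)` and `dataset(...).iterate_items()` calls as documented in its quick start. `ACTOR_ID` is a placeholder to replace with the ID shown on the actor's Store page, and `APIFY_TOKEN` is an assumed environment variable name.

```python
import os
from typing import Dict, Iterator, List

# Placeholder: copy the real actor ID from the actor's Store page.
ACTOR_ID = "<username>/<actor-name>"


def build_run_input(store_urls: List[str], max_products: int = 1000) -> Dict[str, object]:
    """Assemble an input object using the fields from the Input table."""
    return {
        "storeUrls": store_urls,
        "maxProducts": max_products,
        "concurrency": 5,
        "proxyConfig": {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]},
    }


def run_scraper(run_input: Dict[str, object]) -> Iterator[dict]:
    """Start a run, wait for it to finish, and yield its dataset items."""
    # Imported lazily so build_run_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed env var name
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the run did not return a result")
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()


run_input = build_run_input(["example.com"], max_products=200)
# for product in run_scraper(run_input):
#     print(product["title"], product["priceMin"])
```

The final loop is commented out because it needs a valid token and a real actor ID.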

Output

Each dataset item is one product with a clean, importance-ordered shape: identity first, then metrics, then content, then media, then metadata.

{
  "storeDomain": "example.com",
  "id": "7222392750160",
  "handle": "example-product",
  "url": "https://example.com/products/example-product",
  "title": "Example Product",
  "vendor": "Acme Co",
  "productType": "Shoes",
  "priceMin": 37.0,
  "priceMax": 75.0,
  "compareAtPriceMax": 95.0,
  "available": true,
  "variantsCount": 12,
  "imagesCount": 5,
  "optionsCount": 1,
  "description": "Plain-text product description with HTML stripped out.",
  "tags": ["sale", "new-arrival"],
  "options": [{ "name": "Size", "values": ["S", "M", "L"] }],
  "variants": [
    {
      "id": "41360177758288",
      "title": "S",
      "sku": "EX-001",
      "price": 37.0,
      "compareAtPrice": 75.0,
      "available": false,
      "option1": "S",
      "option2": null,
      "option3": null,
      "grams": 433,
      "requiresShipping": true,
      "taxable": true
    }
  ],
  "featuredImage": "https://example.com/cdn/example.jpg",
  "images": ["https://example.com/cdn/example.jpg"],
  "createdAt": "2025-09-24T23:41:55.000Z",
  "updatedAt": "2026-06-16T19:07:22.000Z",
  "publishedAt": "2026-06-16T18:33:09.000Z"
}

Key fields

  • url, handle, id - stable identifiers for joining or deduping.
  • priceMin / priceMax - the price range across all variants. compareAtPriceMax is the highest compare-at price; it is non-null only for products that are on sale.
  • available - true when at least one variant is in stock.
  • variants - per-variant SKU, price, stock, and option values.
  • updatedAt / publishedAt - useful for change detection and new-arrival monitoring.

Every field is always present. Values that the store does not publish for a given product are returned as null.
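
To make the relationship between the product-level fields and the variants concrete, here is a sketch of how such summary values could be derived from a `variants` array. It illustrates the field definitions above and is not the actor's code. In particular, treating `compareAtPriceMax` as the maximum of the non-null variant `compareAtPrice` values is an assumption.

```python
from typing import Dict, List, Optional


def summarize_variants(variants: List[Dict[str, object]]) -> Dict[str, object]:
    """Derive product-level price and stock fields from per-variant data."""
    prices = [v["price"] for v in variants if v.get("price") is not None]
    compare = [v["compareAtPrice"] for v in variants if v.get("compareAtPrice") is not None]
    price_min: Optional[float] = min(prices) if prices else None
    price_max: Optional[float] = max(prices) if prices else None
    return {
        "priceMin": price_min,
        "priceMax": price_max,
        # Assumption: null when no variant publishes a compare-at price.
        "compareAtPriceMax": max(compare) if compare else None,
        # True when at least one variant is in stock.
        "available": any(v.get("available") for v in variants),
        "variantsCount": len(variants),
    }


summary = summarize_variants([
    {"price": 37.0, "compareAtPrice": 75.0, "available": False},
    {"price": 75.0, "compareAtPrice": 95.0, "available": True},
])
```

With these two illustrative variants, the summary has a price range of 37.0 to 75.0 and `available` is true because the second variant is in stock.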

FAQ

Is this scraper free? How is it priced? You pay per product returned, plus standard Apify platform usage. There is no per-run start fee. Check the pricing tab on the actor's Store page for the current rate.

Do I need a login, password, or API key? No. The scraper reads public storefront data only. Just provide a store, collection, or product URL.

How many products can I scrape per store? Set maxProducts to whatever you need. The scraper walks pagination across multiple pages automatically, so you can pull the full catalog (hundreds or thousands of products) in one run, bounded only by what the store exposes publicly.

How fast is it? Most stores return a few hundred products per second over a clean connection. Run multiple stores or collections in parallel with the concurrency setting.

Can I scrape many stores at once? Yes. Add multiple entries to storeUrls (and/or collectionUrls and productUrls) and they are processed concurrently in a single run.

Why are some fields null? A field is null only when the store itself does not publish that value for a product (for example, compareAtPriceMax is set only on discounted items, and productType is sometimes left blank). The output shape is always complete so your downstream mapping never breaks.

Notes

  • A run uses the Apify proxy you select. RESIDENTIAL gives the most reliable results.
  • If a source is unavailable, rate limited, or missing, an item is returned for it with a generic status (upstream_unavailable, upstream_rate_limit, or not_found), so a single failure never stops the run.
  • The default input is a health-check sentinel that returns a single confirmation record. Replace storeUrls with real store URLs to scrape.
  • Pagination depth is bounded by what a store exposes for its public catalog.
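
Because failed sources appear as items in the same dataset, downstream code may want to separate them from real products. The sketch below assumes the status value lives in a field named `status`, which is a guess: this README names the status values but not the field, so check a real run's output and adjust `status_field` accordingly. Treating the two upstream statuses as retryable is likewise an assumption.

```python
from typing import Dict, Iterable, List, Tuple

FAILURE_STATUSES = {"upstream_unavailable", "upstream_rate_limit", "not_found"}
# Assumption: these two are transient and worth retrying in a later run.
RETRYABLE_STATUSES = {"upstream_unavailable", "upstream_rate_limit"}

Item = Dict[str, object]


def split_items(
    items: Iterable[Item],
    status_field: str = "status",  # hypothetical field name
) -> Tuple[List[Item], List[Item]]:
    """Separate product records from failure-status records."""
    products: List[Item] = []
    failures: List[Item] = []
    for item in items:
        if item.get(status_field) in FAILURE_STATUSES:
            failures.append(item)
        else:
            products.append(item)
    return products, failures


items = [
    {"handle": "example-product", "priceMin": 37.0},
    {"status": "upstream_rate_limit"},
    {"status": "not_found"},
]
products, failures = split_items(items)
retry = [f for f in failures if f["status"] in RETRYABLE_STATUSES]
```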