Shopify Store Product Scraper

Pricing

Pay per usage

Detect Shopify stores and extract all products with pricing, variants, images, inventory status, and more. Supports bulk scraping of multiple stores. Uses Shopify's public product API — fast and reliable.

Developer

Stephan Corbeil

Maintained by Community

Extract product data from Shopify stores through the public /products.json endpoint. The actor detects whether a domain runs on Shopify and then scrapes its products over plain HTTP, with no browser and no login required.

Features

  • Fast & Lightweight: Uses httpx for direct API access—no browser overhead
  • Shopify Detection: Automatically detects if a domain is a Shopify store
  • Complete Product Data: Extracts titles, descriptions, prices, variants, images, inventory, and more
  • Bulk Scraping: Process multiple stores in a single run
  • Smart Pagination: Handles stores with hundreds or thousands of products
  • Flexible Output: Include/exclude variants and images as needed
  • Error Handling: Gracefully handles blocked endpoints and non-Shopify sites
  • Pricing Extraction: Captures regular prices, compare-at prices, and price ranges

Use Cases

  • Competitor Price Monitoring: Track pricing changes across competing Shopify stores
  • Market Research: Analyze product catalogs, categories, and vendor distribution
  • Product Catalog Analysis: Map product offerings, variants, and inventory
  • Drop-shipping Research: Find suppliers and analyze product availability
  • SEO & Content Analysis: Extract product descriptions and metadata
  • Inventory Tracking: Monitor product availability across multiple stores

How to Use

Input Parameters

Parameter       | Type    | Description                                                 | Default
storeUrl        | string  | Single Shopify store URL (e.g., https://www.allbirds.com)   | -
storeUrls       | array   | Multiple store URLs for bulk scraping                       | -
includeVariants | boolean | Include product variants (sizes, colors, etc.)              | true
includeImages   | boolean | Include product image URLs                                  | true
maxProducts     | integer | Maximum products per store (0 = all)                        | 0

Example Input

{
  "storeUrl": "https://www.allbirds.com",
  "includeVariants": true,
  "includeImages": true,
  "maxProducts": 0
}

Or for bulk scraping:

{
  "storeUrls": [
    "https://www.allbirds.com",
    "https://beardbrand.com",
    "https://shop.gymshark.com"
  ],
  "includeVariants": true,
  "includeImages": true
}
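
Since both storeUrl and storeUrls are accepted, a run has to merge them into one list of stores at some point. The sketch below shows one way that could look; the helper name normalize_store_urls is hypothetical, and the handling of bare domains (defaulting to https) is an assumption rather than documented actor behavior.

```python
from urllib.parse import urlsplit


def normalize_store_urls(actor_input: dict) -> list[str]:
    """Merge storeUrl and storeUrls into one de-duplicated list of base URLs."""
    candidates = []
    if actor_input.get("storeUrl"):
        candidates.append(actor_input["storeUrl"])
    candidates.extend(actor_input.get("storeUrls") or [])

    seen = set()
    result = []
    for raw in candidates:
        raw = raw.strip()
        if not raw:
            continue
        # Assumption: bare domains are allowed and default to https.
        if "://" not in raw:
            raw = "https://" + raw
        parts = urlsplit(raw)
        # Keep only scheme and host so trailing paths do not create duplicates.
        base = f"{parts.scheme}://{parts.netloc.lower()}"
        if base not in seen:
            seen.add(base)
            result.append(base)
    return result
```

Reducing each entry to scheme and host keeps the later /products.json requests uniform regardless of how the URL was typed.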

Output Example

Each product is saved as a dataset item:

{
  "id": 7234567890,
  "title": "Classic Runner",
  "handle": "classic-runner",
  "description": "Lightweight and sustainable running shoes made from sugarcane-based foam.",
  "vendor": "Allbirds",
  "product_type": "Shoes",
  "tags": ["womens", "running", "eco-friendly"],
  "created_at": "2023-01-15T10:30:00Z",
  "updated_at": "2024-03-20T14:22:00Z",
  "published_at": "2023-01-15T10:30:00Z",
  "price_range": {
    "min": 79.95,
    "max": 89.95,
    "currency": "USD"
  },
  "compare_at_price_range": {
    "min": 95.00,
    "max": 105.00
  },
  "variants": [
    {
      "id": 112233445566,
      "title": "Women's Size 5",
      "sku": "CLSRUN-W5",
      "price": 79.95,
      "compare_at_price": 95.00,
      "available": true,
      "inventory_quantity": 42
    },
    {
      "id": 112233445567,
      "title": "Women's Size 6",
      "sku": "CLSRUN-W6",
      "price": 79.95,
      "compare_at_price": 95.00,
      "available": true,
      "inventory_quantity": 38
    }
  ],
  "images": [
    {
      "src": "https://cdn.shopify.com/s/files/1/0001/2345/products/runner-1.jpg",
      "alt": "Classic Runner - Front View"
    },
    {
      "src": "https://cdn.shopify.com/s/files/1/0001/2345/products/runner-2.jpg",
      "alt": "Classic Runner - Side View"
    }
  ],
  "store_url": "https://www.allbirds.com"
}
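
For price monitoring, a dataset item like the one above can be reduced to a few figures. The following is a minimal sketch, assuming the field names shown in the output example; summarize_item is a hypothetical helper, and it tolerates missing keys because variants may be absent when includeVariants is false and some stores hide fields.

```python
def summarize_item(item: dict) -> dict:
    """Return discount and stock figures for one dataset item."""
    price = (item.get("price_range") or {}).get("min")
    compare = (item.get("compare_at_price_range") or {}).get("min")

    discount = None
    if price is not None and compare:
        # Percentage off the lowest compare-at price, rounded to one decimal.
        discount = round((compare - price) / compare * 100, 1)

    variants = item.get("variants") or []
    # Only count stock for variants flagged as available.
    in_stock = sum(
        v.get("inventory_quantity") or 0 for v in variants if v.get("available")
    )
    return {
        "handle": item.get("handle"),
        "discount_percent": discount,
        "units_in_stock": in_stock,
    }


# Trimmed copy of the example item above.
sample_item = {
    "handle": "classic-runner",
    "price_range": {"min": 79.95, "max": 89.95, "currency": "USD"},
    "compare_at_price_range": {"min": 95.00, "max": 105.00},
    "variants": [
        {"available": True, "inventory_quantity": 42},
        {"available": True, "inventory_quantity": 38},
    ],
}
summary = summarize_item(sample_item)
```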

How It Works

  1. Shopify Detection: The actor checks for Shopify headers (x-shopify-stage, powered-by) and attempts to fetch /products.json
  2. Product Fetching: Uses the public API endpoint /products.json?limit=250&page={n} with pagination
  3. Data Extraction: Processes each product to extract pricing, variants, images, and metadata
  4. Smart Fallbacks: If /products.json is blocked, tries /collections/all/products.json
  5. Rate Limiting: Waits 200 ms between requests to avoid overwhelming store servers
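
The pagination, fallback, and delay steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the actor's actual source: iter_products and make_httpx_fetcher are hypothetical names, the stop condition (a page shorter than limit) is an assumption, and the fetcher is passed in as a callable so the loop can be exercised without network access.

```python
import time
from typing import Callable, Iterator


def iter_products(
    fetch_page: Callable[[int, int], list],
    max_products: int = 0,
    limit: int = 250,
    delay_s: float = 0.2,
) -> Iterator[dict]:
    """Yield products page by page until a page comes back short or empty."""
    yielded = 0
    page = 1
    while True:
        products = fetch_page(page, limit)
        for product in products:
            yield product
            yielded += 1
            # max_products == 0 means "no cap", matching the input table.
            if max_products and yielded >= max_products:
                return
        if len(products) < limit:
            return  # a short page is taken to be the last one
        page += 1
        time.sleep(delay_s)  # pause between requests


def make_httpx_fetcher(store_url: str) -> Callable[[int, int], list]:
    """Build a fetch_page callable that tries /products.json, then the fallback."""
    import httpx  # imported lazily so iter_products stays testable offline

    client = httpx.Client(base_url=store_url, follow_redirects=True, timeout=30.0)
    paths = ["/products.json", "/collections/all/products.json"]

    def fetch_page(page: int, limit: int) -> list:
        for path in paths:
            resp = client.get(path, params={"limit": limit, "page": page})
            if resp.status_code != 200:
                continue
            try:
                return resp.json().get("products", [])
            except ValueError:
                continue  # e.g. an HTML login page instead of JSON
        return []

    return fetch_page
```

With a real store, usage would look like iter_products(make_httpx_fetcher("https://www.allbirds.com"), max_products=100).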

Pricing

Charged on the pay-per-event (PPE) model:

  • Actor Start: $0.01 per run
  • Product Scraped: $0.002 per product

Example Cost Calculation

Scraping 1,000 products from a Shopify store:

  • Actor start: $0.01
  • Products: 1,000 × $0.002 = $2.00
  • Total: $2.01
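
The same arithmetic as a small helper, useful for budgeting bulk runs. The rates are the ones listed above; the function name estimate_cost is hypothetical, and Decimal is used so that fractions of a cent do not pick up floating-point error.

```python
from decimal import Decimal

ACTOR_START_FEE = Decimal("0.01")   # per run
PER_PRODUCT_FEE = Decimal("0.002")  # per product scraped


def estimate_cost(products: int, runs: int = 1) -> Decimal:
    """Estimated charge in USD for scraping `products` items over `runs` runs."""
    return runs * ACTOR_START_FEE + products * PER_PRODUCT_FEE


total = estimate_cost(1000)  # the 1,000-product example above
```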

Limitations

  • Works only with Shopify stores using the public product API
  • Some Shopify stores block the /products.json endpoint, typically by redirecting the request to a login or password page
  • Product data is limited to what the public API exposes (some stores hide certain fields)
  • Private/password-protected stores are not supported
  • Image URLs are CDN-hosted and may change over time

Supported Shopify Store Types

  • Standard Shopify Plus stores
  • Shopify Basic, Professional, Advanced plans
  • Custom Shopify implementations
  • Stores using custom domains

Troubleshooting

"Not a Shopify store" error

  • Verify the domain is actually a Shopify store (check the response headers for powered-by: Shopify, or the page source for cdn.shopify.com references)
  • Ensure the URL is correct and accessible
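
To check a domain by hand, the header part of the detection described under How It Works can be approximated as below. This is a sketch only: looks_like_shopify is a hypothetical helper, it covers just the two headers named in this document, and the actor may use additional signals.

```python
def looks_like_shopify(headers: dict) -> bool:
    """Return True if response headers carry one of the Shopify markers."""
    # Header names are case-insensitive, so compare in lowercase.
    lowered = {key.lower(): value for key, value in headers.items()}
    if "x-shopify-stage" in lowered:
        return True
    return "shopify" in lowered.get("powered-by", "").lower()
```

The function accepts any mapping of header names to values, for example the headers of a response fetched with an HTTP client of your choice.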

No products found

  • The store may block public API access
  • Try accessing /products.json directly in your browser
  • Some stores require authentication for product data

Timeout errors

  • The store may be slow to respond
  • Try reducing maxProducts to fetch fewer items
  • Retry the run

API Reference

Shopify Public Product API

Endpoint: GET https://{domain}/products.json

Query parameters:

  • limit: Number of products per page (max 250)
  • page: Page number (starts at 1)
  • fields: Specific fields to return (optional)

Response: JSON object with products array
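
A request URL with these parameters could be assembled as follows. This is a minimal sketch: build_products_url is a hypothetical helper, and passing fields as a comma-separated list is an assumption, since the format is not spelled out above.

```python
from urllib.parse import urlencode

MAX_LIMIT = 250  # documented per-page maximum


def build_products_url(domain: str, page: int = 1, limit: int = MAX_LIMIT, fields=None) -> str:
    """Build a /products.json URL, clamping limit and page to valid ranges."""
    params = {
        "limit": min(max(limit, 1), MAX_LIMIT),
        "page": max(page, 1),  # pages start at 1
    }
    if fields:
        # Assumption: fields are sent as one comma-separated value.
        params["fields"] = ",".join(fields)
    return f"https://{domain}/products.json?{urlencode(params)}"
```

Clamping limit to 250 avoids asking for more than the endpoint is documented to return per page.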

Data Quality

All product data comes directly from Shopify's public API—no parsing or inference. Data accuracy depends on how shop owners maintain their product information.

Support & Questions

For issues, questions, or feature requests, contact the NexGenData team.


Built with httpx. No browser required. Lightning-fast Shopify scraping.