Shopify Product Scraper - Products, Collections & Entire Stores

Pricing

$10.00/month + usage

An advanced Shopify data extraction tool built for professionals. Simply enter any store, collection, or product URL — the scraper automatically detects Shopify stores, fetches structured product data via the Shopify JSON API, and handles pagination for large catalogs.

Rating: 5.0 (1)
Developer: Novus
Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 4 days ago

Shopify Scraper

A professional-grade Shopify store scraper that extracts comprehensive product data from any Shopify-based e-commerce site. Supports full store crawling, collection scraping, individual product extraction, and search functionality.

Why Use This Scraper?

  • Market Research: Analyze competitor pricing, product descriptions, and variants
  • Trend Monitoring: Track new product launches and stock status changes
  • Data Aggregation: Build comprehensive catalogs from multiple Shopify stores
  • Marketing Insights: Understand how brands structure their product metadata and categories

Key Features

🚀 Store-Wide Crawling — Automatically discovers and extracts all products from an entire store

🎯 Precision Targeting — Scrape specific collections or individual product URLs

🔍 Search Support — Search for products within a store using keywords

💱 Price Normalization — Prices stored as integers (cents) to avoid floating-point errors

📦 Comprehensive Data — Extracts titles, descriptions, variants, images, options, SKUs, barcodes, and stock status

🛡️ Anti-Bot Resilience — Built-in rate limiting, retry logic, and proxy rotation

🔄 Auto-Detection — Automatically verifies if a site is Shopify-powered

Fast & Reliable — Optimized extraction with automatic fallback mechanisms
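To illustrate the Price Normalization feature, here is a minimal sketch of why integer cents are safer than floats. The helper name `to_cents` is hypothetical and is not part of the scraper; it assumes the source price arrives as a decimal string in a two-decimal currency.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_cents(price: str) -> int:
    """Convert a decimal price string such as "29.95" to integer cents.

    Hypothetical helper for illustration; assumes a two-decimal currency.
    """
    amount = Decimal(price).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return int(amount * 100)


# Float arithmetic drifts; integer cents do not.
print(0.1 + 0.2 == 0.3)                     # False
print(to_cents("0.10") + to_cents("0.20"))  # 30
print(to_cents("110.00"))                   # 11000
```

Parsing through `Decimal` rather than `float` keeps the conversion exact, which is the property the integer format is meant to preserve.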

How It Works

  1. Automatic Shopify Detection — The scraper automatically verifies each URL is a Shopify store
  2. Smart URL Classification — Detects if URL is a store root, collection, product, or search page
  3. Data Extraction — Extracts comprehensive product data with automatic pagination
  4. Deduplication — Automatically removes duplicate products across collections
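Steps 3 and 4 above can be sketched as a single loop that walks pages until one comes back empty and skips product IDs it has already seen. This is an illustration only, not the scraper's actual code: `iter_unique_products` and `fetch_page` are hypothetical names, and the "empty page means end of catalog" rule is an assumption.

```python
from typing import Callable, Iterator


def iter_unique_products(
    fetch_page: Callable[[int], list],
    max_products: int = 0,
) -> Iterator[dict]:
    """Yield products page by page, skipping IDs that were already seen.

    fetch_page is a hypothetical callable returning the products on a page.
    max_products=0 means unlimited, mirroring the maxProducts input.
    """
    seen = set()
    emitted = 0
    page = 1
    while True:
        batch = fetch_page(page)
        if not batch:  # assumption: an empty page marks the end of the catalog
            return
        for product in batch:
            if product["id"] in seen:
                continue  # duplicate, e.g. the same product in two collections
            seen.add(product["id"])
            yield product
            emitted += 1
            if max_products and emitted >= max_products:
                return
        page += 1


# Stand-in for a network call: two pages that share one product.
fake_pages = {1: [{"id": 1}, {"id": 2}], 2: [{"id": 2}, {"id": 3}]}
unique = list(iter_unique_products(lambda n: fake_pages.get(n, [])))
print([p["id"] for p in unique])  # [1, 2, 3]
```

Passing the page fetcher in as a callable keeps the loop testable without network access.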

Input Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| startUrls | Array | Yes | - | List of Shopify URLs (store, collection, or product) |
| maxProducts | Integer | No | 0 | Max products to scrape (0 = unlimited) |
| searchQuery | String | No | - | Search term for finding specific products |
| includeOutOfStock | Boolean | No | true | Include out-of-stock products |
| currency | String | No | Auto | Currency code (USD, EUR, GBP, etc.) |
| proxy | Object | No | Apify Proxy | Proxy configuration |
| useHeadlessFallback | Boolean | No | true | Enable fallback extraction method |
| requestTimeout | Integer | No | 30000 | Request timeout in milliseconds |
| retryCount | Integer | No | 3 | Number of retry attempts |
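As a rough sketch of how these parameters fit together, the helper below assembles an input object from a list of URLs plus overrides. `build_input` and `DEFAULTS` are hypothetical names, not part of the actor. The defaults are copied from the parameter list above, `currency` and `proxy` are left to the actor's own defaults, and `startUrls` is assumed to be the only mandatory field because every input example includes it.

```python
# Defaults copied from the parameter list; currency and proxy are omitted
# so that the actor's own "Auto" / "Apify Proxy" defaults apply.
DEFAULTS = {
    "maxProducts": 0,
    "includeOutOfStock": True,
    "useHeadlessFallback": True,
    "requestTimeout": 30000,
    "retryCount": 3,
}
OPTIONAL_WITHOUT_DEFAULT = {"searchQuery", "currency", "proxy"}


def build_input(start_urls, **overrides):
    """Return an input dict for the scraper (hypothetical helper)."""
    if not start_urls:
        raise ValueError("startUrls must contain at least one URL")
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL_WITHOUT_DEFAULT
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    run_input = {"startUrls": [{"url": url} for url in start_urls]}
    run_input.update(DEFAULTS)
    run_input.update(overrides)
    return run_input


example = build_input(
    ["https://www.allbirds.com/collections/mens-shoes"],
    maxProducts=100,
)
print(example["maxProducts"], example["retryCount"])  # 100 3
```

Rejecting unknown keys early catches typos such as `maxProduct` before a run is started.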

Input Examples

Scrape Entire Store

{
  "startUrls": [
    { "url": "https://www.allbirds.com" }
  ],
  "proxy": {
    "useApifyProxy": true
  }
}

Scrape Specific Collection

{
  "startUrls": [
    { "url": "https://www.allbirds.com/collections/mens-shoes" }
  ],
  "maxProducts": 100
}

Scrape Single Product

{
  "startUrls": [
    { "url": "https://www.allbirds.com/products/mens-wool-runners" }
  ]
}

Search Within Store

{
  "startUrls": [
    { "url": "https://www.allbirds.com" }
  ],
  "searchQuery": "wool runners",
  "maxProducts": 20
}

Multiple URLs

{
  "startUrls": [
    { "url": "https://www.allbirds.com/collections/mens" },
    { "url": "https://www.allbirds.com/collections/womens" }
  ],
  "maxProducts": 50
}

Output Schema

Each product includes:

{
  "source": {
    "id": "7654321098765",
    "handle": "mens-wool-runners",
    "url": "https://www.allbirds.com/products/mens-wool-runners",
    "retailer": "www.allbirds.com",
    "scrapedAt": "2025-12-13T10:30:00Z"
  },
  "title": "Men's Wool Runners",
  "description": "Our original wool shoe...",
  "descriptionHtml": "<p>Our original wool shoe...</p>",
  "vendor": "Allbirds",
  "productType": "Shoes",
  "tags": ["mens", "shoes", "wool"],
  "createdAt": "2025-01-15T00:00:00Z",
  "updatedAt": "2025-12-10T00:00:00Z",
  "publishedAt": "2025-01-15T08:00:00Z",
  "variants": [
    {
      "id": "42345678901234",
      "title": "8 / Natural Grey",
      "sku": "WR-M-NG-8",
      "barcode": "1234567890123",
      "price": 11000,
      "compareAtPrice": null,
      "currency": "USD",
      "available": true,
      "inventoryQuantity": null,
      "requiresShipping": true,
      "weight": 0.5,
      "weightUnit": "kg",
      "option1": "8",
      "option2": "Natural Grey",
      "option3": null
    }
  ],
  "images": [
    {
      "id": "12345678901234",
      "url": "https://cdn.shopify.com/s/files/...",
      "alt": "Men's Wool Runners",
      "width": 1200,
      "height": 1500,
      "position": 1
    }
  ],
  "options": [
    { "name": "Size", "position": 1, "values": ["7", "8", "9", "10", "11", "12"] },
    { "name": "Color", "position": 2, "values": ["Natural Grey", "Black", "Navy"] }
  ]
}
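Because variants are nested inside each product, a common first step is to flatten them into one row per variant, for example before exporting to CSV. The sketch below assumes records shaped like the sample above; `variant_rows` is a hypothetical helper, and the `sample` record is a trimmed copy of the schema with a made-up second variant for illustration.

```python
def variant_rows(product, in_stock_only=False):
    """Flatten one product record into a list of per-variant rows."""
    rows = []
    for variant in product.get("variants", []):
        if in_stock_only and not variant.get("available"):
            continue
        rows.append({
            "product_id": product["source"]["id"],
            "title": product["title"],
            "variant": variant["title"],
            "sku": variant.get("sku"),
            "price_cents": variant["price"],  # integer cents, as in the schema
            "currency": variant.get("currency"),
            "available": variant.get("available"),
        })
    return rows


# Trimmed sample; the second variant is invented for this example.
sample = {
    "source": {"id": "7654321098765"},
    "title": "Men's Wool Runners",
    "variants": [
        {"title": "8 / Natural Grey", "sku": "WR-M-NG-8", "price": 11000,
         "currency": "USD", "available": True},
        {"title": "9 / Natural Grey", "sku": "WR-M-NG-9", "price": 11000,
         "currency": "USD", "available": False},
    ],
}

print(len(variant_rows(sample)))                      # 2
print(len(variant_rows(sample, in_stock_only=True)))  # 1
```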

Extracted Data Fields

| Product Fields | Variant Fields | Media & Options |
| --- | --- | --- |
| Title | SKU | All Images |
| Description (text & HTML) | Barcode | Image Dimensions |
| Vendor/Brand | Price (in cents) | Alt Text |
| Product Type | Compare-at Price | Options (Size, Color) |
| Tags | Availability | Option Values |
| Created/Updated Dates | Inventory Quantity | Positions |
| URL & Handle | Weight & Unit | |
| Retailer | Shipping Required | |

URL Types Supported

| URL Pattern | Type | Example |
| --- | --- | --- |
| domain.com | Full Store | https://www.allbirds.com |
| /collections/{handle} | Collection | https://www.allbirds.com/collections/mens |
| /products/{handle} | Single Product | https://www.allbirds.com/products/wool-runners |
| /search?q={query} | Search Results | https://www.allbirds.com/search?q=wool |
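The four URL patterns can be told apart from the path and query string alone. The following is a minimal sketch of such a classifier, not the scraper's actual logic: `classify_url` is a hypothetical name, and treating a `/collections/{handle}/products/{handle}` path as a product URL is an assumption.

```python
from urllib.parse import parse_qs, urlparse


def classify_url(url):
    """Return "store", "collection", "product", "search", or "unknown"."""
    parsed = urlparse(url)
    segments = [s for s in parsed.path.split("/") if s]
    if not segments:
        return "store"
    if segments[0] == "search" and "q" in parse_qs(parsed.query):
        return "search"
    # "products" followed by a handle, possibly nested under a collection
    if "products" in segments[:-1]:
        return "product"
    if segments[0] == "collections" and len(segments) >= 2:
        return "collection"
    return "unknown"


for example_url in (
    "https://www.allbirds.com",
    "https://www.allbirds.com/collections/mens",
    "https://www.allbirds.com/products/wool-runners",
    "https://www.allbirds.com/search?q=wool",
):
    print(classify_url(example_url))
# store
# collection
# product
# search
```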

Troubleshooting

| Issue | Possible Cause | Solution |
| --- | --- | --- |
| 0 Results | Site is not Shopify-based | Check the run logs; the scraper auto-detects non-Shopify sites and logs a warning |
| 403 / Access Denied | IP flagged | Set useApifyProxy to true and use residential proxies |
| Incorrect Prices | Prices are integers in cents | Divide by 100 (e.g., 2995 → $29.95) |
| Missing Products | Rate limiting | Increase retryCount and use proxies |

Important Notes

  • Price Format: All prices are integers in cents (e.g., 2995 = $29.95)
  • Variants: Each product contains all variants in the variants array
  • Deduplication: Products are automatically deduplicated by ID
  • Pagination: Handles pagination automatically across large catalogs
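For display, the integer prices need converting back to a decimal amount. A minimal sketch, assuming a two-decimal currency (zero-decimal currencies such as JPY would need different handling); `format_price` is a hypothetical helper:

```python
def format_price(cents, symbol="$"):
    """Format integer cents as a display string, e.g. 2995 -> "$29.95"."""
    sign = "-" if cents < 0 else ""
    whole, fraction = divmod(abs(cents), 100)
    return f"{sign}{symbol}{whole:,}.{fraction:02d}"


print(format_price(2995))   # $29.95
print(format_price(11000))  # $110.00
```

Using `divmod` on the integer avoids the float division that the cents format exists to sidestep.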

Cost Estimation

  • Speed: Typically 500-2,000 products per minute
  • Compute Units: ~0.1-0.2 CUs per 1,000 products
  • Proxy: Residential proxies recommended for best results

Actual costs vary based on store size and anti-bot measures.
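The figures above translate into a rough range for a given catalog size. The sketch below applies the stated bounds linearly (500 to 2,000 products per minute, 0.1 to 0.2 CUs per 1,000 products); linear scaling is an assumption, and `estimate_run` is a hypothetical helper.

```python
def estimate_run(products):
    """Return (low, high) ranges for run time in minutes and compute units."""
    minutes = (products / 2000, products / 500)            # fastest, slowest
    compute_units = (products / 1000 * 0.1, products / 1000 * 0.2)
    return {"minutes": minutes, "compute_units": compute_units}


estimate = estimate_run(10_000)
low_min, high_min = estimate["minutes"]
low_cu, high_cu = estimate["compute_units"]
print(f"{low_min:.0f} to {high_min:.0f} minutes")  # 5 to 20 minutes
print(f"{low_cu:.1f} to {high_cu:.1f} CUs")        # 1.0 to 2.0 CUs
```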