Shopify Product Scraper
Pricing
Pay per usage
Extract product data from any Shopify-powered store instantly. This universal tool is lightweight and optimized for speed, gathering prices, variants, and images with ease. To ensure maximum stability and avoid IP blocking, using residential proxies is highly recommended.
Developer

Shahid Irfan
Shopify Product Scraper
Extract comprehensive product data from any Shopify store with high speed and reliability. This scraper supports multiple extraction methods including JSON API (recommended), JSON-LD structured data, and HTML parsing to ensure maximum compatibility across all Shopify stores.
Why Choose This Shopify Scraper?
- Multiple Extraction Methods - Automatically selects the best method: JSON API (fastest), JSON-LD, or HTML parsing
- Complete Product Data - Extracts titles, prices, variants, images, descriptions, SKUs, inventory, and more
- Smart Pagination - Automatically handles pagination to scrape entire collections
- Variant Support - Captures all product variants (sizes, colors, styles) with individual pricing
- Customizable Filtering - Control stock status, collection targeting, and result limits
- High Performance - Optimized for speed without compromising data quality
- Proxy Support - Built-in proxy rotation to avoid rate limiting
Features
- Scrape products from any Shopify store
- Extract data from specific collections or search results
- Support for product variants (sizes, colors, options)
- Automatic pagination handling
- JSON API priority for maximum speed
- HTML parsing fallback for compatibility
- Structured data extraction (JSON-LD)
- Filter by stock availability
- Customizable result limits
- Proxy configuration support
Input Configuration
The scraper accepts the following input parameters:
Required Parameters
| Parameter | Type | Description |
|---|---|---|
| shopUrl | String | The base URL of the Shopify store (e.g., https://www.allbirds.com or allbirds.com) |
Optional Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| startUrls | Array | - | Specific URLs to scrape (collection pages, product pages). Overrides shopUrl if provided. |
| collection | String | "all" | Collection handle to scrape (e.g., "mens-shoes", "new-arrivals"). Use "all" for all products. |
| searchQuery | String | - | Search for products matching this query instead of scraping a collection. |
| maxProducts | Integer | 100 | Maximum number of products to scrape. Set to 0 or leave empty for unlimited. |
| maxPages | Integer | 999 | Maximum number of pages to crawl. Safety limit to prevent infinite loops. |
| includeVariants | Boolean | true | Include all product variants (sizes, colors, etc.) in the output. |
| includeOutOfStock | Boolean | true | Include products that are currently out of stock. |
| proxyConfiguration | Object | Residential | Proxy settings. Residential proxies recommended for best results. |
Input Examples
Example 1: Scrape All Products from a Store
```json
{
  "shopUrl": "https://www.allbirds.com",
  "collection": "all",
  "maxProducts": 100
}
```
Example 2: Scrape Specific Collection
```json
{
  "shopUrl": "https://gymshark.com",
  "collection": "mens-clothing",
  "maxProducts": 50,
  "includeVariants": true
}
```
Example 3: Search Query
```json
{
  "shopUrl": "https://www.fashionnova.com",
  "searchQuery": "black dress",
  "maxProducts": 30,
  "includeOutOfStock": false
}
```
Example 4: Multiple URLs
```json
{
  "startUrls": [
    "https://store1.myshopify.com/collections/summer",
    "https://store2.myshopify.com/collections/winter"
  ],
  "maxProducts": 100
}
```
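The parameter tables and examples above can be turned into a small client-side helper that fills in the documented defaults before a run is started. This is only a sketch: `normalize_shop_url` and `build_input` are hypothetical helper names, the URL normalization is an assumption about what the scraper accepts, and `proxyConfiguration` is passed through untouched because its object shape is not documented here.

```python
from urllib.parse import urlparse

# Defaults copied from the Optional Parameters table. proxyConfiguration is
# omitted because its exact object shape is not documented in this README.
DEFAULTS = {
    "collection": "all",
    "maxProducts": 100,
    "maxPages": 999,
    "includeVariants": True,
    "includeOutOfStock": True,
}
# Optional parameters that have no default value in the table.
NO_DEFAULT = {"searchQuery", "startUrls", "proxyConfiguration"}


def normalize_shop_url(shop_url: str) -> str:
    """Return 'scheme://host' for inputs like 'allbirds.com' or a full URL."""
    candidate = shop_url.strip()
    if "://" not in candidate:
        candidate = "https://" + candidate  # assumption: bare domains mean HTTPS
    parsed = urlparse(candidate)
    if not parsed.netloc:
        raise ValueError(f"not a usable shop URL: {shop_url!r}")
    return f"{parsed.scheme}://{parsed.netloc}"


def build_input(shop_url: str, **overrides) -> dict:
    """Merge caller overrides onto the documented defaults."""
    unknown = set(overrides) - set(DEFAULTS) - NO_DEFAULT
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    run_input = {"shopUrl": normalize_shop_url(shop_url), **DEFAULTS, **overrides}
    for key in ("maxProducts", "maxPages"):
        if not isinstance(run_input[key], int) or run_input[key] < 0:
            raise ValueError(f"{key} must be a non-negative integer")
    return run_input
```

For instance, `build_input("gymshark.com", collection="mens-clothing", maxProducts=50)` yields an input equivalent to Example 2, with the remaining defaults made explicit. Rejecting unknown keys catches typos such as `maxProduct` before a paid run starts.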
Output Format
The scraper returns structured data in JSON format. Each product contains the following fields:
Output Schema
```json
{
  "id": 1234567890,
  "title": "Wool Runner - Natural Grey",
  "handle": "wool-runner-natural-grey",
  "description": "Product description text...",
  "vendor": "Allbirds",
  "product_type": "Shoes",
  "tags": ["sustainable", "comfortable", "casual"],
  "price": 98.00,
  "compare_at_price": 120.00,
  "currency": "USD",
  "available": true,
  "inventory_quantity": 45,
  "sku": "WR-NG-10",
  "barcode": "123456789012",
  "weight": 500,
  "weight_unit": "g",
  "images": [
    "https://cdn.shopify.com/s/files/1/image1.jpg",
    "https://cdn.shopify.com/s/files/1/image2.jpg"
  ],
  "variants": [
    {
      "id": 987654321,
      "title": "Size 10 / Natural Grey",
      "option1": "10",
      "option2": "Natural Grey",
      "option3": null,
      "price": 98.00,
      "compare_at_price": 120.00,
      "sku": "WR-NG-10",
      "available": true,
      "inventory_quantity": 15
    }
  ],
  "url": "https://www.allbirds.com/products/wool-runner-natural-grey",
  "created_at": "2023-01-15T10:30:00Z",
  "updated_at": "2024-12-18T08:20:00Z",
  "published_at": "2023-01-20T09:00:00Z"
}
```
Key Output Fields
| Field | Type | Description |
|---|---|---|
| id | Integer | Unique product identifier from Shopify |
| title | String | Product name/title |
| handle | String | URL-friendly product identifier |
| description | String | Product description (may include HTML) |
| vendor | String | Brand or manufacturer name |
| product_type | String | Product category or type |
| tags | Array | Product tags for categorization |
| price | Number | Current product price |
| compare_at_price | Number | Original price (before discount) |
| currency | String | Price currency code (USD, EUR, etc.) |
| available | Boolean | Whether product is in stock |
| inventory_quantity | Integer | Available stock quantity |
| sku | String | Stock keeping unit identifier |
| images | Array | Array of product image URLs |
| variants | Array | All product variants (sizes, colors, etc.) |
| url | String | Direct link to product page |
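Because each product nests its variants, downstream tools such as spreadsheets usually want one row per variant. The sketch below shows one way to flatten a record and derive a discount from `price` and `compare_at_price`. It assumes the field names shown in the output schema above; `discount_percent` and `variant_rows` are hypothetical helpers, not part of the scraper.

```python
def discount_percent(price, compare_at_price):
    """Percentage off the original price, or None when no discount applies."""
    if price is None or not compare_at_price or compare_at_price <= price:
        return None
    return round((compare_at_price - price) / compare_at_price * 100, 1)


def variant_rows(product: dict) -> list[dict]:
    """Return one flat dict per variant, carrying product-level context along."""
    rows = []
    for variant in product.get("variants") or []:
        options = [variant.get(f"option{i}") for i in (1, 2, 3)]
        rows.append({
            "product_id": product.get("id"),
            "product_title": product.get("title"),
            "variant_id": variant.get("id"),
            # Skip unused option slots (null in the JSON output).
            "options": " / ".join(str(o) for o in options if o is not None),
            "sku": variant.get("sku"),
            "price": variant.get("price"),
            "discount_percent": discount_percent(
                variant.get("price"), variant.get("compare_at_price")
            ),
            "available": variant.get("available"),
            "url": product.get("url"),
        })
    return rows
```

Applied to the sample record above, the single variant becomes one row with `options` set to `10 / Natural Grey` and a `discount_percent` of 18.3 (98.00 against 120.00).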
Use Cases
This Shopify scraper is perfect for:
- Price Monitoring - Track competitor pricing and detect price changes
- Market Research - Analyze product catalogs and market trends
- Inventory Tracking - Monitor stock levels and availability
- Product Database - Build comprehensive product databases for comparison sites
- Competitive Analysis - Analyze competitor product offerings and pricing strategies
- Dropshipping - Find products for dropshipping businesses
- Data Analytics - Gather data for business intelligence and analytics
- SEO Analysis - Study product descriptions and metadata
How It Works
The scraper uses an intelligent multi-method approach:
- JSON API Method (Primary): Attempts to fetch product data directly from Shopify's JSON API endpoints (`/products.json`, `/collections/{handle}/products.json`). This is the fastest and most reliable method.
- JSON-LD Extraction (Secondary): If the JSON API is unavailable, extracts structured data from JSON-LD schema markup embedded in HTML pages.
- HTML Parsing (Fallback): As a last resort, parses product information directly from HTML using intelligent selectors that work across different Shopify themes.
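The fallback order described above can be sketched roughly as follows. This is not the scraper's actual implementation: the helper names are hypothetical, the `limit` and `page` query parameters are an assumption about the storefront endpoint, and the HTML-selector stage is left as a caller-supplied function because it depends on the theme.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urlencode


def products_json_url(shop_url: str, collection: str = "all",
                      page: int = 1, limit: int = 250) -> str:
    """Build the JSON endpoint URL for a store-wide or collection listing."""
    base = shop_url.rstrip("/")
    path = "/products.json" if collection == "all" else f"/collections/{collection}/products.json"
    # Assumption: the endpoint accepts 'limit' and 'page' query parameters.
    return f"{base}{path}?{urlencode({'limit': limit, 'page': page})}"


class _JsonLdCollector(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._inside = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.blocks[-1] += data  # script text may arrive in several chunks


def json_ld_products(html: str) -> list[dict]:
    """Return JSON-LD objects whose @type is 'Product'; skip malformed blocks."""
    parser = _JsonLdCollector()
    parser.feed(html)
    parser.close()
    products = []
    for block in parser.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue
        for item in data if isinstance(data, list) else [data]:
            if isinstance(item, dict) and item.get("@type") == "Product":
                products.append(item)
    return products


def extract_with_fallback(methods):
    """Try (name, callable) pairs in order; return the first non-empty result."""
    for name, method in methods:
        try:
            products = method()
        except Exception:
            continue  # a failed method simply hands over to the next one
        if products:
            return name, products
    return None, []
```

A caller would pass the three stages in priority order, for example `extract_with_fallback([("json_api", fetch_json), ("json_ld", fetch_json_ld), ("html", parse_html)])`, where each callable is supplied by the caller and does its own HTTP fetching.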
Performance and Limits
| Metric | Value |
|---|---|
| Average Speed | 50-200 products per minute (depends on method and store) |
| Recommended Memory | 2048 MB |
| Timeout | 3600 seconds (1 hour) |
| Max Concurrency | 5 requests |
| Retry Attempts | 3 per request |
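As a quick sanity check on these figures, the quoted speed range and timeout can be combined to estimate how many products fit into one run. The numbers below come straight from the table; the function names are hypothetical, and real throughput will vary with the store and extraction method.

```python
import math

TIMEOUT_SECONDS = 3600            # "Timeout" row in the table above
SLOW_RATE, FAST_RATE = 50, 200    # "Average Speed" row, products per minute


def minutes_needed(product_count: int, products_per_minute: int) -> int:
    """Whole minutes required to scrape product_count items at a steady rate."""
    return math.ceil(product_count / products_per_minute)


def max_products_within_timeout(products_per_minute: int,
                                timeout_seconds: int = TIMEOUT_SECONDS) -> int:
    """Upper bound on products scraped before the timeout at a steady rate."""
    return products_per_minute * timeout_seconds // 60


if __name__ == "__main__":
    for rate in (SLOW_RATE, FAST_RATE):
        print(rate, "per minute:", max_products_within_timeout(rate), "products per run,",
              minutes_needed(5000, rate), "minutes for 5000 products")
```

At the quoted rates, a one-hour run covers roughly 3,000 to 12,000 products, so a catalog well above 3,000 items may not finish within the timeout when only the slower methods are available.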
Best Practices
- Automatic Method Selection - The scraper automatically chooses the fastest available method (JSON API → JSON-LD → HTML parsing)
- Enable Proxies - Always use residential proxies for large-scale scraping
- Set Reasonable Limits - Use `maxProducts` to control cost and runtime
- Include Variants - Set `includeVariants: true` for complete product data
- Handle Rate Limits - Use proxy rotation to avoid rate limiting
- Test First - Start with small `maxProducts` values to test configuration
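To illustrate how the limits interact, here is a minimal pagination sketch under stated assumptions: `fetch_page` is a hypothetical caller-supplied function that returns one page of product dicts, an empty page is taken to mean the end of the listing, and `max_products=0` means unlimited, as in the input table. The scraper's internal loop may differ.

```python
from typing import Callable, Iterator


def paginate(fetch_page: Callable[[int], list[dict]], max_products: int = 100,
             max_pages: int = 999, include_out_of_stock: bool = True) -> Iterator[dict]:
    """Yield products page by page until a limit is hit or a page comes back empty."""
    yielded = 0
    for page in range(1, max_pages + 1):   # max_pages is the hard safety limit
        products = fetch_page(page)
        if not products:
            return  # an empty page marks the end of the listing
        for product in products:
            if not include_out_of_stock and not product.get("available", True):
                continue  # filtered items do not count toward max_products
            yield product
            yielded += 1
            if max_products and yielded >= max_products:
                return  # 0 means unlimited, so only stop on a positive cap
```

Because this is a generator, a small test run with `max_products=5` stops requesting pages as soon as the fifth product is yielded, which is what keeps trial runs cheap.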
Troubleshooting
Common Issues and Solutions
Data Export Options
Export your scraped data in multiple formats:
- JSON - Structured data format, ideal for APIs and applications
- CSV - Spreadsheet format, perfect for Excel and data analysis
- Excel - Native Excel format with formatting preserved
- HTML - Human-readable table format
- XML - Structured markup format
- RSS - Feed format for automated monitoring
Integration Options
Integrate the scraper with other tools and platforms:
- Schedule regular runs using Apify Scheduler
- Connect to Google Sheets for automatic data updates
- Integrate with webhooks for real-time notifications
- Use Apify API for programmatic access
- Connect to Zapier, Make, or other automation platforms
- Export to databases (PostgreSQL, MongoDB, etc.)
Cost Optimization Tips
- Use `maxProducts` to limit the number of results
- Set `includeOutOfStock: false` to skip unavailable products
- Use datacenter proxies for stores without strict rate limits (cheaper than residential)
- Schedule scraping during off-peak hours
- Leverage the dataset cache to avoid re-scraping recent data
Privacy and Ethics
- This scraper only collects publicly available product information
- Always respect website terms of service
- Use reasonable rate limits to avoid overloading target servers
- Only scrape data you have permission to access
- Comply with data protection regulations (GDPR, CCPA, etc.)
Support
Need help or have questions?
- Check the Apify Documentation
- Visit the Apify Discord Community
- Contact support through the Apify Console
- Review the input examples above for common use cases
Updates and Changelog
The scraper is regularly updated to:
- Maintain compatibility with Shopify platform changes
- Improve performance and reliability
- Add new features based on user feedback
- Fix bugs and resolve issues
- Enhance data extraction accuracy
Technical Requirements
- Node.js 22 or higher
- Apify platform account
- Proxy configuration (recommended for production use)
- Minimum 2048 MB memory allocation
Related Scrapers
- Shopify Store Finder
- Shopify Collection Scraper
- E-commerce Price Monitor
- Product Review Scraper
Built with ❤️ for the Apify community
Happy scraping!