Shopify Products Scraper Pro
Pricing
from $0.50 / 1,000 products
Extract product data from any Shopify store using the official JSON API. Get products, variants, prices, inventory, images, and metadata. No authentication required. A fast, accurate, and cost-effective solution for e-commerce intelligence and competitor analysis.
Developer: Normalize
Extract comprehensive product data from any Shopify store using the official Shopify JSON API. Fast, reliable, and cost-effective solution for e-commerce data extraction, competitor analysis, and market research.
What Does This Actor Do?
Shopify Products Scraper Pro extracts product information from Shopify stores without requiring authentication or API keys. It collects structured data including product details, variants, prices, inventory levels, images, and metadata - perfect for e-commerce intelligence, dropshipping, price monitoring, and market analysis.
Why Choose This Scraper?
This actor scrapes product information from Shopify stores without requiring authentication or API keys. It leverages Shopify's public JSON endpoints to extract structured data including products, variants, prices, inventory levels, images, and metadata.
Key Features:
- Uses official Shopify JSON API (not HTML scraping)
- Works on any public Shopify store
- No authentication required
- High accuracy and reliability
- Automatic pagination and retry logic
- Respectful rate limiting
Use Cases
E-commerce Intelligence:
- Competitor product analysis and pricing research
- Market trend identification and category analysis
- Product catalog monitoring and updates
Business Operations:
- Dropshipping supplier inventory tracking
- Price comparison platform data collection
- Product database enrichment and synchronization
Market Research:
- Industry product trends analysis
- Vendor and brand comparison
- Seasonal catalog changes tracking
Input Configuration
Required Parameters
storeDomain (String)
- The Shopify store domain to scrape
- Example: gymshark.com or store.myshopify.com
- Do not include https:// or paths
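Because storeDomain must be a bare domain, it can help to clean pasted URLs before starting a run. The following is a minimal sketch using only the Python standard library; the function name normalize_store_domain is hypothetical and is not part of the actor.

```python
from urllib.parse import urlsplit


def normalize_store_domain(raw: str) -> str:
    """Reduce a pasted URL or domain to a bare host such as 'gymshark.com'."""
    value = raw.strip()
    # urlsplit only recognises the host when a "//" authority marker is present.
    if "://" not in value:
        value = "//" + value
    host = urlsplit(value).hostname or ""
    # Dropping "www." is optional; it mirrors the troubleshooting advice.
    if host.startswith("www."):
        host = host[4:]
    return host


print(normalize_store_domain("https://www.gymshark.com/collections/mens"))
```

The result can be passed directly as storeDomain. An empty string is returned when no host can be found, so callers should check for that before launching a run.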
Optional Parameters
mode (String)
- Scraping mode: all, collection, or handles
- Default: all
- all: Scrape all products from the store
- collection: Scrape products from a specific collection
- handles: Scrape specific products by handle
collectionHandle (String)
- Collection handle to scrape (required when mode is collection)
- Example: mens, sale, new-arrivals
- Find handle in collection URL: /collections/HANDLE
productHandles (Array)
- Array of product handles or URLs (required when mode is handles)
- Example: ["product-handle", "https://store.com/products/product-handle"]
- Can mix handles and full URLs
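Since productHandles accepts both bare handles and full URLs, each entry ultimately has to resolve to a handle. How the actor does this internally is not documented; the sketch below shows one plausible approach, and to_product_handle is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def to_product_handle(entry: str) -> str:
    """Return the product handle for either a bare handle or a product URL."""
    entry = entry.strip()
    if "://" not in entry:
        return entry.strip("/")
    # Product URLs look like /products/HANDLE or /collections/X/products/HANDLE.
    segments = [s for s in urlsplit(entry).path.split("/") if s]
    if "products" in segments:
        idx = segments.index("products")
        if idx + 1 < len(segments):
            return segments[idx + 1]
    raise ValueError(f"no product handle found in {entry!r}")


mixed = ["product-handle", "https://store.com/products/product-handle"]
print([to_product_handle(e) for e in mixed])
```

Query strings such as ?variant=... are ignored because only the URL path is inspected.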
includeVariants (Boolean)
- Include detailed variant information
- Default: true
- Set to false to reduce output size
includeImages (Boolean)
- Include product image details
- Default: true
- Set to false to reduce output size
maxProducts (Integer)
- Maximum number of products to scrape
- Default: Unlimited
- Useful for testing or sampling
maxConcurrency (Integer)
- Number of concurrent requests
- Default: 5
- Range: 1 to 20
- Higher values = faster scraping but more resource usage
proxyConfiguration (Object)
- Apify proxy configuration
- Recommended for large-scale scraping
- Example: {"useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"]}
Input Examples
Example 1: Scrape All Products
{
  "storeDomain": "gymshark.com",
  "mode": "all"
}
Example 2: Scrape Specific Collection
{
  "storeDomain": "gymshark.com",
  "mode": "collection",
  "collectionHandle": "mens"
}
Example 3: Scrape Specific Products
{
  "storeDomain": "gymshark.com",
  "mode": "handles",
  "productHandles": [
    "legacy-tshirt",
    "vital-seamless-leggings"
  ]
}
Example 4: Optimized for Speed
{
  "storeDomain": "bigstore.com",
  "mode": "all",
  "maxConcurrency": 15,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
Example 5: Testing Configuration
{
  "storeDomain": "gymshark.com",
  "mode": "all",
  "maxProducts": 50,
  "includeVariants": false,
  "includeImages": false
}
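The rules from the parameter descriptions (required fields per mode, the 1 to 20 concurrency range) can be checked locally before a run is started. This is a sketch of such a pre-flight check, not the actor's own validation; validate_input and its messages are hypothetical.

```python
VALID_MODES = {"all", "collection", "handles"}


def validate_input(run_input: dict) -> list:
    """Return a list of problems; an empty list means the input looks usable."""
    problems = []
    if not run_input.get("storeDomain"):
        problems.append("storeDomain is required")
    mode = run_input.get("mode", "all")  # documented default
    if mode not in VALID_MODES:
        problems.append(f"mode must be one of {sorted(VALID_MODES)}")
    if mode == "collection" and not run_input.get("collectionHandle"):
        problems.append("collectionHandle is required when mode is collection")
    if mode == "handles" and not run_input.get("productHandles"):
        problems.append("productHandles is required when mode is handles")
    concurrency = run_input.get("maxConcurrency", 5)  # documented default
    if not 1 <= concurrency <= 20:
        problems.append("maxConcurrency must be between 1 and 20")
    return problems


print(validate_input({"storeDomain": "gymshark.com", "mode": "collection"}))
```

Returning a list of messages rather than raising on the first problem makes it easier to report every issue with a configuration at once.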
Output Format
The actor outputs structured JSON data with comprehensive product information.
Product Data Structure
{
  "url": "https://store.com/products/product-handle",
  "id": 1234567890,
  "title": "Product Name - Variant",
  "handle": "product-handle",
  "description": "<p>HTML description</p>",
  "descriptionText": "Plain text description",
  "vendor": "Brand Name",
  "productType": "Category",
  "tags": ["tag1", "tag2"],
  "price": 29.99,
  "priceMin": 29.99,
  "priceMax": 39.99,
  "priceVaries": true,
  "compareAtPrice": 49.99,
  "compareAtPriceMin": 49.99,
  "compareAtPriceMax": 49.99,
  "onSale": true,
  "available": true,
  "totalInventory": 500,
  "variantsCount": 8,
  "variants": [...],
  "options": [...],
  "imagesCount": 6,
  "images": [...],
  "featuredImage": "https://cdn.shopify.com/...",
  "createdAt": "2024-01-01T00:00:00Z",
  "updatedAt": "2024-11-12T10:00:00Z",
  "publishedAt": "2024-01-01T12:00:00Z",
  "scrapedAt": "2024-11-12T10:30:00Z"
}
Variant Information
When includeVariants is true, each product includes detailed variant data:
"variants": [
  {
    "id": 9876543210,
    "title": "Small / Black",
    "price": 29.99,
    "compareAtPrice": 49.99,
    "sku": "PROD-SKU-001",
    "barcode": "123456789012",
    "inventoryQuantity": 100,
    "available": true,
    "option1": "Small",
    "option2": "Black",
    "option3": null,
    "weight": 0.2,
    "weightUnit": "kg",
    "requiresShipping": true,
    "taxable": true
  }
]
Product Options
"options": [
  {
    "name": "Size",
    "position": 1,
    "values": ["Small", "Medium", "Large", "XL"]
  },
  {
    "name": "Color",
    "position": 2,
    "values": ["Black", "White", "Blue"]
  }
]
Image Information
When includeImages is true:
"images": [
  {
    "id": 3333333333,
    "src": "https://cdn.shopify.com/s/files/1/xxxx/products/image.jpg",
    "alt": "Product Image Description",
    "width": 2048,
    "height": 2048,
    "position": 1
  }
]
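As an illustration of working with the output fields, the sketch below derives a discount percentage from price and compareAtPrice. It assumes each dataset item has been loaded as a Python dict with the field names shown above; discount_percent is a hypothetical helper.

```python
from typing import Optional


def discount_percent(product: dict) -> Optional[float]:
    """Percentage off relative to compareAtPrice, or None when not discounted."""
    price = product.get("price")
    compare_at = product.get("compareAtPrice")
    # Treat a missing, zero, or non-higher compare-at price as "no discount".
    if price is None or not compare_at or compare_at <= price:
        return None
    return round((compare_at - price) / compare_at * 100, 1)


item = {"handle": "product-handle", "price": 29.99, "compareAtPrice": 49.99}
print(item["handle"], discount_percent(item))
```

Items without a compare-at price yield None rather than 0, which keeps "not discounted" distinct from a computed value.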
Performance
Speed:
- Average: 500-1000 products per minute
- Depends on store response time and concurrency settings
Accuracy:
- Values are read directly from the store's JSON responses, so they match what the store publishes
- No HTML parsing errors; fields a store leaves unpopulated remain empty
Reliability:
- Automatic retry on failures with exponential backoff
- Error handling for network issues and rate limits
- Success rate: 99%+
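For readers unfamiliar with the term, retrying with exponential backoff can be sketched as follows. This illustrates the general technique rather than the actor's implementation; the attempt count, base delay, and jitter factor are assumptions.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run call(); after each failure wait base_delay * 2**attempt plus jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            # Up to 10% random jitter avoids synchronized retries.
            sleep(delay + random.uniform(0, delay * 0.1))
```

With the defaults, the waits are roughly 1, 2, 4, and 8 seconds before the fifth and final attempt. The sleep parameter exists so the delay can be replaced in tests.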
Resource Usage:
- Memory: Less than 512MB RAM for most jobs
- Compute: Approximately 0.01 compute units per 1,000 products
Pricing
Cost Estimate:
- Small store (100 products): ~$0.002
- Medium store (1,000 products): ~$0.02
- Large store (10,000 products): ~$0.20
- Enterprise (100,000 products): ~$2.00
Actual costs depend on compute time and proxy usage.
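The estimates above scale linearly at about $0.02 per 1,000 products. A small helper can project that figure to other catalog sizes; the default rate is inferred from the listed estimates and is exposed as a parameter because real costs vary.

```python
def estimate_cost_usd(product_count: int, usd_per_1000: float = 0.02) -> float:
    """Linear cost projection; the default rate is read off the estimates above."""
    return round(product_count / 1000 * usd_per_1000, 4)


for size in (100, 1_000, 10_000, 100_000):
    print(size, estimate_cost_usd(size))
```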
How It Works
This actor leverages Shopify's public JSON API endpoints available on all Shopify stores:
API Endpoints Used:
- https://store.com/products.json - Product listing with pagination
- https://store.com/products/handle.json - Individual product details
- https://store.com/collections/handle/products.json - Collection products
Process Flow:
- Domain Validation: Verifies the provided domain is a valid Shopify store
- Mode Selection: Routes to appropriate scraping strategy (all/collection/handles)
- Data Fetching: Makes requests to Shopify JSON endpoints with pagination
- Data Processing: Normalizes and enriches product data
- Output: Saves structured data to Apify dataset
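The data-fetching step can be sketched as a paginated loop over the listing endpoints. This is an illustration, not the actor's source: the limit and page query parameters and the 250-item page size are assumptions based on common storefront usage, and iter_products and fetch_json are hypothetical names.

```python
import json
from urllib.request import Request, urlopen


def fetch_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body (standard library only)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        return json.load(response)


def iter_products(store_domain, collection_handle=None, max_products=None,
                  page_size=250, fetch=fetch_json):
    """Yield raw product dicts page by page until an empty page is returned."""
    if collection_handle:
        path = f"/collections/{collection_handle}/products.json"
    else:
        path = "/products.json"
    page, yielded = 1, 0
    while True:
        # Assumed query parameters: limit (page size) and page (1-based index).
        data = fetch(f"https://{store_domain}{path}?limit={page_size}&page={page}")
        products = data.get("products", [])
        if not products:
            return
        for product in products:
            yield product
            yielded += 1
            if max_products is not None and yielded >= max_products:
                return
        page += 1
```

For example, list(iter_products("gymshark.com", max_products=50)) would request pages until 50 products have been collected. The fetch argument lets a retry wrapper or a test double replace the network call.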
Technical Advantages:
- No HTML parsing - direct JSON API access
- No CSS selectors that break with theme updates
- No authentication or API keys required
- Works on any Shopify store regardless of plan or theme
- Consistent data structure across all stores
Best Practices
For Large Stores (10,000+ products)
- Enable proxy configuration to avoid rate limiting
- Increase concurrency to 10-15 for faster scraping
- Consider scraping specific collections instead of entire store
- Use the maxProducts parameter for initial testing
For Regular Monitoring
- Use mode: "collection" for specific categories
- Schedule runs during off-peak hours
- Store results in named datasets for comparison
- Set up webhooks for automated processing
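Once two runs are stored in separate datasets, comparing them is a matter of joining items on the product id. The sketch below assumes both datasets have already been loaded as lists of dicts with the output fields described earlier; price_changes is a hypothetical helper.

```python
def price_changes(previous: list, current: list) -> list:
    """Compare two runs' items by product id and report changed prices."""
    old_prices = {item["id"]: item.get("price") for item in previous}
    changes = []
    for item in current:
        old = old_prices.get(item["id"])
        new = item.get("price")
        # Skip products that are new in this run or have no price on either side.
        if old is not None and new is not None and old != new:
            changes.append({"id": item["id"], "handle": item.get("handle"),
                            "old": old, "new": new})
    return changes


yesterday = [{"id": 1, "handle": "legacy-tshirt", "price": 29.99}]
today = [{"id": 1, "handle": "legacy-tshirt", "price": 24.99}]
print(price_changes(yesterday, today))
```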
For Data Quality
- Keep includeVariants: true for complete inventory data
- Enable includeImages: true for product catalogs
- Use product handles for precise targeting
- Verify store domain before large scraping jobs
Troubleshooting
Store Not Found Error
Issue: "Domain does not appear to be a Shopify store"
Solutions:
- Verify the domain is correct (no typos)
- Remove https:// and paths from domain
- Try without www. prefix
- Ensure the store is publicly accessible (not password-protected)
No Products Returned
Issue: Actor completes but returns empty dataset
Solutions:
- Verify the store has published products
- Check if collection handle is correct (try mode: "all" first)
- Ensure products are not restricted by location/password
- Check actor logs for specific error messages
Slow Performance
Issue: Actor takes longer than expected
Solutions:
- Increase maxConcurrency (up to 20)
- Enable Apify proxy configuration
- Reduce output size with includeVariants: false
- Check if store has slow response times
Incomplete Data
Issue: Some products missing fields
Solutions:
- Some Shopify stores may not populate all fields
- Check if includeVariants and includeImages are enabled
- Verify the store's product data in Shopify admin
- Review actor logs for parsing warnings
Limitations
Technical Limitations:
- Only scrapes publicly accessible stores
- Cannot access password-protected stores or products
- Cannot bypass Shopify Plus wholesale portals
- Limited by Shopify's public API availability
Data Limitations:
- Cannot access customer data or order information
- Cannot retrieve draft or unpublished products
- Cannot access admin-only product metadata
- Inventory counts may be cached by Shopify
Rate Limiting:
- Respects Shopify's fair use guidelines
- Implements polite crawling (1-2 requests/second)
- Automatic backoff on rate limit responses
- Proxy usage recommended for very large stores
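Polite crawling at a fixed request rate can be illustrated with a small throttle that spaces out calls. This is a generic sketch of the technique, not the actor's code; the Throttle class and its parameters are hypothetical.

```python
import time


class Throttle:
    """Space out calls so they run at most `rate` times per second."""

    def __init__(self, rate: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = 0.0

    def wait(self) -> None:
        """Block until the next request slot is available."""
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval


throttle = Throttle(rate=2)  # 2 requests/second, the upper end of the stated range
for _ in range(3):
    throttle.wait()
    # a request would be issued here
```

Calling wait() before each request keeps the spacing at 0.5 seconds regardless of how long each request itself takes.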
FAQ
Q: Does this work on all Shopify stores?
A: Yes, it works on any public Shopify store including custom domains and *.myshopify.com stores.
Q: Do I need API credentials or store access?
A: No authentication required. This uses public JSON endpoints available on all Shopify stores.
Q: Will I get blocked or rate limited?
A: The actor implements polite crawling with automatic retries. For large-scale scraping, use Apify proxies.
Q: How accurate is the data compared to HTML scraping?
A: Values are read directly from the store's JSON responses, which eliminates the parsing errors common with HTML scraping.
Q: Can I scrape product reviews or customer data?
A: No, this actor only accesses publicly available product catalog data.
Q: How do I find collection handles?
A: Visit the collection page in your browser. The handle is in the URL: https://store.com/collections/HANDLE
Q: Can I scrape multiple stores in one run?
A: No, configure one store per actor run. Use Apify tasks or schedules for multiple stores.
Q: What happens if a product is deleted during scraping?
A: The actor handles 404 errors gracefully and continues with remaining products.
Related Actors
Explore our complete Shopify scraping suite:
- Shopify Price Monitor - Track price changes and sales over time
- Shopify Inventory Tracker - Monitor stock levels and availability
- Shopify Store Analyzer - Extract store metadata and analytics
- Shopify Collection Scraper - Specialized collection-based extraction
- Shopify Feed Generator - Generate product feeds for Google Shopping
Legal Compliance
This actor accesses only publicly available data from Shopify stores through official public API endpoints. It does not:
- Require authentication or API keys
- Circumvent access controls or security measures
- Access password-protected or restricted content
- Violate Shopify's Terms of Service
The actor implements responsible scraping practices including rate limiting and respectful request patterns. Users are responsible for ensuring their use complies with applicable laws, data protection regulations, and the terms of service of stores they scrape.