
Shopify Product & Collection Scraper
Pricing
Pay per event
Status: Under maintenance

Shopify Product & Collection Scraper is an API that extracts structured data from any public Shopify-powered store. Provide a product or collection page URL, and it returns titles, prices, images, descriptions, variants, and availability.
An Apify actor that retrieves product and collection information from Shopify stores as JSON data.
Features
- Automatic Type Detection: Detects whether the URL is a product or collection page
- Multiple Scraping Methods:
  - First tries Shopify's JSON endpoints for clean data
  - Falls back to HTML scraping with multiple selectors
- Comprehensive Data Extraction: Extracts titles, prices, images, variants, descriptions, and more
- Robust Error Handling: Multiple fallback methods ensure data extraction even when some methods fail
- Apify Integration: Properly structured as an Apify actor with input/output handling
Input Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | The Shopify URL to scrape (must contain /products/ or /collections/) |
| type | string | No | Type of content to scrape: "auto", "product", or "collection" (default: "auto") |
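The input rules in the table can be sketched as a small validation helper. This is only an illustration under stated assumptions: `resolveType` is a hypothetical name, and the actor's real detection logic may differ.

```javascript
// Hypothetical helper (not the actor's actual code): resolves the effective
// scrape type from an input object shaped like the table above.
function resolveType(input) {
  const { url, type = 'auto' } = input;
  const { pathname } = new URL(url); // throws on malformed URLs

  const isProduct = pathname.includes('/products/');
  const isCollection = pathname.includes('/collections/');
  if (!isProduct && !isCollection) {
    throw new Error('URL must contain /products/ or /collections/');
  }

  // An explicit type overrides detection.
  if (type === 'product' || type === 'collection') return type;
  if (type !== 'auto') throw new Error(`Unsupported type: ${type}`);

  // Paths like /collections/sale/products/shirt point at a product,
  // so the product check runs first.
  return isProduct ? 'product' : 'collection';
}
```

Checking for `/products/` before `/collections/` is a design choice in this sketch, since Shopify product pages can be nested under a collection path.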
Output Format
For Collections
    {
      "id": "collection_id",
      "title": "Collection Name",
      "handle": "collection-handle",
      "body_html": "Collection description",
      "published_at": "2023-01-01T00:00:00Z",
      "updated_at": "2023-01-01T00:00:00Z",
      "sort_order": "manual",
      "template_suffix": null,
      "products_count": 50,
      "products": [
        {
          "id": "product_id",
          "title": "Product Name",
          "handle": "product-handle",
          "description": "Product description",
          "images": ["https://example.com/image.jpg"],
          "url": "https://store.com/products/product-handle",
          "price": "29.99",
          "compare_at_price": "39.99",
          "variants": [
            {
              "id": "variant_id",
              "title": "Default Title",
              "price": "29.99",
              "compare_at_price": "39.99",
              "available": true,
              "sku": "SKU123"
            }
          ]
        }
      ],
      "pagination": {
        "current_page": 1,
        "total_pages": 1,
        "total_products": 50,
        "products_on_page": 50
      }
    }
For Products
    {
      "id": "product_id",
      "title": "Product Name",
      "handle": "product-handle",
      "body_html": "Product description",
      "images": ["https://example.com/image.jpg"],
      "variants": [
        {
          "id": "variant_id",
          "title": "Default Title",
          "price": "29.99",
          "compare_at_price": "39.99",
          "available": true,
          "sku": "SKU123"
        }
      ],
      "price": "29.99",
      "availability": "In Stock",
      "sku": "SKU123",
      "vendor": "Brand Name"
    }
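Because prices in the output are strings, a consumer will usually want to parse them before comparing. The following is a minimal sketch of post-processing a product object shaped like the example above; `summarizeProduct` is a hypothetical helper and not part of the actor.

```javascript
// Hypothetical helper: derives summary fields from a product object
// shaped like the "For Products" example.
function summarizeProduct(product) {
  const variants = Array.isArray(product.variants) ? product.variants : [];

  // Prices arrive as strings such as "29.99", so parse before comparing.
  const prices = variants
    .map((v) => Number.parseFloat(v.price))
    .filter((p) => Number.isFinite(p));

  // A variant counts as on sale when compare_at_price exceeds price.
  const onSale = variants.some((v) => {
    const price = Number.parseFloat(v.price);
    const compareAt = Number.parseFloat(v.compare_at_price);
    return Number.isFinite(price) && Number.isFinite(compareAt) && compareAt > price;
  });

  return {
    title: product.title,
    variantCount: variants.length,
    minPrice: prices.length > 0 ? Math.min(...prices) : null,
    anyAvailable: variants.some((v) => v.available === true),
    onSale,
  };
}
```

The helper tolerates a missing `variants` array, which matters when the HTML fallback returns only partial data.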
Usage Examples
Local Development
- Install dependencies:
      $ npm install
- Create a .env file (optional for local testing):
      SCRAPEDO_API_KEY=your_api_key_here
- Run the Apify actor:
      $ npm start
- Or test the standalone version:
      $ node standalone.js
Apify Platform
- Deploy to Apify platform
- Use the web interface to input parameters
- Or use the Apify API:
    curl -X POST "https://api.apify.com/v2/acts/YOUR_ACTOR_ID/runs?token=YOUR_API_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"url": "https://store.myshopify.com/collections/featured", "type": "collection"}'
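The same request can be made from Node.js. The sketch below mirrors the curl call and assumes Node 18+ for the global `fetch`. The names `buildRunRequest` and `startRun` are hypothetical, and the actor ID and token remain placeholders you supply.

```javascript
// Builds the URL and fetch options for starting an actor run.
// Mirrors the curl example; actorId and token are placeholders.
function buildRunRequest(actorId, token, input) {
  const endpoint = new URL(
    `https://api.apify.com/v2/acts/${encodeURIComponent(actorId)}/runs`
  );
  endpoint.searchParams.set('token', token);
  return {
    url: endpoint.toString(),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(input),
    },
  };
}

// Sends the request and returns the parsed JSON response.
async function startRun(actorId, token, input) {
  const { url, options } = buildRunRequest(actorId, token, input);
  const response = await fetch(url, options); // global fetch, Node 18+
  if (!response.ok) {
    throw new Error(`Run request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Splitting request construction from the network call keeps the first function easy to check without credentials.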
Testing
Standalone Testing
The standalone.js file provides a simple way to test the scraper without Apify:
    $ node standalone.js
This will:
- Test with a real Shopify store URL
- Save results to scraping-results.json
- Show detailed logging of the scraping process
Custom Testing
You can modify test-input.json to test different URLs:

    {
      "url": "https://your-shopify-store.com/collections/featured",
      "type": "collection"
    }
Supported Shopify Features
- JSON Endpoints: Automatically tries *.json endpoints for clean data
- Product Collections: Extracts all products from collection pages
- Product Variants: Handles multiple product variants with different prices
- Images: Extracts product images with proper URL resolution
- Pricing: Handles both regular and sale prices
- SEO Data: Extracts product handles, titles, and descriptions
- Inventory: Tracks product availability and SKU information
Error Handling
The actor implements multiple fallback methods:
- Primary: Shopify JSON endpoints (/products/*.json, /collections/*.json)
- Secondary: Collection products API (/collections/*/products.json)
- Fallback: HTML scraping with multiple CSS selectors
- Graceful Degradation: Returns partial data even if some methods fail
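The fallback order above can be sketched as follows. This is an illustration, not the code in src/main.js: `jsonCandidates` and `scrapeWithFallback` are hypothetical names, and the `fetchJson` and `scrapeHtml` callbacks are assumed to be supplied by the caller.

```javascript
// Hypothetical helper: lists the JSON endpoints to try for a page URL,
// in the order described above.
function jsonCandidates(pageUrl) {
  const url = new URL(pageUrl);
  const product = url.pathname.match(/\/products\/([^/]+)/);
  if (product) {
    return [`${url.origin}/products/${product[1]}.json`];
  }
  const collection = url.pathname.match(/\/collections\/([^/]+)/);
  if (collection) {
    return [
      `${url.origin}/collections/${collection[1]}.json`,
      `${url.origin}/collections/${collection[1]}/products.json`,
    ];
  }
  return [];
}

// Tries each JSON endpoint in order, then falls back to HTML scraping.
// fetchJson(url) and scrapeHtml(url) are injected async callbacks.
async function scrapeWithFallback(pageUrl, fetchJson, scrapeHtml) {
  const errors = [];
  for (const candidate of jsonCandidates(pageUrl)) {
    try {
      const data = await fetchJson(candidate);
      if (data) return { source: candidate, data };
    } catch (err) {
      errors.push(`${candidate}: ${err.message}`);
    }
  }
  try {
    return { source: 'html', data: await scrapeHtml(pageUrl) };
  } catch (err) {
    errors.push(`html: ${err.message}`);
  }
  throw new Error(`All methods failed: ${errors.join('; ')}`);
}
```

Collecting the per-method errors instead of discarding them makes the final failure message more useful when a store blocks every request.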
File Structure
    apify-product-collection/
    ├── src/
    │   └── main.js          # Main Apify actor code
    ├── apify.json           # Apify configuration
    ├── package.json         # Dependencies and scripts
    ├── standalone.js        # Standalone testing version
    ├── test-input.json      # Test input for local development
    ├── example.js           # Example usage
    └── README.md            # This file
Limitations
- Requires the Shopify store to be publicly accessible
- Some stores may block automated requests (403/404 errors are common)
- Complex product variants may not be fully captured via HTML scraping
- Pagination is limited to the current page for collection scraping
Troubleshooting
Common Issues
- 403/404 Errors: Many Shopify stores block automated requests
- No Products Found: Try different URLs or check if the store is accessible
- Apify Integration Issues: Make sure you're using the correct Apify SDK version
Debugging
- Check the console output for detailed error messages
- Use standalone.js for easier debugging
- Modify test-input.json to test different URLs
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
License
ISC License