
Shopify Product Scraper: Extract Product Data via JSON API
Pricing
$1.00 / 1,000 results

Scrape comprehensive product data (titles, descriptions, prices, variants, images, SKUs, inventory and more) from Shopify stores that expose the public products.json endpoint. Suited to data extraction for e-commerce analysis, price monitoring, or building product feeds. Fast, reliable, and easy to configure with just the store URL.
Shopify Product Scraper Actor
This Apify actor is designed to scrape product data from a Shopify e-commerce website.
Input
The actor requires a single input field to specify the target Shopify store.
- Shopify Store URL (`startUrl`): This is the full base URL of the Shopify store you intend to scrape. It can be the `.myshopify.com` domain or a custom domain pointing to the Shopify store.
  - Type: `string`
  - Required: `true`
  - Example: `https://your-shop-name.myshopify.com` or `https://www.yourcustomdomain.com`

Example Input JSON:
```json
{"startUrl": "https://menswearstore.myshopify.com"}
```

Output

The actor will output a dataset where each item corresponds to a product found in the Shopify store. The data for each product includes a comprehensive set of details.

Key fields in the output items include:

- productId: Shopify's unique identifier for the product.
- title: The name or title of the product.
- handle: The URL-friendly version of the product title, used in the product's direct URL.
- productUrl: The full direct URL to the product page on the Shopify store.
- vendor: The brand or vendor of the product.
- productType: The category or type of the product (e.g., "T-Shirt", "Shoes").
- createdAt: Timestamp indicating when the product was created in the Shopify system.
- updatedAt: Timestamp indicating the last update to the product information.
- publishedAt: Timestamp indicating when the product was made publicly visible on the store.
- tags: A list of tags associated with the product for organization or filtering.
- status: The current status of the product (e.g., active, archived, draft).
- descriptionHtml: The product description, typically including HTML formatting.
- options: A list of product options, such as "Size", "Color", "Material". Each option has a name and a list of possible values.
- variants: A list of all available variations of the product. Each variant can have its own:
  - variantId: Unique ID for the variant.
  - title: Title of the variant (e.g., "Small / Red").
  - price: The selling price of the variant.
  - sku: Stock Keeping Unit for the variant.
  - available: Boolean indicating if the variant is in stock and available for purchase.
  - option1, option2, option3: The specific option values for this variant (e.g., "Small", "Red").
  - inventoryQuantity: The available stock quantity for the variant (if tracked).
  - compareAtPrice: The original price before a sale, if applicable.
- images: A list of images associated with the product.
  Each image entry includes:
  - imageId: Unique ID for the image.
  - src: The URL of the image file.
  - alt: The alternative text for the image (for accessibility and SEO).
  - width, height: Dimensions of the image.
  - position: The order of the image in the product gallery.
  - variantIds: A list of variant IDs that this image is associated with.

Example Output Item (simplified for brevity):

```json
{
  "productId": 1234567890123,
  "title": "Classic Cotton T-Shirt",
  "productUrl": "https://menswearstore.myshopify.com/products/classic-cotton-t-shirt",
  "vendor": "UrbanWear",
  "productType": "Apparel",
  "price": "24.99",
  "variants": [
    {
      "variantId": 9876543210987,
      "title": "Medium / Blue",
      "price": "24.99",
      "sku": "UWCCT-M-BLU",
      "available": true,
      "inventoryQuantity": 50
    },
    {
      "variantId": 9876543210988,
      "title": "Large / Black",
      "price": "24.99",
      "sku": "UWCCT-L-BLK",
      "available": false,
      "inventoryQuantity": 0
    }
  ],
  "images": [
    {
      "imageId": 112233445566,
      "src": "https://cdn.shopify.com/s/files/1/0000/0001/products/blue_tshirt.jpg?v=1620000000",
      "alt": "Blue Cotton T-Shirt - Front View"
    }
  ],
  "tags": ["cotton", "t-shirt", "summer collection"]
}
```

Note: a top-level price often isn't present in products.json; check the price of each variant instead.

How it Works

1. **Input URL Processing:** The actor takes the provided startUrl and normalizes it into a clean base URL for the Shopify store.
2. **Accessing products.json:** It appends /products.json to the store's base URL. This endpoint is a standard Shopify feature that lists products in JSON format.
3. **Pagination:** The /products.json endpoint typically returns a limited number of products per request (e.g., up to 250).
   The actor handles pagination by making sequential requests, incrementing the page parameter (/products.json?limit=250&page=1, then page=2, and so on) until no more products are returned.
4. **Data Extraction:** For each product retrieved from the JSON response, the actor extracts the relevant fields as detailed in the "Output" section.
5. **Data Storage:** The extracted product data is then pushed to the Apify dataset associated with the actor run.

Limitations and Considerations

- **Endpoint Accessibility:** The primary dependency is the /products.json endpoint. Some Shopify store owners customize their store or use security apps that restrict or disable access to this endpoint. If it is inaccessible, the actor cannot retrieve data using this method.
- **Rate Limiting:** Shopify, like most platforms, enforces rate limits to prevent abuse. If a store has an exceptionally large number of products, making too many requests in a short period could lead to temporary blocks (HTTP 429 errors). The current script does not have sophisticated retry logic for rate limiting, but it logs such occurrences.
- **Data Completeness:** While /products.json is comprehensive, it might not include every piece of data visible on the live product pages (e.g., dynamically loaded reviews, or highly customized product options that do not fit the standard structure).
- **Store-Specific Configurations:** Stores with headless setups or very unusual themes might behave differently.
- **No Browser Emulation:** This actor makes direct HTTP requests to the /products.json endpoint. It does not load web pages in a browser, so it won't execute JavaScript or capture data that is rendered client-side outside of the JSON endpoint.

Usage Instructions

1. **Create an Actor:** In your Apify account, create a new actor.
2. **Set Up Source Code:**
   - Navigate to the "Source" tab of your new actor.
   - For the main Python script, copy the content of main.py provided.
   - Create an INPUT_SCHEMA.json file in your actor's source files and paste the provided schema content into it.
     This defines the input UI in the Apify Console.
   - Create a requirements.txt file and add the necessary Python packages (e.g., apify-client, httpx).
3. **Build the Actor:** Once the source files are in place, build the actor. This process installs the dependencies and prepares the actor for running.
4. **Run the Actor:**
   - Navigate to the actor's page in the Apify Console.
   - Click "Run".
   - In the input section, provide the "Shopify Store URL" for the store you wish to scrape.
   - Start the run.
5. **Retrieve Data:** Once the actor run is finished, you can find the scraped product data in the "Dataset" tab of the run. You can download it in various formats (JSON, CSV, Excel, etc.).

This actor provides a straightforward way to gather product listings from Shopify stores that have the default /products.json endpoint enabled. For stores requiring more complex interaction (e.g., JavaScript rendering, login, or disabled JSON endpoints), a browser-based scraping solution (e.g., using Puppeteer or Playwright through Apify's actors) would be more appropriate.
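
The URL normalization and pagination steps from "How it Works" might look roughly like the sketch below. It is not the actor's actual main.py: the names `normalize_base_url`, `iter_products`, and the injected `fetch_json` callable are hypothetical, and the sketch assumes each response is a JSON object with a `products` list.

```python
from typing import Any, Callable, Dict, Iterator
from urllib.parse import urlsplit


def normalize_base_url(start_url: str) -> str:
    """Reduce a store URL to scheme://host, defaulting to https (hypothetical helper)."""
    candidate = start_url.strip()
    if "://" not in candidate:
        candidate = "https://" + candidate
    parts = urlsplit(candidate)
    # Drop any path, query string, or fragment the user pasted along with the host.
    return f"{parts.scheme}://{parts.netloc}"


def iter_products(
    base_url: str,
    fetch_json: Callable[[str], Dict[str, Any]],
    limit: int = 250,
    max_pages: int = 1000,
) -> Iterator[Dict[str, Any]]:
    """Yield products page by page until a page comes back empty.

    `fetch_json` is an assumed callable that takes a URL and returns the
    decoded JSON body; `max_pages` is a safety cap, not a Shopify setting.
    """
    for page in range(1, max_pages + 1):
        url = f"{base_url}/products.json?limit={limit}&page={page}"
        products = fetch_json(url).get("products", [])
        if not products:
            break  # an empty page means there is nothing left to read
        yield from products
```

Passing the HTTP call in as `fetch_json` keeps the loop independent of any particular client (httpx, for instance) and makes it easy to exercise with canned responses.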
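
As an illustration of the data extraction step, the following sketch maps one raw product to a subset of the fields listed under "Output". The snake_case source keys (`product_type`, `body_html`, `compare_at_price`, `variant_ids`, and so on) are assumptions about the products.json payload, and `to_output_item` is a hypothetical name; the actor's own mapping may differ.

```python
from typing import Any, Dict, List, Optional


def _split_tags(tags: Any) -> List[str]:
    """Accept tags as a list or as a comma-separated string (both are assumed possible)."""
    if isinstance(tags, str):
        return [t.strip() for t in tags.split(",") if t.strip()]
    return list(tags or [])


def to_output_item(raw: Dict[str, Any], base_url: str) -> Dict[str, Any]:
    """Map a raw product dict to the camelCase output shape (hypothetical helper)."""
    handle: Optional[str] = raw.get("handle")
    return {
        "productId": raw.get("id"),
        "title": raw.get("title"),
        "handle": handle,
        "productUrl": f"{base_url}/products/{handle}" if handle else None,
        "vendor": raw.get("vendor"),
        "productType": raw.get("product_type"),
        "createdAt": raw.get("created_at"),
        "updatedAt": raw.get("updated_at"),
        "publishedAt": raw.get("published_at"),
        "tags": _split_tags(raw.get("tags")),
        "descriptionHtml": raw.get("body_html"),
        "options": raw.get("options", []),
        "variants": [
            {
                "variantId": v.get("id"),
                "title": v.get("title"),
                "price": v.get("price"),
                "sku": v.get("sku"),
                "available": v.get("available"),
                "compareAtPrice": v.get("compare_at_price"),
            }
            for v in raw.get("variants", [])
        ],
        "images": [
            {
                "imageId": img.get("id"),
                "src": img.get("src"),
                "alt": img.get("alt"),
                "position": img.get("position"),
                "variantIds": img.get("variant_ids", []),
            }
            for img in raw.get("images", [])
        ],
    }
```

Using `.get` throughout means a field missing from a given store's payload becomes `None` or an empty list rather than raising a `KeyError`.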
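
The rate limiting note above says the current script only logs HTTP 429 responses. If you extend it, one common approach is exponential backoff, sketched below. This is not part of the actor as described: `fetch_with_backoff`, the injected `send` callable (assumed to return a `(status_code, payload)` tuple), and the delay values are all illustrative assumptions.

```python
import time
from typing import Any, Callable, Tuple


def fetch_with_backoff(
    url: str,
    send: Callable[[str], Tuple[int, Any]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Retry a request on HTTP 429, doubling the wait after each attempt.

    `send` is an assumed callable that performs the request and returns
    (status_code, decoded_payload). `sleep` is injectable so the wait can
    be replaced in tests.
    """
    for attempt in range(max_attempts):
        status, payload = send(url)
        if status == 200:
            return payload
        if status != 429:
            raise RuntimeError(f"Unexpected HTTP status {status} for {url}")
        if attempt < max_attempts - 1:
            # Waits of base_delay, 2*base_delay, 4*base_delay, ... between tries.
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"Still rate limited after {max_attempts} attempts: {url}")
```

A production version would probably also honor a `Retry-After` header when the server sends one; that is left out here to keep the sketch short.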