Shopify Products Scraper
Pricing
$19.99/month + usage
Shopify Products Scraper extracts complete product data from any Shopify store (titles, prices, variants, SKUs, inventory, images, collections, tags, and descriptions) at scale. Export to JSON or CSV. Ideal for market research, competitor analysis, feeds, and catalog builds.
Developer
ScrapeFlow
Shopify Products Scraper
Shopify Products Scraper is a fast, reliable Shopify product scraping tool that discovers product pages and extracts structured product data at scale. It automatically finds product URLs and fetches the complete Shopify product JSON for each item, which makes it useful for marketers, developers, data analysts, and researchers. You can export the scraped products to CSV or JSON for analysis, feeds, and catalog builds, and plug the actor into automated workflows.
What data / output can you get?
Below are the exact fields saved to the Apify Dataset for each product. You can export results as JSON or CSV from the Apify Console.
| Data type | Description | Example value |
|---|---|---|
| store_url | The base Shopify store URL for the product | https://lootcrate.com |
| product_url | The product page URL | https://lootcrate.com/products/loot-crate |
| json_url | The product's Shopify JSON endpoint | https://lootcrate.com/products/loot-crate.json |
| product_id | Product ID from Shopify JSON | 5083963261059 |
| title | Product title | Loot Crate |
| vendor | Product vendor | Loot Crate Core |
| product_type | Product type/category | Subscription Box |
| price | Price from the first variant (if any) | "29.99" |
| compare_at_price | Compare-at price from the first variant (if any) | "34.99" |
| tags | Comma-separated tags | Subscription, Collectibles, Pop Culture |
| total_found | Total products discovered for the store in this run | 125 |
| successful | Running count of successfully extracted products for the store | 120 |
| full_data | Full Shopify product JSON payload for maximum completeness | {"product": {"id": 5083963261059, "handle": "loot-crate", "variants": [...], "images": [...]}} |
Notes:
- The full_data field preserves the complete Shopify product JSON (including variants, images, tags, timestamps, etc.), so image URLs and variant details remain available for downstream processing.
- Export data as JSON or CSV from the Dataset UI.
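Because only the first variant's price is surfaced in the flat fields, downstream code will often need to reach into full_data. The following is a minimal sketch of how that could look, assuming the item shape shown in the table above; the helper names `variant_rows` and `image_urls` are illustrative and not part of the actor.

```python
from typing import Any


def _product(item: dict[str, Any]) -> dict[str, Any]:
    """Return the nested product payload, or an empty dict if it is missing."""
    return (item.get("full_data") or {}).get("product") or {}


def variant_rows(item: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one flat row per variant found in an item's full_data payload."""
    rows = []
    for variant in _product(item).get("variants") or []:
        rows.append({
            "product_id": item.get("product_id"),
            "variant_id": variant.get("id"),
            "sku": variant.get("sku"),
            "price": variant.get("price"),
        })
    return rows


def image_urls(item: dict[str, Any]) -> list[str]:
    """Return the image source URLs listed in full_data, skipping empty ones."""
    return [img["src"] for img in _product(item).get("images") or [] if img.get("src")]


# Sample item trimmed from the example values in this README.
sample_item = {
    "product_id": 5083963261059,
    "full_data": {"product": {
        "id": 5083963261059,
        "variants": [{"id": 34197535719555, "price": "29.99", "sku": "1010126US"}],
        "images": [{"id": 123456789, "src": "https://cdn.shopify.com/..."}],
    }},
}

print(variant_rows(sample_item))
print(image_urls(sample_item))
```

Both helpers return an empty list when full_data is absent, so they can be applied to every Dataset item without extra checks.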
Key features
- Automatic product discovery: Scans store HTML to find all links containing "/products/", handles absolute and relative URLs, and deduplicates product links automatically.
- Shopify JSON API pagination: When Shopify is detected, the actor uses /products.json pagination to enumerate all products in the store efficiently and builds product URLs from handles.
- Intelligent proxy fallback: Starts with no proxy for speed. If blocked (403/429), automatically falls back to datacenter and then residential proxies with retries. Sticky residential mode keeps runs stable on tougher stores.
- High-throughput concurrency: Asynchronous fetching of product .json endpoints with coordinated proxy escalation across tasks for reliable, fast throughput.
- Live saving & resilience: Pushes each product to the Apify Dataset in real time to prevent data loss. Error handling and retry logic keep your runs on track.
- Developer-friendly outputs: Simple, flat fields (store_url, product_url, json_url, product_id, …) plus full_data for maximum flexibility in pipelines, BI tools, and CSV exports.
- Production-ready reliability: Detailed logging, clear progress tracking, and proxy management deliver consistent results without relying on browser extensions.
How to use Shopify Products Scraper - step by step
- Create or log in to your Apify account and open Apify Console.
- Find the "shopify-products-scraper" actor and open it.
- Enter inputs:
  - Add one or more Shopify store URLs in startUrls (e.g., https://lootcrate.com).
  - Optionally configure proxyConfiguration (default is no proxy; automatic fallback will kick in only if needed).
- Start the run by clicking Start.
- Monitor progress in the logs:
  - Product discovery status
  - Proxy fallback events (no proxy → datacenter → residential as needed)
  - Success counters per store
- Access results:
  - Go to the Dataset tab to view items as a table and export to JSON or CSV.
  - A grouped summary is also saved to the Key-Value Store under key OUTPUT.
- Download your data or connect via the Apify API for automation.
Pro Tip: Trigger runs via the Apify API and pipe the Dataset results into your data warehouse or product feed workflows for a hands-off Shopify product scraper pipeline.
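As a rough illustration of the API route, the sketch below builds a request for Apify's synchronous "run actor and return dataset items" endpoint using only the Python standard library. The actor ID `someuser~shopify-products-scraper` and the token value are placeholders (the real ID depends on the publisher's username), and long runs may be better served by the asynchronous run endpoints.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, store_urls: list[str]) -> urllib.request.Request:
    """Build a POST request that starts the actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items?{query}"
    # Input mirrors the example JSON input documented for this actor.
    payload = {
        "startUrls": store_urls,
        "proxyConfiguration": {"useApifyProxy": False},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch(request: urllib.request.Request, timeout: float = 300.0) -> list[dict]:
    """Send the request and decode the JSON array of dataset items."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Placeholder actor ID and token; replace both before sending the request.
request = build_run_request(
    "someuser~shopify-products-scraper", "YOUR_APIFY_TOKEN", ["https://lootcrate.com"]
)
print(request.get_method(), request.full_url)
# items = run_and_fetch(request)  # performs the network call
```

Keeping request construction separate from sending makes the input easy to inspect or log before a run is started.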
Use cases
| Use case name | Description |
|---|---|
| E-commerce intelligence & competitor tracking | Monitor catalogs, titles, tags, and product types across stores to benchmark positioning and assortment. |
| Price and promotion monitoring | Track price and compare_at_price over time to analyze discount strategies and margins. |
| Feed & catalog building | Build structured product feeds for marketplaces, ads, and comparison engines with full_data for flexible mapping. |
| Market research & trend analysis | Aggregate product metadata (tags, vendors, types) for cross-store insights and trend detection. |
| Storewide product crawl | Scrape all products from a Shopify store using /products.json pagination when available and product URL discovery. |
| Developer pipelines & APIs | Automate Shopify product data extraction and export via the Apify Dataset API for ETL and analytics jobs. |
| Image asset workflows | Use image URLs from full_data to enrich media pipelines and CDN asset checks. |
Why choose Shopify Products Scraper?
Delivering precision, automation, and reliability for real-world Shopify product data scraping workflows.
- Complete data fidelity: Keeps the full Shopify product JSON in full_data for maximum downstream flexibility.
- Built for scale: Async, concurrent requests and automatic discovery make it ideal for bulk store runs.
- Developer access: Clean flat fields plus full_data are easy to consume via the Apify Dataset API.
- No browser extensions: More stable than extension-based scrapers and less overhead than heavy headless browsers.
- Proxy-aware resilience: Automatic fallback from no proxy to datacenter and residential ensures continuity on blocked stores.
- Real-time saving: Each product is written to the Dataset as soon as it's fetched to prevent data loss.
- Clear observability: Detailed logs and per-store summaries reduce troubleshooting time.
In short, it's a reliable Shopify products scraper that outperforms unstable alternatives and delivers production-ready data for teams at any scale.
Is it legal / ethical to use Shopify Products Scraper?
Yes, when used responsibly. This actor fetches publicly available data from Shopify stores and does not access private or authenticated areas.
Guidelines for responsible use:
- Only collect public product data and respect each website's terms.
- Be mindful of rate limits; the actor already includes retry and proxy logic to reduce strain.
- Comply with data protection laws such as GDPR and CCPA where applicable.
- Do not use the tool for harassment, unauthorized collection of private information, or any unlawful activities.
- Consult your legal team for edge cases or jurisdiction-specific requirements.
Input parameters & output format
Example JSON input
{"startUrls": ["https://lootcrate.com","https://www.decathlon.com"],"proxyConfiguration": {"useApifyProxy": false}}
Input parameters
- startUrls (array, required)
- Description: List one or more Shopify store URLs (e.g., https://lootcrate.com, https://www.decathlon.com). Supports bulk input.
- Default: ["https://lootcrate.com"] (prefill)
- proxyConfiguration (object, optional)
- Description: Choose which proxies to use. By default, no proxy is used. If the store rejects or blocks the request, the actor automatically falls back to a datacenter proxy, then to a residential proxy with 3 retries.
- Default: {"useApifyProxy": false} (prefill)
Example JSON output (Dataset item)
{"store_url": "https://lootcrate.com","product_url": "https://lootcrate.com/products/loot-crate","json_url": "https://lootcrate.com/products/loot-crate.json","product_id": 5083963261059,"title": "Loot Crate","vendor": "Loot Crate Core","product_type": "Subscription Box","price": "29.99","compare_at_price": "34.99","tags": "Subscription, Collectibles, Pop Culture","total_found": 125,"successful": 120,"full_data": {"product": {"id": 5083963261059,"handle": "loot-crate","title": "Loot Crate","vendor": "Loot Crate Core","product_type": "Subscription Box","tags": "Subscription, Collectibles, Pop Culture","variants": [{"id": 34197535719555,"price": "29.99","compare_at_price": "34.99","sku": "1010126US","inventory_management": "shopify","requires_shipping": true}],"images": [{"id": 123456789,"src": "https://cdn.shopify.com/...","width": 2000,"height": 2000,"alt": "Product image"}]}}}
Additional output (Key-Value Store summary)
- The actor also writes a grouped summary to the default Key-Value Store under the key OUTPUT. Structure example:
{"https://lootcrate.com": {"method": "shopify_api","total_found": 125,"successful": 120,"products": [{"url": "https://lootcrate.com/products/loot-crate","json": {"product": {"id": 5083963261059,"handle": "loot-crate","title": "Loot Crate"}}}]}}
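To check run quality across stores, the OUTPUT summary can be reduced to a success rate per store. This is a small sketch assuming the structure shown above; the `summarize` helper and the `success_rate` field are illustrative additions, not something the actor produces.

```python
from typing import Any


def summarize(output: dict[str, dict[str, Any]]) -> list[dict[str, Any]]:
    """Turn the grouped OUTPUT object into one summary row per store."""
    rows = []
    for store_url, stats in output.items():
        total = stats.get("total_found") or 0
        successful = stats.get("successful") or 0
        rows.append({
            "store_url": store_url,
            "method": stats.get("method"),
            "total_found": total,
            "successful": successful,
            # Avoid dividing by zero when discovery found nothing.
            "success_rate": round(successful / total, 3) if total else None,
        })
    return rows


# Sample summary trimmed from the structure example in this README.
sample_output = {
    "https://lootcrate.com": {
        "method": "shopify_api",
        "total_found": 125,
        "successful": 120,
        "products": [],
    }
}

for row in summarize(sample_output):
    print(row)
```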
Notes:
- compare_at_price and price come from the first variant when available; some stores may return null for these values.
- full_data preserves the entire Shopify product payload for each item.
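Since price and compare_at_price arrive as strings and may be null, any discount calculation needs to guard against missing values. The sketch below shows one way to do that with `decimal.Decimal`; the `discount_percent` helper is hypothetical and the rounding to two decimals is an arbitrary choice.

```python
from decimal import Decimal, InvalidOperation
from typing import Any, Optional


def discount_percent(item: dict[str, Any]) -> Optional[Decimal]:
    """Return the discount versus compare_at_price in percent, or None if unknown."""
    try:
        price = Decimal(str(item.get("price")))
        compare_at = Decimal(str(item.get("compare_at_price")))
    except InvalidOperation:
        # Covers null (None) and non-numeric strings.
        return None
    if not (price.is_finite() and compare_at.is_finite()):
        return None
    if compare_at <= 0 or price >= compare_at:
        # No meaningful discount to report.
        return None
    return ((compare_at - price) / compare_at * 100).quantize(Decimal("0.01"))


print(discount_percent({"price": "29.99", "compare_at_price": "34.99"}))
print(discount_percent({"price": "29.99", "compare_at_price": None}))
```

Using Decimal rather than float keeps the string prices exact, which matters when many small discounts are aggregated.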
FAQ
How does the scraper discover product pages?
It scans the store's HTML for links containing "/products/" and builds a deduplicated list of product URLs. If the site is detected as Shopify, it also uses /products.json pagination to enumerate products via handles.
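The link-discovery step can be approximated with the standard library alone. The sketch below is not the actor's actual code: it collects anchor hrefs, resolves relative links against the store URL, keeps those whose path contains "/products/", and deduplicates them. Dropping query strings and fragments before deduplication is an assumption made here for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit, urlunsplit


class _LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def discover_product_urls(store_url: str, html: str) -> list[str]:
    """Return deduplicated absolute product URLs in first-seen order."""
    parser = _LinkCollector()
    parser.feed(html)
    seen: dict[str, None] = {}
    for href in parser.hrefs:
        parts = urlsplit(urljoin(store_url, href))  # resolves relative links
        if "/products/" not in parts.path:
            continue
        # Assumption: query and fragment are ignored so link variants collapse.
        clean = urlunsplit((parts.scheme, parts.netloc, parts.path.rstrip("/"), "", ""))
        seen.setdefault(clean, None)
    return list(seen)


# Hypothetical page fragment; "loot-wear" is an invented handle.
sample_html = (
    '<a href="/products/loot-crate">A</a>'
    '<a href="https://lootcrate.com/products/loot-crate?variant=1">B</a>'
    '<a href="/pages/about">C</a>'
    '<a href="/collections/all/products/loot-wear">D</a>'
)
print(discover_product_urls("https://lootcrate.com", sample_html))
```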
Can it scrape all products from a Shopify store?
Yes. When Shopify is detected, the actor paginates through /products.json to collect product handles and builds product URLs, enabling storewide coverage where the endpoint is accessible.
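A minimal sketch of that pagination loop follows. It assumes the storefront endpoint accepts `limit` and `page` query parameters and returns a `{"products": [...]}` object, which is common for Shopify stores but not guaranteed for every one. The fetch function is injected so the example runs without network access, and `FAKE_PAGES` with its handles is invented test data.

```python
from typing import Callable, Iterator


def iter_product_handles(
    store_url: str,
    fetch_json: Callable[[str], dict],
    page_size: int = 250,
    max_pages: int = 1000,
) -> Iterator[str]:
    """Yield product handles page by page until an empty page is returned."""
    base = store_url.rstrip("/")
    for page in range(1, max_pages + 1):
        data = fetch_json(f"{base}/products.json?limit={page_size}&page={page}")
        products = data.get("products") or []
        if not products:
            break
        for product in products:
            handle = product.get("handle")
            if handle:
                yield handle


def product_urls(store_url: str, handles: list[str]) -> list[tuple[str, str]]:
    """Build (product_url, json_url) pairs from handles."""
    base = store_url.rstrip("/")
    return [(f"{base}/products/{h}", f"{base}/products/{h}.json") for h in handles]


# Invented responses standing in for real HTTP calls.
FAKE_PAGES = {1: {"products": [{"handle": "loot-crate"}, {"handle": "loot-wear"}]}}


def fake_fetch_json(url: str) -> dict:
    page = int(url.rsplit("page=", 1)[1])
    return FAKE_PAGES.get(page, {"products": []})


handles = list(iter_product_handles("https://lootcrate.com/", fake_fetch_json))
print(product_urls("https://lootcrate.com/", handles))
```

The `max_pages` cap is a safety guard against a store that never returns an empty page; it is a design choice of this sketch rather than documented actor behavior.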
What happens if a store blocks requests?
The actor starts with no proxy for speed. If blocked (403/429), it automatically falls back to a datacenter proxy, and then to a residential proxy with retries. Once residential is active, it remains sticky for the rest of the run.
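The escalation described here can be modeled as a small state machine. The sketch below is an interpretation, not the actor's implementation: the `ProxyEscalator` class, the tier names, and the `send` callback are all hypothetical, and it simplifies by making every escalation sticky rather than only the residential tier.

```python
from typing import Callable

PROXY_TIERS = ["none", "datacenter", "residential"]
BLOCKED_STATUSES = {403, 429}


class ProxyEscalator:
    """Track the current proxy tier and escalate when requests are blocked."""

    def __init__(self, residential_retries: int = 3) -> None:
        self.tier_index = 0
        self.residential_retries = residential_retries

    @property
    def tier(self) -> str:
        return PROXY_TIERS[self.tier_index]

    def fetch(self, url: str, send: Callable[[str, str], tuple[int, str]]) -> tuple[int, str]:
        """Call send(url, tier) until it is not blocked or options run out."""
        retries_left = self.residential_retries
        while True:
            status, body = send(url, self.tier)
            if status not in BLOCKED_STATUSES:
                return status, body
            if self.tier_index < len(PROXY_TIERS) - 1:
                self.tier_index += 1  # escalate; the tier never moves back down
            elif retries_left > 0:
                retries_left -= 1  # already on residential: retry
            else:
                return status, body  # give up and report the blocked response


# Fake transport: only the residential tier gets through.
attempts: list[str] = []


def fake_send(url: str, tier: str) -> tuple[int, str]:
    attempts.append(tier)
    return (200, "ok") if tier == "residential" else (403, "")


escalator = ProxyEscalator()
print(escalator.fetch("https://lootcrate.com/products/loot-crate.json", fake_send))
print(attempts)
```

Sharing one escalator instance across concurrent tasks would give the coordinated behavior the answer describes, since later requests start from whichever tier last worked.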
Can I scrape multiple stores in one run?
Yes. Add multiple entries to startUrls. The actor processes each store and writes per-product items to the Dataset, with a grouped summary saved to the Key-Value Store under OUTPUT.
What data is included in the output?
Each Dataset item includes store_url, product_url, json_url, product_id, title, vendor, product_type, price, compare_at_price, tags, total_found, successful, and full_data (the complete Shopify product JSON). This makes it suitable for Shopify product data scraping and building rich product catalogs.
Do I need a login or a Chrome extension?
No. This is an Apify actor, so no Shopify login or browser extension is required. It operates as a server-side Shopify product scraper with API-accessible results.
How do I export to CSV or JSON?
Open the run's Dataset in Apify Console and choose your preferred format (JSON or CSV). This is ideal when you want to scrape Shopify products to CSV for BI tools or feed ingestion.
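If you have already downloaded the items as JSON and want a custom CSV, for example one without the nested full_data column, a short script is enough. This is a sketch using Python's `csv` module; the `items_to_csv` helper and the choice of columns are illustrative.

```python
import csv
import io
from typing import Any

# Flat fields documented for each Dataset item, omitting full_data.
FLAT_FIELDS = [
    "store_url", "product_url", "json_url", "product_id", "title", "vendor",
    "product_type", "price", "compare_at_price", "tags", "total_found", "successful",
]


def items_to_csv(items: list[dict[str, Any]]) -> str:
    """Serialize the flat fields of each item to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FLAT_FIELDS, extrasaction="ignore")
    writer.writeheader()
    for item in items:
        # Missing keys become empty cells; None is also written as empty.
        writer.writerow({field: item.get(field, "") for field in FLAT_FIELDS})
    return buffer.getvalue()


# Sample item using the example values from this README.
sample_items = [{
    "store_url": "https://lootcrate.com",
    "product_url": "https://lootcrate.com/products/loot-crate",
    "json_url": "https://lootcrate.com/products/loot-crate.json",
    "product_id": 5083963261059,
    "title": "Loot Crate",
    "vendor": "Loot Crate Core",
    "product_type": "Subscription Box",
    "price": "29.99",
    "compare_at_price": "34.99",
    "tags": "Subscription, Collectibles, Pop Culture",
    "total_found": 125,
    "successful": 120,
    "full_data": {"product": {"id": 5083963261059}},
}]

print(items_to_csv(sample_items))
```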
Can I limit the number of products scraped?
There is no user-facing limit parameter in the input. By default, the actor processes all products discovered via /products.json and/or link discovery.
Where are results saved?
Per-product records are stored in the Dataset for export. A store-level summary object is also saved in the Key-Value Store under the key OUTPUT.
Does it capture images and variants?
Yes. The full_data field contains the original Shopify product JSON, which includes variants, images, tags, and other metadata as provided by the store.
Closing CTA / Final thoughts
Shopify Products Scraper is built for fast, reliable, and structured Shopify product data extraction at scale. With automatic product discovery, resilient proxy fallback, and clean per-product outputs plus full_data, it's ideal for marketers, developers, analysts, and researchers.
Run it in Apify to scrape all products from Shopify stores, export to CSV/JSON, and plug the Dataset into your analytics or feeds. Developers can automate via the Apify API for end-to-end pipelines. Start extracting smarter Shopify product data today and turn store catalogs into actionable insights.