Shopify Product Scraper - Prices, Variants & Stock
Pricing
from $4.00 / 1,000 products scraped
Shopify product scraper for public stores: export products, prices, variants, SKUs, images, collections, stock signals, discounts, and price-drop alerts. Built for catalog exports, competitor tracking, and scheduled storefront monitoring.
Developer
Nick
Maintained by Community
Use this Shopify product scraper to export public Shopify store catalogs with products, prices, variants, SKUs, images, collections, stock signals, discounts, and price-drop alerts. It is built for clean product datasets first, then scheduled competitor monitoring once your export shape is proven.
Best first run
{"storeUrls": ["https://allbirds.com"],"maxProducts": 50,"includeVariants": true,"includeImages": true}
Use this actor when you need a Shopify product scraper, Shopify price scraper, Shopify variant exporter, or Shopify price drop monitor. Start with one store, then schedule watch mode once the output matches your workflow. Product exports stay clean in the default dataset; collections, alert rollups, AI analysis, and diagnostics are separated into their own datasets.
What you get back
- A clean product dataset with normalized url, title, vendor, product_type, currency, price ranges, variants, SKUs, image URLs, tags, availability, and discount fields.
- Separate collections, alerts, analysis, and diagnostics datasets so catalog exports are not mixed with rollups or operational records.
- Good starter run: one store, 50 products, variants and images on, collections off. Add collection graphs, watch mode, webhooks, or AI after the base product export looks right.
Scrape product catalogs from public Shopify stores and export titles, prices, variants, SKUs, stock signals, images, discounts, and collection graphs in seconds. Optional price-watch mode tracks price changes across runs and fires Slack/Zapier webhook alerts when products drop. AI-powered catalog analysis delivers competitive positioning and pricing strategy insights on demand.
Shopify Product Scraper
This actor pulls product catalog data from public Shopify-powered stores and returns structured product rows with full variant, price, image, collection, and availability fields. No Shopify admin access or Shopify API key is required for public storefront data.
Each product comes back with full variant data including SKUs, prices, compare-at prices, inventory quantities, option values (size, color, material), and a derived discount_percent per variant. Prices are normalized from Shopify's string format to floats. The actor resolves each store's ISO currency code from /meta.json, the X-Shopify-Shop-Currency response header, or ccTLD inference, and emits it as a top-level currency field on every product.
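The price normalization and per-variant discount described above can be sketched as follows. This is a minimal illustration, not the actor's actual code: the helper names to_float and discount_percent are hypothetical, and rounding to two decimals is an assumption inferred from the sample output.

```python
from typing import Optional


def to_float(price: Optional[str]) -> Optional[float]:
    """Convert a Shopify price string such as "599.95" to a float; None or "" stays None."""
    if price is None or price == "":
        return None
    return float(price)


def discount_percent(price: Optional[float], compare_at: Optional[float]) -> Optional[float]:
    """Percent saved relative to the compare-at price, or None when the variant is not on sale."""
    if price is None or compare_at is None or compare_at <= 0 or price >= compare_at:
        return None
    # Assumption: two-decimal rounding, as in the sample product item.
    return round((compare_at - price) / compare_at * 100, 2)


# Values taken from the sample variant in the Output section.
variant = {"price": "599.95", "compare_at_price": "699.95"}
pct = discount_percent(to_float(variant["price"]), to_float(variant["compare_at_price"]))
print(pct)  # 14.29
```

A variant with no compare-at price, or one priced at or above it, yields None here rather than 0; whether the actor emits null or 0 in that case is not stated in this README.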
Shopify Price Drop Monitor
Price-watch mode (watchMode: true) persists each product's price history across runs in a named Apify key-value store (60-snapshot FIFO, approximately 2 months at daily cadence). On every subsequent run the actor computes price_delta_pct, direction (up/down/flat, or new the first time a product is seen), and first_seen_at for every product. Add alertWebhookUrl and you get automatic fire-and-forget JSON POSTs to Slack, Zapier, n8n, or Make the moment a product drops past your alertMinDropPct threshold. This replaces a custom backend that would otherwise require a database, a cron job, and a webhook server.
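As a rough sketch of the cross-run comparison described above, the snippet below keeps a 60-entry FIFO per product and derives the delta and direction. It is an illustration under assumptions, not the actor's implementation: the function names are hypothetical, an in-memory deque stands in for the key-value store, and treating any negative delta as price_drop is a guess.

```python
from collections import deque

MAX_SNAPSHOTS = 60  # FIFO length stated in the text above


def price_watch(history: deque, price_now: float) -> dict:
    """Compare price_now with the most recent snapshot, then record it in the history."""
    previous = history[-1] if history else None
    history.append(price_now)  # a deque with maxlen evicts the oldest snapshot automatically
    if not previous:
        # First sighting: no baseline yet, so no delta can be computed.
        return {"price_previous": None, "price_now": price_now,
                "price_delta_pct": None, "direction": "new", "price_drop": False}
    delta_pct = round((price_now - previous) / previous * 100, 2)
    direction = "down" if delta_pct < 0 else "up" if delta_pct > 0 else "flat"
    return {"price_previous": previous, "price_now": price_now,
            "price_delta_pct": delta_pct, "direction": direction,
            "price_drop": delta_pct < 0}  # assumption: any decrease counts as a drop


def should_alert(watch: dict, alert_min_drop_pct: float) -> bool:
    """True when the drop meets or exceeds the alertMinDropPct threshold."""
    delta = watch["price_delta_pct"]
    return delta is not None and -delta >= alert_min_drop_pct


history = deque(maxlen=MAX_SNAPSHOTS)
first = price_watch(history, 49.00)   # baseline run
second = price_watch(history, 29.00)  # next run
print(first["direction"], second["price_delta_pct"], should_alert(second, 10))
# new -40.82 True
```

Using a bounded deque keeps storage constant per product, which matches the "approximately 2 months at daily cadence" retention described above.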
Collection scraping walks each store's /collections.json to enumerate categories and build a product-to-collection membership graph, complete with per-collection discount heatmaps. Discounted-only filtering lets you pull only sale items for arbitrage and reseller research. AI catalog analysis - via OpenRouter, Anthropic, Google AI, OpenAI, or Ollama - generates a structured report on pricing strategy, product gaps, and competitive positioning.
The actor supports multi-store runs, processing each store sequentially with polite 1-second delays between requests and exponential backoff on rate limits.
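The retry behavior mentioned above can be pictured with a small stdlib-only sketch. The retry count, the doubling schedule, and the function name fetch_with_backoff are assumptions for illustration; the status codes come from the Rate Limiting note later in this README, and the fetch callable is injected so the example runs without network access.

```python
import time
from typing import Callable, Tuple

RETRY_STATUSES = {429, 403}  # rate-limit responses named in the Rate Limiting note


def fetch_with_backoff(
    fetch: Callable[[str], Tuple[int, str]],
    url: str,
    max_retries: int = 4,        # assumption: the README does not state a retry limit
    base_delay: float = 1.0,     # the polite 1-second delay doubles on each retry
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call fetch(url); while the status is a rate-limit code, wait and retry."""
    status, body = fetch(url)
    attempt = 0
    while status in RETRY_STATUSES and attempt < max_retries:
        sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, 8s
        attempt += 1
        status, body = fetch(url)
    return status, body


# Demo with a fake fetcher: two rate-limited responses, then success.
responses = iter([(429, ""), (429, ""), (200, "{}")])
delays = []
result = fetch_with_backoff(
    lambda url: next(responses),
    "https://shop.example.com/collections.json",
    sleep=delays.append,  # record the waits instead of actually sleeping
)
print(result, delays)  # (200, '{}') [1.0, 2.0]
```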
Features
- Public Shopify catalog data - exports product and collection data from public storefronts. No browser, no theme scraping, no fragile selectors that break on theme updates.
- Full product data - Titles, handles, descriptions (HTML stripped to plain text), vendors, product types, tags, creation and update dates.
- Variant extraction - Prices (as floats), compare-at prices, SKUs, inventory quantities, weight, option values (size, color, material), plus a derived discount_percent per variant.
- Image collection - All product images with dimensions and alt text, plus featured image identification.
- Derived analytics - Price ranges (min/max), average price, on-sale detection, variant counts, has_variants flag, available_variant_count (in-stock variants), min_weight_g/max_weight_g normalized to grams, max_discount_percent, avg_discount_percent, compare_at_min_price, compare_at_max_price.
- Storefront currency detection - Resolves each store's ISO 4217 currency code from /meta.json, the X-Shopify-Shop-Currency response header, or ccTLD fallback (28 country codes mapped).
- Collections scraping - Optional /collections.json scrape. Each collection (id, title, handle, description, products_count, image, URL) is emitted as a dataset item with type="collection", ideal for category-navigation analysis.
- Discounted-only filter - Set discountedOnly: true to receive only products with at least one variant on sale. Built for arbitrage, reseller research, and sale-intensity dashboards. Skipped products are not charged.
- Product-to-collection graph + discount heatmap - Set fetchCollectionProducts: true to walk collection product pages and build a membership graph. Every product gains a collection_memberships array and collection_count. Every collection gains a discount_stats block (product_count, on_sale_pct, avg/max discount percent, price range). No other Apify Shopify scraper exposes this graph.
- Price-watch mode + drop-alert webhooks - Set watchMode: true to persist each product's min/max/avg price across runs. Every product receives a price_watch block with {price_previous, price_now, price_delta_pct, direction, price_drop, first_seen_at, history_sample_count}. Add alertWebhookUrl to POST JSON to Slack/Zapier/n8n/Make when a product drops by >= alertMinDropPct. No other Apify Shopify scraper ships cross-run price history plus webhooks.
- Multi-store support - Scrape multiple Shopify stores in a single run.
- AI catalog analysis - LLM-powered analysis of pricing strategy, product gaps, competitive positioning, and actionable recommendations. When graph mode is on, adds collection_discount_heatmap and merchandising_signals.
- Multi-LLM support - Choose between OpenRouter (recommended - 300+ models, cheapest), Anthropic (Claude), Google AI (Gemini), OpenAI (GPT), or Ollama (self-hosted).
- Pay-per-event pricing - Only pay for products actually scraped. Webhook failures are not charged.
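To make the per-collection discount_stats block from the feature list more concrete, here is one way such a rollup could be computed from product rows. Only product_count and on_sale_pct are field names given in this README; the remaining keys, the rounding, and the choice to average each product's max_discount_percent are assumptions.

```python
from typing import Optional


def discount_stats(products: list[dict]) -> dict:
    """Aggregate product rows belonging to one collection into a discount summary."""
    count = len(products)
    on_sale = [p for p in products if p.get("on_sale")]
    # Assumption: the heatmap uses each product's max_discount_percent.
    discounts = [p["max_discount_percent"] for p in on_sale
                 if p.get("max_discount_percent") is not None]
    prices_min = [p["min_price"] for p in products if p.get("min_price") is not None]
    prices_max = [p["max_price"] for p in products if p.get("max_price") is not None]

    def _avg(values: list[float]) -> Optional[float]:
        return round(sum(values) / len(values), 2) if values else None

    return {
        "product_count": count,
        "on_sale_pct": round(len(on_sale) / count * 100, 2) if count else 0.0,
        "avg_discount_percent": _avg(discounts),                       # hypothetical key name
        "max_discount_percent": max(discounts) if discounts else None,  # hypothetical key name
        "min_price": min(prices_min) if prices_min else None,           # hypothetical key name
        "max_price": max(prices_max) if prices_max else None,           # hypothetical key name
    }


rows = [
    {"min_price": 20.0, "max_price": 25.0, "on_sale": True, "max_discount_percent": 10.0},
    {"min_price": 40.0, "max_price": 60.0, "on_sale": True, "max_discount_percent": 30.0},
    {"min_price": 15.0, "max_price": 15.0, "on_sale": False, "max_discount_percent": None},
    {"min_price": 80.0, "max_price": 95.0, "on_sale": False, "max_discount_percent": None},
]
stats = discount_stats(rows)
print(stats["on_sale_pct"], stats["avg_discount_percent"], stats["max_discount_percent"])
# 50.0 20.0 30.0
```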
Use Cases
Dropshippers and Product Researchers
Discover winning products across Shopify stores in your niche. Compare pricing, identify trending items, and find suppliers by scanning vendor data across multiple stores in a single run. Export product catalogs to CSV for side-by-side comparison with your own inventory.
Competitive Price Monitoring with Alerts
Run the actor on a daily schedule with watchMode: true and alertWebhookUrl pointing to a Slack channel. The first run builds a baseline. From the second run onward, any product that drops by your configured percentage fires an instant Slack notification with the product title, old price, new price, and percentage change. No custom backend required.
Market Research and Trend Analysis
Build datasets of products across an entire market segment. Analyze pricing distributions, popular product types, common tags, and vendor landscapes. Use the AI-powered catalog analysis to identify underserved categories and pricing gaps without reviewing products manually.
Sale and Arbitrage Research
Enable discountedOnly: true to pull only products currently on sale. The actor emits full variant-level discount percentages and compare-at prices, making it easy to identify reseller arbitrage opportunities, clearance liquidation, and deep-discount patterns.
E-commerce Data Aggregation
Aggregate product data from multiple Shopify stores into a single dataset for marketplace platforms, comparison shopping engines, or product discovery tools. The structured JSON output integrates directly with databases, APIs, and data pipelines. Currency normalization means all prices are labeled with their ISO code regardless of the store's country.
Inventory and Availability Tracking
Monitor variant-level inventory data to detect stock levels, out-of-stock products, and restocking patterns. Useful for affiliate marketers who need to promote in-stock items and for resellers timing purchasing decisions based on inventory signals.
Collection Category Analysis
Enable fetchCollectionProducts: true to build the full product-to-collection membership graph. This exposes which collections are deep-discount clearance zones versus full-price premium ranges - merchandising intelligence that normally requires manual store browsing or a Shopify partner account.
Input
| Parameter | Type | Default | Description |
|---|---|---|---|
| storeUrls | array | required | List of Shopify store URLs to scrape (e.g. https://kith.com) |
| maxProducts | integer | 100 | Max products per store (1-500) |
| includeVariants | boolean | true | Include variant data (prices, SKUs, inventory, options) |
| includeImages | boolean | true | Include product image URLs with dimensions |
| includeCollections | boolean | false | Also scrape /collections.json for each store |
| maxCollections | integer | 100 | Max collections per store (1-250) |
| discountedOnly | boolean | false | Only emit products with at least one on-sale variant |
| fetchCollectionProducts | boolean | false | Build product-to-collection graph + per-collection discount heatmap |
| maxProductsPerCollection | integer | 250 | Max products walked per collection in graph mode (1-1000) |
| watchMode | boolean | false | Track price changes across runs using a named KV store |
| alertWebhookUrl | string | - | HTTPS webhook URL for price-drop alerts (auto-enables watchMode) |
| alertMinDropPct | integer | 5 | Minimum price drop % to trigger a webhook alert |
| enableAiAnalysis | boolean | false | Run AI-powered catalog analysis |
| llmProvider | string | openrouter | LLM provider: openrouter, anthropic, google, openai, ollama |
| llmModel | string | (auto) | Override the provider's default model |
| openrouterApiKey | string | - | OpenRouter API key (get one free at openrouter.ai/keys) |
| anthropicApiKey | string | - | Anthropic API key |
| googleApiKey | string | - | Google AI (Gemini) API key |
| openaiApiKey | string | - | OpenAI API key |
| ollamaBaseUrl | string | http://localhost:11434 | Ollama base URL for self-hosted LLM |
| proxyConfiguration | object | Apify Proxy | Proxy settings (datacenter is sufficient for most Shopify stores) |
CLI aliases accepted: shopUrl, storeUrl, url (single URL); shopUrls (array); maxItems (alias for maxProducts).
Output
The actor writes to multiple datasets so exports stay clean:
- Products Dataset - default buyer-facing product rows with variants, images, prices, stock signals, discounts, and optional price_watch blocks.
- Collections Dataset - collection rows from /collections.json, including optional discount_stats when graph mode is enabled.
- Alerts Dataset - price-drop webhook dispatch summaries from watch mode.
- AI Analysis Dataset - optional LLM catalog-analysis records.
- Diagnostics Dataset - no-result and operational diagnostic records.
Each product is output as a separate item in the Products Dataset.
Product item:
{"store_url": "https://hydrogen-preview.myshopify.com","product_id": 6730943955025,"title": "The Collection Snowboard: Hydrogen","handle": "the-collection-snowboard-hydrogen","vendor": "Hydrogen Vendor","product_type": "Snowboard","tags": ["Premium", "Snow", "Winter"],"description": "A high-performance snowboard designed for experienced riders.","created_at": "2023-01-01T00:00:00-05:00","updated_at": "2024-06-15T12:30:00-05:00","options": [{"name": "Size", "position": 1, "values": ["154cm", "158cm", "162cm"]}],"currency": "USD","min_price": 599.95,"max_price": 749.95,"average_price": 674.95,"on_sale": true,"max_discount_percent": 14.29,"avg_discount_percent": 14.29,"compare_at_min_price": 699.95,"compare_at_max_price": 749.95,"total_variants": 3,"has_variants": true,"available_variant_count": 2,"min_weight_g": 3200.0,"max_weight_g": 3500.0,"url": "https://hydrogen-preview.myshopify.com/products/the-collection-snowboard-hydrogen","scraped_at": "2026-04-22T10:30:00+00:00","variants": [{"id": 39888274382929,"title": "154cm","price": 599.95,"compare_at_price": 699.95,"discount_percent": 14.29,"sku": "SNOW-HYD-154","available": true,"inventory_quantity": 12,"weight": 3200.0,"weight_unit": "g","option1": "154cm","option2": null,"option3": null}],"images": [{"id": 28505021718609,"src": "https://cdn.shopify.com/s/files/1/0551/4566/0577/products/snowboard.jpg","width": 1200,"height": 800,"alt": "Hydrogen Snowboard"}],"featured_image": "https://cdn.shopify.com/s/files/1/0551/4566/0577/products/snowboard.jpg"}
Price-watch block (added to each product when watchMode: true):
{"price_watch": {"price_previous": 49.00,"price_now": 29.00,"price_delta_pct": -40.82,"direction": "down","price_drop": true,"first_seen_at": "2026-04-01T00:00:00+00:00","history_sample_count": 14}}
Collection item (when includeCollections: true, written to the Collections Dataset):
{"type": "collection","store_url": "https://hydrogen-preview.myshopify.com","collection_id": 387214442552,"title": "Backcountry Collection","handle": "backcountry","description": "Go further with our most technologically advanced boards.","products_count": 26,"url": "https://hydrogen-preview.myshopify.com/collections/backcountry","scraped_at": "2026-04-22T10:30:00+00:00"}
AI analysis item (when enableAiAnalysis: true, written to the AI Analysis Dataset):
{"type": "ai_analysis","store_url": "https://hydrogen-preview.myshopify.com","total_products_analyzed": 50,"analysis": {"store_overview": "A specialty winter sports retailer focused on premium snowboards.","pricing_strategy": "Premium pricing with selective discounting on older SKUs.","competitive_positioning": "Premium segment targeting experienced riders.","product_gaps": "No entry-level boards, limited accessories.","recommendations": ["Expand into beginner boards", "Add bundle pricing"]},"llm_provider": "openrouter","llm_model": "google/gemini-2.0-flash-001","analyzed_at": "2026-04-22T10:35:00+00:00"}
Pay-per-event pricing:
| Event | Price | Description |
|---|---|---|
| product-scraped | $0.004 | Per product successfully extracted |
| collection-scraped | $0.001 | Per collection extracted (when includeCollections=true) |
| price-drop-detected | $0.005 | Per product whose price fell >= alertMinDropPct (watchMode only; first-run products not charged) |
| alert-dispatched | $0.002 | Per price-drop webhook successfully delivered (2xx response) |
| ai-analysis-completed | $0.05 | Per store catalog analyzed by LLM |
vs. commercial alternatives: Shopify Partner API requires store-owner approval per store; DataForSEO e-commerce endpoints charge $1.50/1,000 requests; Prisync charges $99/mo for competitor price tracking. This actor uses pay-per-event with no subscription - $0.004/product and zero monthly fees.
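For budgeting, the event prices in the table above can be combined into a quick estimate. The function below is only a convenience sketch (estimate_cost is a hypothetical name, not part of the actor); it works in thousandths of a dollar to avoid floating-point drift and ignores any Apify platform costs not listed in the table.

```python
# Event prices from the pay-per-event table, in thousandths of a dollar.
PRICE_MILLIS = {
    "product-scraped": 4,
    "collection-scraped": 1,
    "price-drop-detected": 5,
    "alert-dispatched": 2,
    "ai-analysis-completed": 50,
}


def estimate_cost(products: int = 0, collections: int = 0, drops: int = 0,
                  alerts: int = 0, ai_analyses: int = 0) -> float:
    """Return the estimated event charges for one run, in dollars."""
    total_millis = (
        products * PRICE_MILLIS["product-scraped"]
        + collections * PRICE_MILLIS["collection-scraped"]
        + drops * PRICE_MILLIS["price-drop-detected"]
        + alerts * PRICE_MILLIS["alert-dispatched"]
        + ai_analyses * PRICE_MILLIS["ai-analysis-completed"]
    )
    return total_millis / 1000


# 200 products, 30 collections, 5 detected drops that were all delivered, 1 AI analysis.
print(estimate_cost(products=200, collections=30, drops=5, alerts=5, ai_analyses=1))  # 0.915
```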
Quick Start
Scrape a single store:
{"storeUrls": ["https://hydrogen-preview.myshopify.com"]}
Scrape multiple stores with a product limit:
{"storeUrls": ["https://hydrogen-preview.myshopify.com","https://shop.example.com"],"maxProducts": 200,"includeVariants": true,"includeImages": true}
Sale-hunting mode - only discounted products:
{"storeUrls": ["https://shop.example.com"],"maxProducts": 200,"discountedOnly": true,"includeCollections": true,"maxCollections": 50}
Price-watch + drop alerts (schedule this daily):
{"storeUrls": ["https://shop.example.com"],"maxProducts": 200,"watchMode": true,"alertWebhookUrl": "https://hooks.slack.com/services/T000/B000/XXX","alertMinDropPct": 10}
The first run builds the history baseline - every product emits price_watch.direction: "new" and no alert fires. From the second run onward, the actor compares each product's current min_price against the prior snapshot and fires a webhook for every product that drops by at least 10%.
Webhook payload shape:
{"store_domain": "shop.example.com","product_id": 1234567890,"title": "Classic Tee","handle": "classic-tee","price_previous": 49.00,"price_now": 29.00,"price_delta_pct": -40.82,"direction": "down","currency": "USD","on_sale": true,"url": "https://shop.example.com/products/classic-tee","scraped_at": "2026-04-22T23:42:00+00:00"}
Works with Slack Incoming Webhooks, Zapier, Make/Integromat, n8n, or any HTTP endpoint that accepts JSON POSTs.
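If you point alertWebhookUrl at your own endpoint rather than a hosted integration, the handler mostly needs to read the fields shown in the payload above. The sketch below covers only the formatting step; format_drop_alert is a hypothetical helper, and the message wording is arbitrary.

```python
import json


def format_drop_alert(payload: dict) -> str:
    """Turn one price-drop webhook payload into a one-line human-readable message."""
    drop_pct = abs(payload["price_delta_pct"])
    return (
        f'{payload["title"]} dropped {drop_pct:.2f}% '
        f'({payload["price_previous"]:.2f} -> {payload["price_now"]:.2f} {payload["currency"]}): '
        f'{payload["url"]}'
    )


# Body copied from the webhook payload example above.
raw = """{"store_domain": "shop.example.com", "product_id": 1234567890,
"title": "Classic Tee", "handle": "classic-tee", "price_previous": 49.00,
"price_now": 29.00, "price_delta_pct": -40.82, "direction": "down",
"currency": "USD", "on_sale": true,
"url": "https://shop.example.com/products/classic-tee",
"scraped_at": "2026-04-22T23:42:00+00:00"}"""

message = format_drop_alert(json.loads(raw))
print(message)
# Classic Tee dropped 40.82% (49.00 -> 29.00 USD): https://shop.example.com/products/classic-tee
```

Because successful delivery is defined above as a 2xx response, a custom endpoint should return 2xx once it has accepted the payload.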
AI-powered catalog analysis:
{"storeUrls": ["https://hydrogen-preview.myshopify.com"],"maxProducts": 100,"enableAiAnalysis": true,"llmProvider": "openrouter","openrouterApiKey": "sk-or-v1-your-key-here"}
Collection graph + AI merchandising signals:
{"storeUrls": ["https://hydrogen-preview.myshopify.com"],"maxProducts": 200,"fetchCollectionProducts": true,"maxCollections": 30,"enableAiAnalysis": true,"llmProvider": "openrouter","openrouterApiKey": "sk-or-v1-your-key-here"}
MCP Quickstart - call this actor from Claude / Cursor / ChatGPT
Open Apify's hosted MCP configurator at mcp.apify.com, or install the Apify MCP server in your AI agent of choice:
```bash
# Claude Code
claude mcp add apify -- npx -y @apify/actors-mcp-server --token YOUR_APIFY_TOKEN
```

Claude Desktop / Cursor (add to mcp.json):

```json
{
  "mcpServers": {
    "apify": {
      "command": "npx",
      "args": ["-y", "@apify/actors-mcp-server", "--token", "YOUR_APIFY_TOKEN"]
    }
  }
}
```
Then prompt the agent:
"Use the harvestlab/shopify-scraper actor on Apify to monitor these five Shopify stores for product price drops, new variants, and sold-out items. Send the changed products back as JSON for a Slack alert workflow."
Through Apify MCP, the agent will discover the actor's dataset_schema.json, generate the right input, run it, and pipe the typed output back into your conversation.
Use with AI agents
shopify-scraper output is agent-ready: typed product rows with variants, SKUs, inventory signals, currency, price_watch deltas, and a product-to-collection graph from public Shopify storefront data. DTC research, brand-protection, and arbitrage workflows get cross-store catalog exports and drop-alert intelligence at $0.004 per product instead of $99-299/mo Prisync or Visualping seats. Structured JSON pairs cleanly with alertWebhookUrl for n8n, Zapier, or Slack drop alerts.
LangChain tool wrapper
```python
from typing import Optional

from apify_client import ApifyClient
from langchain_core.tools import StructuredTool

client = ApifyClient("YOUR_APIFY_TOKEN")


def _track_shopify_store(
    storeUrls: list[str],
    watchMode: bool = True,
    alertWebhookUrl: Optional[str] = None,
    alertMinDropPct: int = 5,
) -> list[dict]:
    """Run the actor and return the product rows from its default dataset."""
    run_input = {
        "storeUrls": storeUrls,
        "watchMode": watchMode,
        "alertMinDropPct": alertMinDropPct,
        "fetchCollectionProducts": True,
    }
    # Only send the webhook URL when one was provided, so the input never carries a null.
    if alertWebhookUrl:
        run_input["alertWebhookUrl"] = alertWebhookUrl
    run = client.actor("harvestlab/shopify-scraper").call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# StructuredTool accepts a multi-field dict as input; the single-input Tool class does not.
track_shopify_store = StructuredTool.from_function(
    func=_track_shopify_store,
    name="track_shopify_store",
    description=(
        "Scrape Shopify products, variants, stock signals, price drops, and collection graph. "
        "Input: {storeUrls: [str], watchMode: bool, alertWebhookUrl: str, alertMinDropPct: int}."
    ),
)

# track_shopify_store.invoke({"storeUrls": ["https://hydrogen-preview.myshopify.com"], "watchMode": True, "alertWebhookUrl": "https://hooks.slack.com/services/T000/B000/XXX"})
```
LangGraph arbitrage / brand-protection node
```python
from typing import TypedDict

from apify_client import ApifyClient
from langgraph.graph import StateGraph, END

client = ApifyClient("YOUR_APIFY_TOKEN")


class StoreState(TypedDict):
    storeUrls: list[str]
    alertMinDropPct: int
    products: list[dict]
    drops: list[dict]


def track_node(state: StoreState) -> StoreState:
    run = client.actor("harvestlab/shopify-scraper").call(run_input={
        "storeUrls": state["storeUrls"],
        "watchMode": True,
        "alertMinDropPct": state["alertMinDropPct"],
        "discountedOnly": False,
    })
    items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
    # Keep only products whose price_watch block reports a drop.
    drops = [i for i in items if (i.get("price_watch") or {}).get("price_drop")]
    return {**state, "products": items, "drops": drops}


graph = StateGraph(StoreState)
graph.add_node("track", track_node)
graph.set_entry_point("track")
graph.add_edge("track", END)

# graph.compile().invoke({"storeUrls": ["https://shop.example.com"], "alertMinDropPct": 10, "products": [], "drops": []})
```
Troubleshooting
Store URL returns no product catalog
The target site is not a Shopify store, or the store owner has disabled public product catalog access. A 404 or HTML page means the site uses a different platform (WooCommerce, Magento, BigCommerce) or has explicitly restricted product export access. The actor logs the error and skips that store without failing the entire run.
Store is password-protected and returns no products
Shopify stores in development mode show a login page at /password before any other route. Product catalog access returns an empty response or a redirect in this state. There is no workaround - password-protected stores do not expose public product data. Wait until the store goes live or contact the store owner for access.
Product variants appear incomplete or prices show null
Shopify usually exposes published variant data for public products. If variant prices appear as null, the store may be running a headless Shopify setup with custom storefront behavior or restricted product data. Inventory quantities can also be hidden by the store owner via Shopify admin settings.
maxProducts cap hit but the store has more products
The actor's hard cap is 500 products per store per run. If the store has more than 500 products, use fetchCollectionProducts: true to walk collection-level product lists - large catalogs are often split across collections, each with fewer than 500 items.
Price-watch webhook not firing after a price change
Alert webhooks require all four conditions simultaneously: (1) watchMode: true, (2) a valid alertWebhookUrl returning a 2xx response, (3) the same store URL used across at least 2 runs so a prior baseline exists, and (4) the price delta meets or exceeds alertMinDropPct. The first run always seeds the baseline silently - no alert fires. If the webhook URL returns anything other than 2xx, the alert-dispatched event is not charged and the failure is logged.
AI analysis returns an error about missing API key
Provide the API key either in the actor input field (e.g. openrouterApiKey) or as an environment variable (OPENROUTER_API_KEY, ANTHROPIC_API_KEY, GOOGLE_API_KEY, or OPENAI_API_KEY). Set enableAiAnalysis: false to skip AI entirely - the product scrape completes normally without it.
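The lookup described above (input field first, environment variable second) might look like the sketch below. The precedence order, the function name resolve_api_key, and the assumption that Ollama needs no key are illustrative guesses; the field and variable names are the ones listed in this README.

```python
import os
from typing import Mapping, Optional

# Input field and environment variable per provider, as named in this README.
KEY_SOURCES = {
    "openrouter": ("openrouterApiKey", "OPENROUTER_API_KEY"),
    "anthropic": ("anthropicApiKey", "ANTHROPIC_API_KEY"),
    "google": ("googleApiKey", "GOOGLE_API_KEY"),
    "openai": ("openaiApiKey", "OPENAI_API_KEY"),
}


def resolve_api_key(actor_input: Mapping[str, object],
                    env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the key for the selected provider, or None when none is configured."""
    provider = str(actor_input.get("llmProvider", "openrouter"))
    if provider not in KEY_SOURCES:
        return None  # e.g. ollama: assumed to be self-hosted without a key
    field, env_var = KEY_SOURCES[provider]
    # Assumption: an explicit input value wins over the environment variable.
    value = actor_input.get(field) or env.get(env_var)
    return str(value) if value else None


fake_env = {"ANTHROPIC_API_KEY": "env-key"}
print(resolve_api_key({"llmProvider": "anthropic"}, env=fake_env))  # env-key
print(resolve_api_key({"llmProvider": "openrouter"}, env=fake_env))  # None
```

A None result for a keyed provider corresponds to the missing-key error this section describes.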
No products scraped from any store
Check the run log for HTTP errors. Common causes: (1) the URLs are not Shopify stores, (2) the stores have restricted public product catalog access, (3) aggressive rate limiting - try enabling the Apify proxy in the proxy settings field. Datacenter proxies are sufficient for the vast majority of Shopify stores.
Frequently Asked Questions
Can I scrape products from any Shopify store?
You can scrape public product data from most live Shopify storefronts. Password-protected stores, non-Shopify sites, and stores that restrict public product catalog access will return diagnostics instead of product rows.
Does it return variants and SKUs?
Yes. When includeVariants: true, each product includes variant prices, compare-at prices, SKUs, option values, availability, and stock signals when the store exposes them.
Can it monitor stock or inventory signals?
Yes. The actor emits availability and variant-level inventory fields when they are public. Some store owners hide exact inventory quantities, so treat stock data as a signal rather than a guaranteed warehouse count.
Can it detect Shopify price drops?
Yes. Enable watchMode: true to persist price history across runs, then use alertWebhookUrl and alertMinDropPct to send Slack, Zapier, Make, or n8n alerts when products drop by your threshold.
Do I need a Shopify API key?
No Shopify admin API key is needed for public storefront product data. AI analysis is optional and requires only the LLM provider key you choose.
Legal and Compliance
This actor accesses product data that public Shopify storefronts expose without authentication, but you are responsible for confirming that your use complies with the target store's terms and applicable laws.
User Responsibility. You are responsible for ensuring your use of this actor complies with all applicable laws and regulations, including data protection laws such as GDPR, CCPA, and other privacy regulations. If you collect data that may include personal information (e.g. vendor names that identify individuals), ensure you have a lawful basis for processing it.
Data Handling. Product data scraped by this actor is stored in your Apify dataset. You control the retention, export, and deletion of this data through the Apify platform. The price-watch KV store (shopify-price-history) persists across runs and can be manually deleted from your Apify account storage at any time.
Rate Limiting. The actor implements polite crawling with 1-second delays between requests and exponential backoff on rate limit responses (HTTP 429/403). This prevents excessive load on store servers.
Third-party review widgets (Yotpo, Judge.me, Stamped.io, Okendo) are not scraped. These platforms have separate Terms of Service and often require authentication.
Pair with amazon-scraper
Shopify Scraper covers direct-to-consumer Shopify stores; Amazon Scraper covers the marketplace side of the same SKUs. Run both on a daily schedule with watchMode: true to build a unified cross-channel price-history dataset: same product, two channels, side-by-side delta. Common workflows:
- Brand price-policing - detect when a Shopify brand's product is undercut on Amazon by a third-party seller. Pipe both datasets into a single Slack channel via alertWebhookUrl.
- Reseller arbitrage - pull discountedOnly: true from Shopify, cross-reference Amazon Buy Box price, surface profitable flips.
- MAP enforcement - manufacturers track Minimum Advertised Price compliance across both channels in one pipeline.
Both actors share the canonical url and currency output fields, so a join on product handle or SKU is one query.
Related Actors
- Amazon Scraper - Scrape product data, prices, reviews, and Buy Box across 10+ Amazon domains
- Website Contact Extractor - Extract emails, social profiles, and tech stack from any website
- Trustpilot Scraper - Scrape reviews and ratings from Trustpilot business profiles
- Bol.com Scraper - Scrape product data from the largest Dutch e-commerce marketplace
- Review Analyzer - AI-powered sentiment analysis for product and business reviews