Shopify Products Scraper

Pricing

$19.99/month + usage

Shopify Products Scraper extracts complete product data from any Shopify store (titles, prices, variants, SKUs, inventory, images, collections, tags, and descriptions) at scale. Export JSON/CSV. Ideal for market research, competitor analysis, feeds, and catalog builds.

Developer

ScrapeFlow

Maintained by Community

21 days ago

Last modified

Shopify Products Scraper

Shopify Products Scraper discovers product pages on Shopify stores and extracts structured product data at scale. It automatically finds product URLs and fetches the complete Shopify product JSON for each item, which makes it useful for marketers, developers, data analysts, and researchers. Results can be exported to CSV or JSON for analysis, feeds, and catalog builds, and runs can be automated.

What data / output can you get?

Below are the exact fields saved to the Apify Dataset for each product. You can export results as JSON or CSV from the Apify Console.

| Field | Description | Example value |
| --- | --- | --- |
| store_url | The base Shopify store URL for the product | https://lootcrate.com |
| product_url | The product page URL | https://lootcrate.com/products/loot-crate |
| json_url | The product's Shopify JSON endpoint | https://lootcrate.com/products/loot-crate.json |
| product_id | Product ID from Shopify JSON | 5083963261059 |
| title | Product title | Loot Crate |
| vendor | Product vendor | Loot Crate Core |
| product_type | Product type/category | Subscription Box |
| price | Price from the first variant (if any) | "29.99" |
| compare_at_price | Compare-at price from the first variant (if any) | "34.99" |
| tags | Comma-separated tags | Subscription, Collectibles, Pop Culture |
| total_found | Total products discovered for the store in this run | 125 |
| successful | Running count of successfully extracted products for the store | 120 |
| full_data | Full Shopify product JSON payload for maximum completeness | {"product": {"id": 5083963261059, "handle": "loot-crate", "variants": [...], "images": [...]}} |

Notes:

  • Bonus field full_data preserves the complete Shopify product JSON (including variants, images, tags, timestamps, etc.), so image URLs and per-variant details remain available for downstream processing.
  • Export data as JSON or CSV from the Dataset UI.
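
Because price and compare_at_price cover only the first variant, per-variant details have to be read from full_data. The following is a minimal sketch, assuming the item shape shown in this document; the helper name variant_rows is hypothetical and not part of the actor.

```python
from typing import Any


def variant_rows(item: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one flat row per variant found in a Dataset item's full_data."""
    product = (item.get("full_data") or {}).get("product") or {}
    rows = []
    for variant in product.get("variants") or []:
        rows.append({
            "product_id": item.get("product_id"),
            "title": item.get("title"),
            "variant_id": variant.get("id"),
            "sku": variant.get("sku"),
            "price": variant.get("price"),
            "compare_at_price": variant.get("compare_at_price"),
        })
    return rows


# Example item trimmed to the fields used above.
example_item = {
    "product_id": 5083963261059,
    "title": "Loot Crate",
    "full_data": {"product": {"variants": [
        {"id": 34197535719555, "sku": "1010126US",
         "price": "29.99", "compare_at_price": "34.99"},
    ]}},
}
print(variant_rows(example_item))
```

Items without full_data or without variants yield an empty list, so the helper can be mapped over a whole Dataset export.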

Key features

  • Automatic product discovery
    Scans store HTML for all links containing "/products/", handles absolute and relative URLs, and deduplicates product links automatically.

  • Shopify JSON API pagination
    When Shopify is detected, the actor paginates /products.json to enumerate all products in the store and builds product URLs from handles.

  • Intelligent proxy fallback
    Starts with no proxy for speed. If blocked (403/429), automatically falls back to datacenter and then residential proxies with retries. Sticky residential mode keeps runs stable on tougher stores.

  • High-throughput concurrency
    Fetches product .json endpoints asynchronously and coordinates proxy escalation across tasks for fast, reliable throughput.

  • Live saving & resilience
    Pushes each product to the Apify Dataset in real time to prevent data loss. Error handling and retry logic keep your runs on track.

  • Developer-friendly outputs
    Simple, flat fields (store_url, product_url, json_url, product_id…) plus full_data for maximum flexibility in pipelines, BI tools, and CSV exports.

  • Production-ready reliability
    Detailed logging, clear progress tracking, and proxy management deliver consistent results without relying on browser extensions.
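
The concurrency feature can be pictured as bounded asynchronous fetching. The sketch below is an illustration only, not the actor's code: fetch_all, fake_fetch, and max_concurrency are hypothetical names, and the fetcher is a stand-in so the example runs without network access.

```python
import asyncio
from typing import Awaitable, Callable


async def fetch_all(
    json_urls: list[str],
    fetch: Callable[[str], Awaitable[dict]],
    max_concurrency: int = 10,
) -> list[dict]:
    """Fetch every URL concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(url: str) -> dict:
        async with semaphore:
            return await fetch(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(fetch_one(url) for url in json_urls))


async def fake_fetch(url: str) -> dict:
    """Stand-in for an HTTP GET; a real run would use an async HTTP client."""
    await asyncio.sleep(0)
    return {"json_url": url}


urls = [f"https://example.com/products/item-{i}.json" for i in range(5)]
results = asyncio.run(fetch_all(urls, fake_fetch, max_concurrency=2))
print(len(results))
```

The semaphore caps the number of in-flight requests, which is one common way to keep throughput high without overwhelming a store.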

How to use Shopify Products Scraper - step by step

  1. Create or log in to your Apify account and open Apify Console.
  2. Find the "shopify-products-scraper" actor and open it.
  3. Enter inputs:
    • Add one or more Shopify store URLs in startUrls (e.g., https://lootcrate.com).
    • Optionally configure proxyConfiguration (default is no proxy; automatic fallback will kick in only if needed).
  4. Start the run by clicking Start.
  5. Monitor progress in the logs:
    • Product discovery status
    • Proxy fallback events (no proxy → datacenter → residential as needed)
    • Success counters per store
  6. Access results:
    • Go to the Dataset tab to view items as a table and export to JSON or CSV.
    • A grouped summary is also saved to the Key-Value Store under key OUTPUT.
  7. Download your data or connect via the Apify API for automation.

Pro Tip: Trigger runs via the Apify API and pipe the Dataset results into your data warehouse or product feed workflows for a hands-off Shopify product scraper pipeline.
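
As a rough sketch of the API-driven workflow, the snippet below builds (but does not send) the HTTP request that starts a run, plus the URL for downloading Dataset items. It assumes the Apify API v2 endpoints for starting actor runs and reading dataset items. The actor ID and token are placeholders, and the helper names are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an actor run."""
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def dataset_items_url(dataset_id: str, fmt: str = "json") -> str:
    """URL for downloading a Dataset's items in the given format."""
    return f"{API_BASE}/datasets/{dataset_id}/items?format={fmt}"


request = build_run_request(
    actor_id="USERNAME~shopify-products-scraper",  # placeholder actor ID
    token="YOUR_APIFY_TOKEN",                      # placeholder token
    run_input={"startUrls": ["https://lootcrate.com"]},
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would start the run. In the API as I understand it, the run object in the response includes a default dataset ID that can be passed to dataset_items_url once the run finishes.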

Use cases

| Use case | Description |
| --- | --- |
| E-commerce intelligence & competitor tracking | Monitor catalogs, titles, tags, and product types across stores to benchmark positioning and assortment. |
| Price and promotion monitoring | Track price and compare_at_price over time to analyze discount strategies and margins. |
| Feed & catalog building | Build structured product feeds for marketplaces, ads, and comparison engines with full_data for flexible mapping. |
| Market research & trend analysis | Aggregate product metadata (tags, vendors, types) for cross-store insights and trend detection. |
| Storewide product crawl | Scrape all products from a Shopify store using /products.json pagination when available and product URL discovery. |
| Developer pipelines & APIs | Automate Shopify product data extraction and export via the Apify Dataset API for ETL and analytics jobs. |
| Image asset workflows | Use image URLs from full_data to enrich media pipelines and CDN asset checks. |
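
For the market research use case, tag aggregation can be sketched in a few lines. This assumes tags arrives as a comma-separated string, as in the output table; tag_counts is a hypothetical helper and the sample items are illustrative.

```python
from collections import Counter


def tag_counts(items: list[dict]) -> Counter:
    """Count how many products carry each tag across all Dataset items."""
    counts: Counter = Counter()
    for item in items:
        raw = item.get("tags") or ""
        # A product counts at most once per tag, even if the tag repeats.
        tags = {t.strip() for t in raw.split(",") if t.strip()}
        counts.update(tags)
    return counts


items = [
    {"title": "Loot Crate", "tags": "Subscription, Collectibles, Pop Culture"},
    {"title": "Other Box", "tags": "Subscription, Apparel"},
    {"title": "Untagged", "tags": ""},
]
print(tag_counts(items).most_common(1))  # [('Subscription', 2)]
```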

Why choose Shopify Products Scraper?

Delivering precision, automation, and reliability for real-world Shopify product data scraping workflows.

  • 🎯 Complete data fidelity: Keeps the full Shopify product JSON in full_data for maximum downstream flexibility.
  • πŸ“ˆ Built for scale: Async, concurrent requests and automatic discovery make it ideal for bulk store runs.
  • 🧩 Developer access: Clean flat fields plus full_data are easy to consume via the Apify Dataset API.
  • 🧱 No browser extensions: More stable than extension-based scrapers and less overhead than heavy headless browsers.
  • πŸ§ͺ Proxy-aware resilience: Automatic fallback from no proxy to datacenter and residential ensures continuity on blocked stores.
  • πŸ’Ύ Real-time saving: Each product is written to the Dataset as soon as it’s fetched to prevent data loss.
  • πŸ” Clear observability: Detailed logs and per-store summaries reduce troubleshooting time.

In short, it’s a reliable Shopify products scraper that outperforms unstable alternatives and delivers production-ready data for teams at any scale.

Is it legal to scrape Shopify product data?

Yes, when used responsibly. This actor fetches publicly available data from Shopify stores and does not access private or authenticated areas.

Guidelines for responsible use:

  • Only collect public product data and respect each website's terms.
  • Be mindful of rate limits; the actor already includes retry and proxy logic to reduce strain.
  • Comply with data protection laws such as GDPR and CCPA where applicable.
  • Do not use the tool for harassment, unauthorized collection of private information, or any unlawful activities.
  • Consult your legal team for edge cases or jurisdiction-specific requirements.

Input parameters & output format

Example JSON input

{
  "startUrls": [
    "https://lootcrate.com",
    "https://www.decathlon.com"
  ],
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}

Input parameters

  • startUrls (array, required)
    • Description: One or more Shopify store URLs to scrape (e.g., https://lootcrate.com).
  • proxyConfiguration (object, optional)
    • Description: Choose which proxies to use. By default, no proxy is used. If the target site rejects or blocks a request, the actor automatically falls back to a datacenter proxy, then to a residential proxy with 3 retries.
    • Default: {"useApifyProxy": false} (prefill)

Example JSON output (Dataset item)

{
  "store_url": "https://lootcrate.com",
  "product_url": "https://lootcrate.com/products/loot-crate",
  "json_url": "https://lootcrate.com/products/loot-crate.json",
  "product_id": 5083963261059,
  "title": "Loot Crate",
  "vendor": "Loot Crate Core",
  "product_type": "Subscription Box",
  "price": "29.99",
  "compare_at_price": "34.99",
  "tags": "Subscription, Collectibles, Pop Culture",
  "total_found": 125,
  "successful": 120,
  "full_data": {
    "product": {
      "id": 5083963261059,
      "handle": "loot-crate",
      "title": "Loot Crate",
      "vendor": "Loot Crate Core",
      "product_type": "Subscription Box",
      "tags": "Subscription, Collectibles, Pop Culture",
      "variants": [
        {
          "id": 34197535719555,
          "price": "29.99",
          "compare_at_price": "34.99",
          "sku": "1010126US",
          "inventory_management": "shopify",
          "requires_shipping": true
        }
      ],
      "images": [
        {
          "id": 123456789,
          "src": "https://cdn.shopify.com/...",
          "width": 2000,
          "height": 2000,
          "alt": "Product image"
        }
      ]
    }
  }
}

Additional output (Key-Value Store summary)

  • The actor also writes a grouped summary to the default Key-Value Store under the key OUTPUT. Structure example:
{
  "https://lootcrate.com": {
    "method": "shopify_api",
    "total_found": 125,
    "successful": 120,
    "products": [
      {
        "url": "https://lootcrate.com/products/loot-crate",
        "json": {
          "product": {
            "id": 5083963261059,
            "handle": "loot-crate",
            "title": "Loot Crate"
          }
        }
      }
    ]
  }
}
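
One way to use the OUTPUT summary is to check how many discovered products were actually extracted per store. This is a minimal sketch, assuming the structure shown above; success_rates is a hypothetical helper.

```python
def success_rates(summary: dict) -> dict[str, float]:
    """Map each store URL to successful / total_found (0.0 when nothing was found)."""
    rates = {}
    for store_url, stats in summary.items():
        total = stats.get("total_found") or 0
        ok = stats.get("successful") or 0
        rates[store_url] = ok / total if total else 0.0
    return rates


summary = {
    "https://lootcrate.com": {
        "method": "shopify_api",
        "total_found": 125,
        "successful": 120,
        "products": [],
    }
}
print(success_rates(summary))
```

With the example numbers, 120 of 125 products gives a rate of 0.96 for the store.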

Notes:

  • compare_at_price and price come from the first variant when available; some stores may return null for these values.
  • full_data preserves the entire Shopify product payload for each item.
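
Since both price fields are strings and may be null, any discount calculation should handle missing values explicitly. The sketch below shows one way to do that with Decimal; discount_percent is a hypothetical helper, and returning None when no discount applies is a design choice of this example.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def discount_percent(price: Optional[str],
                     compare_at_price: Optional[str]) -> Optional[Decimal]:
    """Percent discount of price against compare_at_price, or None if unavailable."""
    try:
        current = Decimal(price)
        original = Decimal(compare_at_price)
        if original <= 0 or current >= original:
            return None  # no meaningful discount
        return ((original - current) / original * 100).quantize(Decimal("0.01"))
    except (InvalidOperation, TypeError):
        return None  # missing (null) or non-numeric value


print(discount_percent("29.99", "34.99"))  # 14.29
print(discount_percent("29.99", None))     # None
```

Decimal avoids the rounding surprises that binary floats can introduce when working with currency strings.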

FAQ

How does the scraper discover product pages?

It scans the store's HTML for links containing "/products/" and builds a deduplicated list of product URLs. If the site is detected as Shopify, it also uses /products.json pagination to enumerate products via handles.
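
The HTML discovery step could look roughly like the following. This is an illustration using only the standard library, not the actor's implementation. Stripping query strings and fragments before deduplication is an assumption of this sketch, and all names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit, urlunsplit


class ProductLinkParser(HTMLParser):
    """Collect href values from <a> tags that contain "/products/"."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and "/products/" in href:
            self.hrefs.append(href)


def discover_product_urls(store_url: str, html: str) -> list[str]:
    """Return deduplicated absolute product URLs in first-seen order."""
    parser = ProductLinkParser()
    parser.feed(html)
    seen: dict[str, None] = {}
    for href in parser.hrefs:
        # urljoin resolves relative links and leaves absolute ones unchanged.
        parts = urlsplit(urljoin(store_url, href))
        # Drop query string and fragment so variants of one link collapse.
        clean = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
        seen.setdefault(clean, None)
    return list(seen)


sample_html = """
<a href="/products/loot-crate">Loot Crate</a>
<a href="https://lootcrate.com/products/loot-crate?variant=1">Same product</a>
<a href="/collections/all">All</a>
"""
print(discover_product_urls("https://lootcrate.com", sample_html))
```

In the sample, the relative and absolute links resolve to the same product URL, so only one entry is returned.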

Can it scrape all products from a Shopify store?

Yes. When Shopify is detected, the actor paginates through /products.json to collect product handles and builds product URLs, enabling storewide coverage where the endpoint is accessible.
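
Pagination over /products.json can be sketched as a loop that stops at the first empty page. The limit and page query parameters, the page size of 250, and the max_pages safety cap are assumptions of this example rather than documented actor behavior. The fetcher is injected so the sketch runs offline.

```python
from typing import Callable, Iterator


def iter_product_urls(store_url: str, fetch_json: Callable[[str], dict],
                      page_size: int = 250, max_pages: int = 1000) -> Iterator[str]:
    """Yield product URLs by walking /products.json until a page comes back empty."""
    base = store_url.rstrip("/")
    for page in range(1, max_pages + 1):
        payload = fetch_json(f"{base}/products.json?limit={page_size}&page={page}")
        products = payload.get("products") or []
        if not products:
            break
        for product in products:
            # Product page URLs are built from each product's handle.
            yield f"{base}/products/{product['handle']}"


def fake_fetch_json(url: str) -> dict:
    """Stand-in for an HTTP GET returning two pages of made-up handles."""
    pages = {
        "page=1": {"products": [{"handle": "loot-crate"}, {"handle": "loot-wear"}]},
        "page=2": {"products": [{"handle": "loot-gaming"}]},
    }
    for key, payload in pages.items():
        if url.endswith(key):
            return payload
    return {"products": []}


product_urls = list(iter_product_urls("https://lootcrate.com/", fake_fetch_json, page_size=2))
print(product_urls)
```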

What happens if a store blocks requests?

The actor starts with no proxy for speed. If blocked (403/429), it automatically falls back to a datacenter proxy, and then to a residential proxy with retries. Once residential is active, it remains sticky for the rest of the run.
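
The escalation described here can be modeled as a small state machine shared across requests. The sketch below is a simplified illustration under stated assumptions: the tier names, the ProxyEscalator class, and the reading of "retries" as three extra attempts on the residential tier are all hypothetical.

```python
from typing import Callable

PROXY_TIERS = ["none", "datacenter", "residential"]
BLOCKED_STATUSES = {403, 429}


class ProxyEscalator:
    """Tracks the current proxy tier; it only ever moves toward residential."""

    def __init__(self) -> None:
        self.index = 0

    @property
    def tier(self) -> str:
        return PROXY_TIERS[self.index]

    def escalate(self) -> bool:
        """Move to the next tier; return False if already at the last one."""
        if self.index + 1 >= len(PROXY_TIERS):
            return False
        self.index += 1
        return True


def fetch_with_fallback(url: str, send: Callable[[str, str], int],
                        escalator: ProxyEscalator,
                        residential_retries: int = 3) -> tuple[int, str]:
    """Return (status, tier) of the final attempt for this URL."""
    retries_left = residential_retries
    while True:
        status = send(url, escalator.tier)
        if status not in BLOCKED_STATUSES:
            return status, escalator.tier
        if escalator.escalate():
            continue  # retry immediately on the next tier
        if retries_left == 0:
            return status, escalator.tier  # still blocked on residential
        retries_left -= 1


calls: list[str] = []


def fake_send(url: str, tier: str) -> int:
    """Stand-in for an HTTP request: only the residential tier gets through."""
    calls.append(tier)
    return {"none": 403, "datacenter": 429, "residential": 200}[tier]


escalator = ProxyEscalator()
first = fetch_with_fallback("https://example.com/products/a.json", fake_send, escalator)
second = fetch_with_fallback("https://example.com/products/b.json", fake_send, escalator)
print(first, second, calls)
```

Because the escalator is shared, the second request starts directly on the residential tier, which mirrors the sticky behavior described above.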

Can I scrape multiple stores in one run?

Yes. Add multiple entries to startUrls. The actor processes each store and writes per-product items to the Dataset, with a grouped summary saved to the Key-Value Store under OUTPUT.

What data is included in the output?

Each Dataset item includes store_url, product_url, json_url, product_id, title, vendor, product_type, price, compare_at_price, tags, total_found, successful, and full_data (the complete Shopify product JSON).

Do I need a login or a Chrome extension?

No. This is an Apify actor, so no Shopify login or browser extension is required. It runs server-side, and its results are accessible through the API.

How do I export to CSV or JSON?

Open the run's Dataset in Apify Console and choose your preferred format (JSON or CSV). CSV is convenient when the data is headed for BI tools or feed ingestion.
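
If you download JSON and want a CSV of only the flat fields, a local conversion is straightforward. This is a sketch using the standard csv module; the FLAT_FIELDS selection and the decision to omit full_data are choices of this example.

```python
import csv
import io

FLAT_FIELDS = [
    "store_url", "product_url", "json_url", "product_id", "title", "vendor",
    "product_type", "price", "compare_at_price", "tags",
]


def items_to_csv(items: list[dict]) -> str:
    """Serialize the flat fields of Dataset items to CSV text, skipping full_data."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops keys such as full_data that are not columns.
    writer = csv.DictWriter(buffer, fieldnames=FLAT_FIELDS, extrasaction="ignore")
    writer.writeheader()
    for item in items:
        writer.writerow(item)
    return buffer.getvalue()


items = [{
    "store_url": "https://lootcrate.com",
    "title": "Loot Crate",
    "price": "29.99",
    "compare_at_price": None,
    "tags": "Subscription, Collectibles, Pop Culture",
    "full_data": {"product": {"handle": "loot-crate"}},
}]
print(items_to_csv(items))
```

Missing or null fields are written as empty cells, and the comma-separated tags value is quoted automatically by the csv module.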

Can I limit the number of products scraped?

There is no user-facing limit parameter in the input. By default, the actor processes all products discovered via /products.json and/or link discovery.

Where are results saved?

Per-product records are stored in the Dataset for export. A store-level summary object is also saved in the Key-Value Store under the key OUTPUT.

Does it capture images and variants?

Yes. The full_data field contains the original Shopify product JSON, which includes variants, images, tags, and other metadata as provided by the store.
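
Image URLs can be pulled out of full_data with a short helper. The sketch assumes the images array and src key shown in the example output; image_urls is a hypothetical name and the sample URLs are made up.

```python
def image_urls(item: dict) -> list[str]:
    """Return the src of every image in a Dataset item's full_data, in order."""
    product = (item.get("full_data") or {}).get("product") or {}
    return [img["src"] for img in product.get("images") or [] if img.get("src")]


example_item = {
    "full_data": {"product": {"images": [
        {"id": 1, "src": "https://cdn.example.com/a.jpg"},
        {"id": 2, "src": "https://cdn.example.com/b.jpg"},
        {"id": 3},  # entries without a src are skipped
    ]}},
}
print(image_urls(example_item))
```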

Final thoughts

Shopify Products Scraper is built for fast, reliable, and structured Shopify product data extraction at scale. With automatic product discovery, resilient proxy fallback, and clean per-product outputs plus full_data, it's ideal for marketers, developers, analysts, and researchers.

Run it in Apify to scrape all products from Shopify stores, export to CSV/JSON, and plug the Dataset into your analytics or feeds. Developers can automate via the Apify API for end-to-end pipelines. Start extracting smarter Shopify product data today and turn store catalogs into actionable insights.