Shopify Products Scraper

Pricing

$19.99/month + usage

🛍️ Shopify Products Scraper extracts product data from any Shopify store — titles, prices, variants, SKUs, images, inventory, descriptions, tags & vendor. ⚡ Fast, scalable, bulk & export-ready (CSV/JSON). ✅ Perfect for catalog building, price tracking, competitive research & dropshipping.

Developer: ScraperX (Maintained by Community)

The Shopify Products Scraper automatically discovers product pages on Shopify stores and extracts structured product data at scale. It removes the manual work of locating product URLs and exporting details: it scans store pages for “/products/” links and fetches each product’s public JSON. Built for marketers, developers, data analysts, and researchers, it exports products from multiple stores to CSV or JSON as clean, analysis-ready datasets.

What data / output can you get?

Below are the exact fields saved to the Apify dataset for each product. Each record also preserves the full Shopify JSON payload for maximum flexibility.

| Data field | Description | Example value |
| --- | --- | --- |
| store_url | Base URL of the Shopify store where the product was found | https://lootcrate.com |
| product_url | Product page URL | https://lootcrate.com/products/loot-crate |
| json_url | Product JSON endpoint URL | https://lootcrate.com/products/loot-crate.json |
| product_id | Product ID from Shopify JSON | 5083963261059 |
| title | Product title | Loot Crate |
| vendor | Vendor (brand) from Shopify JSON | Loot Crate Core |
| product_type | Product type category | Subscription Box |
| price | Variant price (from the first variant if present) | 29.99 |
| compare_at_price | Compare-at price (from the first variant if present) | 24.99 |
| tags | Comma-separated tags | Subscription, Collectibles, Pop Culture |
| total_found | Total number of products discovered for the store | 5 |
| successful | Running count of successfully extracted products for the store | 5 |
| full_data | Complete product JSON object (as returned by the Shopify product endpoint) | {"product": { "id": 5083963261059, "title": "Loot Crate", ... }} |

Notes:

  • The full_data field preserves the entire Shopify product response, including variants, images, descriptions, and tags. You can read variant-level SKUs, prices, and image URLs from it, plus inventory fields where the store exposes them.
  • Export the dataset in JSON or CSV directly from Apify.

Key features

  • 🔍 Automatic product discovery
    Scans store HTML for links containing “/products/” and deduplicates URLs—no need to list product pages manually. Works across multiple domains as a scalable Shopify store scraper.

  • 🧩 Full JSON preservation
    Saves the complete product payload in full_data so you can build a flexible Shopify product extractor for variants, SKUs, images, tags, and other metadata.

  • 🔄 Intelligent proxy fallback
    Starts with no proxy and automatically escalates to datacenter, then residential proxies upon 403/429 blocks—with sticky residential behavior and retries for reliability.

  • ⚡ High-throughput concurrency
    Fetches product .json endpoints asynchronously in parallel for faster runs—ideal for bulk Shopify product feed scraping.

  • 💾 Live data saving
    Pushes each product to the dataset as soon as it’s extracted to prevent data loss and support long-running jobs.

  • 🧑‍💻 Developer-friendly output
    Clean, consistent field names plus preserved full_data make it easy to integrate with APIs or consume the output from a Python workflow.

  • 🧭 Store-level summaries
    Saves an aggregated, store-grouped summary to the key-value store under “OUTPUT” with methods used, totals, and an array of { url, json } per product.

  • 🔐 No login or browser required
    Server-side fetching of each product’s public .json endpoint: no cookies, sessions, or headless browser needed.

  • 🧱 Production-grade reliability
    Robust error handling, retry logic, Shopify detection, and detailed logs for stable operations across diverse stores.
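
The discovery step described above (scan HTML for “/products/” links, deduplicate, then request each product’s .json endpoint) can be sketched in a few lines of standard-library Python. This is an illustration only, not the actor’s source: the names discover_product_urls and to_json_url are hypothetical, and the choice to drop query strings before deduplicating is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit, urlunsplit


class _LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def discover_product_urls(html, store_url):
    """Return de-duplicated absolute product URLs from `html`, in first-seen order."""
    collector = _LinkCollector()
    collector.feed(html)
    seen, products = set(), []
    for href in collector.hrefs:
        parts = urlsplit(urljoin(store_url, href))
        path = parts.path.rstrip("/")
        if "/products/" not in path:
            continue
        # Assumption: drop query string and fragment so ?variant=... links collapse.
        url = urlunsplit((parts.scheme, parts.netloc, path, "", ""))
        if url not in seen:
            seen.add(url)
            products.append(url)
    return products


def to_json_url(product_url):
    """Append .json to a product page URL, matching the json_url field's shape."""
    return product_url + ".json"
```

Calling to_json_url on each discovered URL yields values shaped like the json_url field, for example https://lootcrate.com/products/loot-crate.json.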

How to use Shopify Products Scraper - step by step

  1. Sign in to the Apify Console and go to the Actors section.
  2. Open the “shopify-products-scraper” actor.
  3. Add input in the Input tab:
    • Provide one or more store homepages in startUrls (e.g., https://lootcrate.com).
    • Optionally configure proxyConfiguration (defaults to no proxy; automatic fallback is built in).
  4. Click Start to begin the run.
  5. Monitor live logs for progress:
    • Product discovery count per store
    • Proxy escalation (direct → datacenter → residential) if blocking occurs
    • Success/failure per product JSON fetch
  6. Access results in the Dataset tab as they stream in.
  7. Export the dataset to JSON or CSV for analysis, price tracking, catalog syncs, or further integration.

Pro tip: Read variants, SKUs, per-variant prices, and image URLs from full_data when your internal pipelines need more than the top-level fields.

Use cases

| Use case | Description |
| --- | --- |
| E-commerce competitive tracking | Monitor competitor catalogs and pricing across Shopify stores; export clean product snapshots for daily/weekly analysis. |
| Market research & trend analysis | Aggregate tags, product types, and vendor distributions to study assortment strategies and product trends. |
| Catalog integration & syndication | Feed full_data into your PIM/ETL to populate listings with titles, variants, images, and metadata. |
| Price monitoring & promo audits | Track price vs compare_at_price to detect discounts and pricing changes across variants. |
| Data enrichment for analytics | Enrich BI dashboards with product IDs, vendors, and types for SKU-level insights. |
| Academic & research projects | Collect structured Shopify product datasets for studies on categories, pricing, or catalog design. |
| API pipeline for automation | Build automated workflows that scrape Shopify product feeds and export to JSON/CSV for downstream systems. |
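
For the price monitoring use case, a discount can be derived from the price and compare_at_price fields of a dataset record. The sketch below is one possible approach, not part of the actor: discount_percent is a hypothetical helper, and it assumes compare_at_price holds the pre-discount price, so it reports a discount only when that value is higher than price. Under that assumption the sample record in this document (price 29.99, compare_at_price 24.99) yields no discount.

```python
from decimal import Decimal, InvalidOperation


def discount_percent(price, compare_at_price):
    """Return the discount in percent (1 decimal place), or None if there is none.

    Both arguments may be strings, numbers, or None, as in the dataset fields.
    """
    try:
        current = Decimal(str(price))
        original = Decimal(str(compare_at_price))
        if original <= 0 or current >= original:
            return None
        return float(round((original - current) / original * 100, 1))
    except (InvalidOperation, TypeError, ValueError):
        # Missing or non-numeric values (for example null) mean no usable comparison.
        return None
```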

Why choose Shopify Products Scraper?

Built for precision, scale, and reliability, this tool automatically discovers product pages and extracts structured JSON for clean datasets.

  • 🥇 Precision-first extraction: Locates product URLs and fetches each product’s JSON for consistent, structured results.
  • 🚀 Built for scale: Handles multiple stores and parallel product requests—ideal for bulk Shopify product feed scraping.
  • 🧰 Developer-ready: Stable field names plus full_data make downstream mapping and API usage straightforward.
  • 🔒 Ethical by design: Targets publicly available product pages only; no login or private data access.
  • 💸 Cost-effective alternative: More stable and scalable than browser extensions or one-off tools.
  • 🔗 Workflow-friendly: Export to JSON/CSV and plug into analytics or catalog pipelines with minimal setup.
  • 🛡️ Proxy strategy that works: Direct → datacenter → residential fallback with retries ensures resilient runs when stores rate-limit or block.

In short: a production-ready Shopify product scraper tool that outperforms unstable extensions and rigid CSV-only downloaders.

Is it legal to scrape Shopify product data?

Yes, when used responsibly. This actor collects data from publicly accessible Shopify store pages and product endpoints. It does not access authenticated areas or private accounts.

Guidelines for compliant use:

  • Only collect publicly available product data.
  • Respect websites’ terms of service and robots.txt guidance.
  • Be mindful of request rates: proxy fallback and retries help a run complete, but how often and how widely you scrape remains your responsibility.
  • Ensure compliance with applicable regulations (e.g., GDPR, CCPA) and consult your legal team for edge cases.

Input parameters & output format

Example JSON input

{
  "startUrls": [
    "https://lootcrate.com",
    "https://www.decathlon.com"
  ],
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}

Parameters

  • startUrls (array, required): List one or more Shopify store URLs (e.g., https://lootcrate.com, https://www.decathlon.com). Supports bulk input.
    • Default: none
  • proxyConfiguration (object, optional): Choose which proxies to use. By default, no proxy is used. If requests are rejected/blocked, the actor automatically falls back to datacenter proxy, then residential proxy with retries.
    • Default: {"useApifyProxy": false}
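
The same input can be built and submitted from Python. The sketch below assumes the apify-client package (ApifyClient, actor(...).call(run_input=...), dataset(...).iterate_items()) and that call returns a run dictionary with a defaultDatasetId key, as in the client versions I am aware of. The function names are hypothetical, and the actor ID is left as a parameter because the full ID is not stated in this document.

```python
def build_run_input(store_urls, use_apify_proxy=False):
    """Build the actor input from the two documented parameters."""
    return {
        "startUrls": list(store_urls),
        "proxyConfiguration": {"useApifyProxy": use_apify_proxy},
    }


def run_scraper(token, actor_id, store_urls):
    """Start a run, wait for it to finish, and return the dataset items as a list."""
    # Imported lazily so build_run_input works without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=build_run_input(store_urls))
    # Assumption: the run result is a dict exposing the default dataset ID.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (placeholders, not real values):
# items = run_scraper("<APIFY_TOKEN>", "<username>/shopify-products-scraper",
#                     ["https://lootcrate.com"])
```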

Example dataset record (one item per product)

{
  "store_url": "https://lootcrate.com",
  "product_url": "https://lootcrate.com/products/loot-crate",
  "json_url": "https://lootcrate.com/products/loot-crate.json",
  "product_id": 5083963261059,
  "title": "Loot Crate",
  "vendor": "Loot Crate Core",
  "product_type": "Subscription Box",
  "price": "29.99",
  "compare_at_price": "24.99",
  "tags": "Subscription, Collectibles, Pop Culture",
  "total_found": 5,
  "successful": 5,
  "full_data": {
    "product": {
      "id": 5083963261059,
      "title": "Loot Crate",
      "vendor": "Loot Crate Core",
      "product_type": "Subscription Box",
      "handle": "loot-crate",
      "variants": [
        {
          "id": 34197535719555,
          "price": "29.99",
          "compare_at_price": "24.99",
          "sku": "1010126US"
        }
      ],
      "images": [
        {
          "id": 123456789,
          "src": "https://cdn.shopify.com/...",
          "width": 2000,
          "height": 2000
        }
      ],
      "tags": "Subscription, Collectibles, Pop Culture"
    }
  }
}

Aggregated store summary (saved to key-value store under “OUTPUT”)

{
  "https://lootcrate.com": {
    "method": "shopify_api",
    "total_found": 5,
    "successful": 5,
    "products": [
      {
        "url": "https://lootcrate.com/products/loot-crate",
        "json": {
          "product": {
            "id": 5083963261059,
            "title": "Loot Crate"
          }
        }
      }
    ]
  }
}

Notes:

  • price and compare_at_price are derived from the first variant if variants exist; if missing, values may be null.
  • The full_data object preserves the entire Shopify product JSON; use it to extract variants, SKUs, and image URLs into whatever schema you need.
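
As an illustration of the notes above, the following sketch flattens the variants inside full_data into one row per variant and writes them as CSV. It relies only on the keys shown in the example record (product.id, product.title, and each variant's id, sku, price, compare_at_price); variant_rows and rows_to_csv are hypothetical helper names, not part of the actor.

```python
import csv
import io

FIELDS = ["product_id", "title", "variant_id", "sku", "price", "compare_at_price"]


def variant_rows(record):
    """Yield one flat dict per variant found in record["full_data"]["product"]."""
    product = (record.get("full_data") or {}).get("product") or {}
    for variant in product.get("variants") or []:
        yield {
            "product_id": product.get("id"),
            "title": product.get("title"),
            "variant_id": variant.get("id"),
            "sku": variant.get("sku"),
            "price": variant.get("price"),
            "compare_at_price": variant.get("compare_at_price"),
        }


def rows_to_csv(rows):
    """Serialise variant rows to a CSV string with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Records without variants simply produce no rows, which mirrors the note that price and compare_at_price may be null when variants are missing.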

FAQ

Do I need to list individual product URLs?

No. The actor scans the store’s HTML for links containing “/products/” and automatically discovers product pages. This makes it a hands-off Shopify store scraper.

Does it work if a store blocks requests?

Yes. It starts with no proxy and automatically falls back to datacenter, then residential proxies on 403/429 responses. Once residential is used, it stays sticky and applies retries for reliability.
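
The escalation described above can be sketched as a small loop over proxy tiers. This is an assumption-laden illustration rather than the actor's code: fetch is a caller-supplied function returning (status, body), the number of attempts per tier is a guess, and stickiness is modelled by passing the returned tier index back in as start_tier on the next call.

```python
BLOCKED_STATUSES = {403, 429}
PROXY_TIERS = ("direct", "datacenter", "residential")


def fetch_with_fallback(url, fetch, start_tier=0, attempts_per_tier=2):
    """Fetch `url`, moving to the next proxy tier while responses are 403/429.

    Returns (status, body, tier_index). Pass tier_index back as start_tier
    to keep later requests on the tier that worked (the "sticky" behaviour).
    """
    status, body = None, None
    for tier_index in range(start_tier, len(PROXY_TIERS)):
        for _ in range(attempts_per_tier):
            status, body = fetch(url, PROXY_TIERS[tier_index])
            if status not in BLOCKED_STATUSES:
                return status, body, tier_index
    # Every tier was blocked: report the last response and stay on the last tier.
    return status, body, len(PROXY_TIERS) - 1
```

Injecting fetch keeps the escalation logic independent of any particular HTTP library or proxy configuration.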

Can I scrape multiple Shopify stores in one run?

Yes. Add multiple store homepages in startUrls. The actor processes each store and streams product records to the dataset as they’re extracted.

What product data does it extract?

Each dataset item includes store_url, product_url, json_url, product_id, title, vendor, product_type, price, compare_at_price, tags, total_found, successful, and full_data (the complete Shopify product JSON). Use full_data for variants, images, descriptions, and other attributes.

How are results saved and exported?

Each product is pushed to the Apify dataset in real time. You can download results as JSON or CSV from the Dataset tab, or integrate the dataset into your pipeline.
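
Combining parallel fetching with per-product saving could look like the sketch below. It assumes an async fetch_json(url) coroutine and an on_record callback standing in for the dataset push; both are hypothetical, as is the concurrency limit of 10, and the record here carries only json_url and full_data for brevity.

```python
import asyncio


async def fetch_all(json_urls, fetch_json, on_record, max_concurrency=10):
    """Fetch all URLs concurrently and hand each result to `on_record` as it completes.

    Returns the number of successful fetches.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(url):
        async with semaphore:  # cap the number of requests in flight
            return url, await fetch_json(url)

    tasks = [asyncio.create_task(fetch_one(url)) for url in json_urls]
    successful = 0
    for finished in asyncio.as_completed(tasks):
        try:
            url, payload = await finished
        except Exception:
            continue  # one failed product does not stop the rest of the run
        on_record({"json_url": url, "full_data": payload})
        successful += 1
    return successful
```

Because each record is handed over as soon as its request finishes, an interruption late in a long run loses only the products still in flight.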

Is login or a browser required?

No. This is a server-side Shopify product scraper tool that fetches public product JSON endpoints—no login or headless browser is required.

Can I use this with Python or an API?

Yes. The actor produces a standard Apify dataset and a store-grouped OUTPUT object in the key-value store. You can access both via the Apify API or read them from a Python workflow.

What if a site isn’t a Shopify store?

The actor attempts to detect Shopify from the HTML. If not detected, it falls back to extracting “/products/” links from HTML and still requests each product’s .json endpoint where available.

Final thoughts

The Shopify Products Scraper is built for accurate, scalable Shopify product extraction. It automatically discovers product pages, fetches each product’s JSON, and streams clean, structured records to your dataset.

Whether you’re a marketer, developer, data analyst, or researcher, you’ll capture product IDs, titles, vendors, types, and first-variant pricing, plus the preserved full_data for deeper needs. Developers can consume the dataset via the API or plug it into a Python automation pipeline.

Start extracting smarter product data today and transform Shopify catalogs into actionable insights at scale.