Shopify Product Scraper 🛍️

Pricing

Pay per usage

Extract product data from any Shopify-powered store. The actor is lightweight and optimized for speed, and it collects prices, variants, and images. Residential proxies are recommended to improve stability and reduce the risk of IP blocking.

Rating: 4.3 (3)

Developer: Shahid Irfan

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 18
  • Monthly active users: 2
  • Last modified: 20 days ago

Shopify Product Scraper

Extract complete product catalogs from Shopify stores with automatic store detection, clean output, and variant-level detail. Use a base store URL or direct page URLs and collect normalized product records for monitoring, analytics, and catalog workflows.

Features

  • Automatic Shopify Detection: Recognizes Shopify storefronts from direct website URLs and store URLs.
  • Flexible Starting Points: Works with shopUrl and/or startUrls (homepages, collection pages, product pages).
  • Store and Collection Coverage: Collects full catalog data or collection-specific products based on input.
  • Variant-Level Records: Captures size/color variants with pricing and inventory attributes.
  • Clean Dataset Output: Removes null/empty fields and avoids duplicate records.
  • Stock Filtering: Optionally exclude out-of-stock products.
  • Production-Ready Metadata: Adds store_url, source_type, and source_endpoint for traceability.
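
The cleaning and deduplication mentioned in the feature list can be pictured with a short Python sketch. This is not the actor's implementation: `is_empty`, `clean_record`, `dedupe`, and the choice of `(id, variant_id)` as the record key are assumptions made for illustration.

```python
from typing import Any, Iterable, Iterator


def is_empty(value: Any) -> bool:
    """Treat None, empty strings, empty lists and empty dicts as empty.

    False and 0 are kept, since they are meaningful for fields such as
    `available` and `inventory_quantity`.
    """
    return value is None or value == "" or value == [] or value == {}


def clean_record(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the record without empty fields."""
    return {key: value for key, value in record.items() if not is_empty(value)}


def dedupe(records: Iterable[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Yield cleaned records, keeping the first one seen per key.

    Assumption: a record is identified by (id, variant_id).
    """
    seen: set[tuple[Any, Any]] = set()
    for record in records:
        key = (record.get("id"), record.get("variant_id"))
        if key in seen:
            continue
        seen.add(key)
        yield clean_record(record)
```

Keeping `False` and `0` matters here: an out-of-stock variant with `available: false` should stay in the output rather than lose the field.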

Use Cases

Competitor Catalog Monitoring

Track pricing, product launches, and catalog changes across Shopify competitors to support pricing and merchandising decisions.

E-commerce Data Warehousing

Build clean, structured product datasets for BI dashboards, internal reporting, and data pipelines.

Market and Assortment Analysis

Compare product mix, variant depth, and category coverage across multiple stores.

Product Feed Automation

Use scheduled runs to maintain up-to-date product feeds for internal tools, partner systems, or enrichment jobs.


Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| shopUrl | String | No* | - | Shopify store URL or direct store website URL, for example https://www.allbirds.com or allbirds.com. |
| startUrls | Array | No* | - | Optional list of URLs to start from. Accepts homepage, collection, and product URLs. |
| collection | String | No | "all" | Collection handle when using shopUrl. Use all for full catalog coverage. |
| maxProducts | Integer | No | 50 | Maximum number of output records to save. |
| maxPages | Integer | No | 999 | Maximum pagination depth per source feed. |
| includeVariants | Boolean | No | true | Include all variants per product in output. |
| includeOutOfStock | Boolean | No | true | Include out-of-stock products. |
| proxyConfiguration | Object | No | { "useApifyProxy": true } | Proxy settings for improved reliability on protected stores. |

* Provide at least one of shopUrl or startUrls.
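
As an illustration of the rule that at least one of shopUrl or startUrls is required, the following Python sketch assembles a run input and applies the defaults listed in the parameter table. The helper name `build_input` and its keyword interface are hypothetical; the actor itself only sees the resulting JSON object.

```python
import copy
from typing import Any, Optional

# Defaults copied from the parameter table above.
DEFAULTS: dict[str, Any] = {
    "collection": "all",
    "maxProducts": 50,
    "maxPages": 999,
    "includeVariants": True,
    "includeOutOfStock": True,
    "proxyConfiguration": {"useApifyProxy": True},
}


def build_input(shop_url: Optional[str] = None,
                start_urls: Optional[list[str]] = None,
                **overrides: Any) -> dict[str, Any]:
    """Build an input object, enforcing 'shopUrl or startUrls'."""
    if not shop_url and not start_urls:
        raise ValueError("Provide at least one of shopUrl or startUrls")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    # Deep copy so the nested proxyConfiguration default is never shared.
    run_input = {**copy.deepcopy(DEFAULTS), **overrides}
    if shop_url:
        run_input["shopUrl"] = shop_url
    if start_urls:
        # startUrls entries use the {"url": ...} shape from the usage examples.
        run_input["startUrls"] = [{"url": url} for url in start_urls]
    return run_input
```

For example, `build_input("allbirds.com", maxProducts=20)` yields an input suitable for a quick validation run, while calling `build_input()` with no URL raises `ValueError`.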


Output Data

Each dataset item contains normalized product or variant data:

| Field | Type | Description |
|---|---|---|
| id | Number | Shopify product ID. |
| variant_id | Number | Shopify variant ID (when available). |
| title | String | Product title. |
| handle | String | Product handle used in product URL paths. |
| description | String | Clean text description. |
| vendor | String | Product vendor/brand. |
| product_type | String | Product category/type. |
| tags | Array | Product tags. |
| variant_title | String | Variant title from store catalog. |
| variant_name | String | Human-readable option summary. |
| sku | String | SKU value. |
| barcode | String | Variant barcode if provided. |
| price | Number | Current price. |
| compare_at_price | Number | Compare-at price if provided. |
| currency | String | Currency code. |
| available | Boolean | Availability status. |
| inventory_quantity | Number | Inventory quantity if provided. |
| inventory_policy | String | Inventory policy if provided. |
| weight | Number | Variant weight if provided. |
| weight_unit | String | Weight unit if provided. |
| requires_shipping | Boolean | Shipping requirement flag. |
| images | Array | Image URLs for product/variant. |
| featured_image | String | Primary image URL. |
| url | String | Product URL (variant URL when variant ID exists). |
| created_at | String | Product creation timestamp. |
| updated_at | String | Last update timestamp. |
| published_at | String | Publish timestamp. |
| store_url | String | Source store base URL. |
| source_type | String | Source feed classification for the record. |
| source_endpoint | String | Source endpoint URL used for extraction. |
| scraped_at | String | Timestamp when record was scraped. |
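
Because several fields are only present "if provided", downstream code should read them defensively. As a small sketch of that, the hypothetical helper `discount_percent` below derives a markdown percentage from `price` and `compare_at_price`; it is an example of consuming the output, not something the actor computes.

```python
from typing import Any, Optional


def discount_percent(record: dict[str, Any]) -> Optional[float]:
    """Return the discount relative to compare_at_price, in percent.

    Returns None when either price is missing or when the compare-at
    price is not higher than the current price.
    """
    price = record.get("price")
    compare_at = record.get("compare_at_price")
    if price is None or not compare_at or compare_at <= price:
        return None
    return round((compare_at - price) / compare_at * 100, 1)
```

With `price` 75 and `compare_at_price` 120, the helper returns 37.5; a record without `compare_at_price` returns `None`.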

Usage Examples

Basic Store Run

{
  "shopUrl": "https://www.allbirds.com",
  "maxProducts": 50
}

Direct Website URL (Auto Detection)

{
  "shopUrl": "colourpop.com",
  "maxProducts": 100,
  "includeVariants": true
}

Start from Specific URLs

{
  "startUrls": [
    { "url": "https://www.allbirds.com/collections/all" },
    { "url": "https://www.allbirds.com/products/allbirds-laces" }
  ],
  "maxProducts": 80
}

Collection-Specific Run

{
  "shopUrl": "https://www.tentree.com",
  "collection": "mens",
  "maxProducts": 120,
  "includeOutOfStock": false
}
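
Inputs like the ones above can also be submitted from code. The sketch below assumes the `apify-client` Python package, whose `ApifyClient` exposes `actor(...).call(run_input=...)` and `dataset(...).iterate_items()`. The actor ID is a placeholder because it is not stated on this page, and `run_scraper` is a hypothetical wrapper; the client is passed in so the function can be exercised with a stub.

```python
from typing import Any, Iterator

# Placeholder: the actor's real ID is not stated on this page.
ACTOR_ID = "<username>/<actor-name>"


def run_scraper(client: Any, run_input: dict[str, Any],
                actor_id: str = ACTOR_ID) -> Iterator[dict[str, Any]]:
    """Start a run, wait for it to finish, and yield its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()


# Usage (requires `pip install apify-client` and an API token):
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   run_input = {
#       "shopUrl": "https://www.tentree.com",
#       "collection": "mens",
#       "maxProducts": 120,
#       "includeOutOfStock": False,
#   }
#   for item in run_scraper(client, run_input):
#       print(item.get("title"), item.get("price"))
```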

Sample Output

{
  "id": 6930107236432,
  "variant_id": 40482722414672,
  "title": "Women's Tree Runner Go - Natural Black (Blizzard Sole)",
  "handle": "womens-tree-runner-go-natural-black-blizzard",
  "description": "Made to Go with the flow, our fan-fave sneaker keeps its signature breathable comfort while hitting the refresh button with a new elevated aesthetic and more springy support.",
  "vendor": "Allbirds",
  "product_type": "Shoes",
  "tags": [
    "allbirds::complete => true",
    "allbirds::gender => womens",
    "shoprunner"
  ],
  "variant_title": "5",
  "variant_name": "Size: 5",
  "sku": "A10608W050",
  "price": 75,
  "compare_at_price": 120,
  "currency": "USD",
  "available": true,
  "requires_shipping": true,
  "images": [
    "https://cdn.shopify.com/s/files/1/1104/4168/files/A10609_24Q2_Tree-Runner-GO-Natural-Black-Blizzard_PDP_LEFT.png"
  ],
  "featured_image": "https://cdn.shopify.com/s/files/1/1104/4168/files/A10609_24Q2_Tree-Runner-GO-Natural-Black-Blizzard_PDP_LEFT.png",
  "url": "https://www.allbirds.com/products/womens-tree-runner-go-natural-black-blizzard?variant=40482722414672",
  "created_at": "2024-02-09T02:48:11.000Z",
  "updated_at": "2026-03-12T06:09:05.000Z",
  "published_at": "2026-03-12T00:34:15.000Z",
  "store_url": "https://www.allbirds.com",
  "source_type": "store_products_json",
  "source_endpoint": "https://www.allbirds.com/products.json?limit=250&page=1",
  "scraped_at": "2026-03-12T06:09:04.855Z"
}
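
The `source_endpoint` value in the sample points at the store's `products.json` feed with `limit` and `page` query parameters. The Python sketch below shows how paged URLs of that shape could be generated up to a `maxPages`-style cap. The function name is hypothetical, and the collection path `/collections/<handle>/products.json` is an assumption about the storefront URL layout, not something this page confirms the actor requests.

```python
from typing import Iterator, Optional
from urllib.parse import urlencode


def products_json_urls(store_url: str, max_pages: int, limit: int = 250,
                       collection: Optional[str] = None) -> Iterator[str]:
    """Yield paged feed URLs shaped like the sample's source_endpoint."""
    base = store_url.rstrip("/")
    if collection and collection != "all":
        # Assumption: a collection feed lives under /collections/<handle>/.
        path = f"/collections/{collection}/products.json"
    else:
        path = "/products.json"
    for page in range(1, max_pages + 1):
        query = urlencode({"limit": limit, "page": page})
        yield f"{base}{path}?{query}"
```

A real crawler would normally stop as soon as a page comes back empty rather than always requesting every page up to the cap.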

Tips for Best Results

Choose Valid Store URLs

  • Prefer canonical store homepages when possible.
  • If using direct pages, ensure they belong to the target store domain.

Start Small, Then Scale

  • Begin with maxProducts: 20 for quick validation.
  • Increase limits after confirming output quality.

Use Proxies for Protected Stores

  • Keep proxyConfiguration enabled for better consistency.
  • Residential proxy groups are recommended for stricter targets.

Control Variant Volume

  • Use includeVariants: false when you only need one record per product.
  • Keep includeVariants: true for complete size/color analytics.
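
If a run was made with includeVariants: true but a report only needs one row per product, the variant records can be collapsed afterwards. The following Python sketch groups records by `id`; the helper name and the summary fields (`variant_count`, `min_price`, `any_available`) are illustrative choices, not part of the actor's output.

```python
from typing import Any, Iterable


def summarize_products(records: Iterable[dict[str, Any]]) -> dict[Any, dict[str, Any]]:
    """Collapse variant-level records into one summary per product id."""
    summary: dict[Any, dict[str, Any]] = {}
    for record in records:
        entry = summary.setdefault(record["id"], {
            "title": record.get("title"),
            "variant_count": 0,
            "min_price": None,
            "any_available": False,
        })
        entry["variant_count"] += 1
        price = record.get("price")
        # Track the lowest price seen across the product's variants.
        if price is not None and (entry["min_price"] is None or price < entry["min_price"]):
            entry["min_price"] = price
        # A product counts as available if any of its variants is.
        entry["any_available"] = entry["any_available"] or bool(record.get("available"))
    return summary
```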

Integrations

Connect scraped data with:

  • Google Sheets: Build shareable product trackers.
  • Airtable: Create searchable product databases.
  • Slack: Notify teams on catalog updates.
  • Webhooks: Push results into your own services.
  • Make: Automate no-code workflows.
  • Zapier: Trigger downstream actions.

Export Formats

  • JSON: For applications and APIs.
  • CSV: For spreadsheets and analysis.
  • Excel: For business reporting.
  • XML: For system integrations.
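
When converting downloaded JSON items to CSV yourself, keep in mind that empty fields are removed from the output, so records may not all share the same keys. The Python sketch below builds the header from the union of keys before writing. The helper name `write_csv` and the choice to join array values with `|` are assumptions for illustration and are independent of the platform's own CSV export.

```python
import csv
from typing import Any, Iterable, TextIO


def write_csv(records: Iterable[dict[str, Any]], out: TextIO) -> None:
    """Write records with differing key sets to a single CSV."""
    rows = list(records)
    # Collect the union of keys, preserving first-seen order.
    fieldnames: list[str] = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    for row in rows:
        # Assumption: flatten arrays (tags, images) into a |-joined string.
        writer.writerow({
            key: "|".join(map(str, value)) if isinstance(value, list) else value
            for key, value in row.items()
        })


# Usage:
#   with open("products.csv", "w", newline="", encoding="utf-8") as f:
#       write_csv(items, f)
```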

Frequently Asked Questions

Can I use a normal website URL instead of a Shopify subdomain?

Yes. You can pass direct store URLs like https://www.allbirds.com or even allbirds.com.

What if I only have product or collection URLs?

Use startUrls. The actor will resolve the store domain and collect available catalog data.
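
To make the idea of resolving a store domain concrete, here is a minimal Python sketch that reduces a bare domain or any page URL to a base URL. `store_base_url` is a hypothetical helper, and defaulting to HTTPS for bare domains is an assumption; the actor's own resolution logic is not documented here.

```python
from urllib.parse import urlsplit


def store_base_url(url: str) -> str:
    """Reduce a bare domain or a page URL to scheme + host."""
    candidate = url.strip()
    if "://" not in candidate:
        # Assumption: bare domains such as "allbirds.com" default to HTTPS.
        candidate = "https://" + candidate
    parts = urlsplit(candidate)
    if not parts.netloc:
        raise ValueError(f"Cannot find a host in {url!r}")
    return f"{parts.scheme}://{parts.netloc.lower()}"
```

For example, `store_base_url("https://www.allbirds.com/products/allbirds-laces")` gives `https://www.allbirds.com`, and `store_base_url("allbirds.com")` gives `https://allbirds.com`.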

Will duplicate rows appear in the dataset?

No. Output is deduplicated at record level and cleaned before saving.

How are non-Shopify sites handled?

They are detected and skipped safely. The run still completes successfully.

Can I collect only in-stock products?

Yes. Set includeOutOfStock to false.

How do I keep runs fast for QA checks?

Use lower limits such as maxProducts: 20 and keep maxPages modest for test runs.


Support

For issues or feature requests, use Apify Console support.

This actor is intended for legitimate data collection. You are responsible for complying with target-site terms and applicable laws. Use reasonable request volume and handle data responsibly.