Shopify Product Scraper · Full Catalog Export by Store URL

Pricing

$2.99 / 1,000 results

Turn any public Shopify storefront URL into a clean, export-ready product table: titles, handles, variants, SKUs, prices, availability, and image links. Ideal for merchandising teams, market research, and catalog backups. For custom pipelines or private catalog access, contact corentin@outreacher.fr

Developer

Corentin Robert

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 3 days ago

Shopify Store Product Scraper

Turn any public Shopify storefront into a clean product dataset you can open in Excel, plug into BI tools, or feed into your stack. Shopify only — use the shop URL your customers see (*.myshopify.com or a custom domain on Shopify). Other platforms are rejected before scraping with a clear error.

Built for: competitor benchmarking · marketplace research · catalog backups · merchandising analysis · enrichment pipelines

What it does

The Actor walks the public side of the store—the same information visible to shoppers—and assembles one row per variant (spreadsheet-friendly). It automatically tries the fastest path first and falls back when a store hides its default catalog view, so you get the broadest coverage possible without touching the merchant’s admin or private APIs.

By default there is no product limit per store (full public catalog). Use Maximum products per store only when you want a smaller sample.
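
To make the one-row-per-variant shape concrete, here is a minimal sketch of how a product with several variants could be flattened into rows. It is not the Actor’s actual code: productToRows is a hypothetical helper, and the input is assumed to look like the product objects in Shopify’s public product JSON (id, title, handle, vendor, product_type, tags, variants).

```javascript
// Hypothetical helper, not taken from src/main.js.
// Assumes `product` is shaped like an entry in Shopify's public product JSON.
function productToRows(shopUrl, product) {
  // Drop trailing slashes so product URLs are built consistently.
  const base = shopUrl.replace(/\/+$/, '');
  return (product.variants || []).map((variant) => ({
    shopUrl: base,
    productId: product.id,
    variantId: variant.id,
    title: product.title,
    handle: product.handle,
    vendor: product.vendor,
    productType: product.product_type,
    tags: product.tags,
    sku: variant.sku,
    price: variant.price,
    available: variant.available,
    option1: variant.option1,
    productUrl: `${base}/products/${product.handle}`,
  }));
}
```

A product with three variants yields three rows that repeat the product-level columns (title, handle, vendor) and differ only in the variant-level ones (variantId, sku, price, option1).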

What you get

Category | Examples of what’s captured
Identity & links | Product and variant IDs, title, URL, handle
Merchandising | Brand/vendor, type, tags, options (size, color, …)
Pricing & stock signal | Price, compare-at price, currencies, availability when shown publicly
Logistics-friendly | SKU, barcode, weight, shipping/tax flags when exposed
Media | Main image, variant image, ordered gallery links
Timestamps | Created/updated where the storefront exposes them

Fixed export shape (by design): no full body HTML on rows, no one-row-per-product mode, and no extra page-enrich columns—keeps runs fast and files easy to work with. For a custom build (fork), change the EXPORT_* and RUN_VERBOSE_LOGS constants at the top of src/main.js.

Not included: warehouse quantities, cost, private metafields, or anything that only exists in the merchant admin—that requires the merchant’s own APIs and permissions.

Important to know

  • Public catalog only — no Shopify login, no Admin API, no secrets you don’t already have as a visitor.
  • Some shops limit how much of the catalog is exposed automatically. The Actor uses fallback strategies; if a store is fully locked down, the run stops with a clear message. Running from Apify with Proxy can help when the store treats cloud traffic differently than a home browser.
  • Terms and law — use the data in line with the store’s terms and your jurisdiction. You’re responsible for compliance.

Pricing (Apify Store)

On the Apify Store, this Actor uses pay-per-event pricing: you pay for what lands in your dataset.

What you pay for | How it counts
Exported rows | $2.99 per 1,000 rows (billed proportionally). Each row is one dataset item, in practice usually one row per product variant, not per parent product.
Run start | A small default Actor start fee (an Apify synthetic event, kept minimal).

Quick examples (event charges only, rounded):

  • 1,000 rows → about $2.99
  • 5,000 rows → about $14.95
  • 10,000 rows → about $29.90
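
The arithmetic behind these examples can be sketched as a small function. estimateRowCharge is a hypothetical name; it covers only the per-row event charge at $2.99 per 1,000 rows and leaves out the run-start fee and any platform charges.

```javascript
// Price taken from the pricing table: $2.99 per 1,000 rows = 299 cents.
const CENTS_PER_1000_ROWS = 299;

// Returns the estimated row charge in USD, rounded to the nearest cent.
// Excludes the Actor start fee, compute units, and proxy usage.
function estimateRowCharge(rowCount) {
  if (!Number.isInteger(rowCount) || rowCount < 0) {
    throw new RangeError('rowCount must be a non-negative integer');
  }
  // Work in cents to avoid floating-point drift, then convert to dollars.
  const cents = Math.round((rowCount * CENTS_PER_1000_ROWS) / 1000);
  return cents / 100;
}
```

Because rows are variants, a rough row count for a store is the number of products multiplied by the average number of variants per product.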

Your Apify plan, compute (CU), and proxy usage may add separate platform charges—see Apify pricing and the run’s usage tab.

Control your bill: use Maximum products per store to cap how many products are processed (fewer products → fewer variants → fewer rows). In the Apify Console you can also set a maximum cost per run for pay-per-event Actors.

How to use (Apify Console)

  1. Open the Actor and add Shopify storefront URL(s) (field Shopify store URLs).
  2. Leave Maximum products per store empty (or 0) for the full catalog; enter a number (e.g. 25) only to cap how many products per store.
  3. Turn on Apify Proxy if the default run can’t reach the site reliably from the cloud.
  4. Click Start, then download from the Dataset tab (JSON, CSV, Excel, or HTML export). Use the Catalog overview view for a slim column set, or All fields for every captured column; there is no separate “full export” view.

During the run, Key-Value Store → RUN_LOG shows a plain-language progress log.

Input

Field | Type | Description
startUrls | string[] | Required. Shopify-only public storefront URLs (with or without https://). Non-Shopify sites fail the pre-check.
maxProducts | integer | Omit, leave empty, or set 0 for the full public catalog per store. Any positive integer caps products (not variants) per store.
proxyConfiguration | object | Apify Proxy settings.

Optional fields (API or input.json only — not in the Console form): batchSize (default 5), handleRequestConcurrency (default 3), handleBatchCooldownMs (default 900), maxSitemapIndexes (default 10) for tuning large or strict storefronts. Omit for normal runs.

Full catalog (default)

{
  "startUrls": ["https://www.example.com/"]
}

Sample only (25 products per store)

{
  "startUrls": ["https://www.example.com/"],
  "maxProducts": 25
}
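
When the input is assembled in code rather than typed into the Console, a small builder keeps it consistent. The sketch below is illustrative only: normalizeStoreUrl, buildRunInput, and the gentle flag are hypothetical names, and reducing each URL to its origin is an assumption (the Actor itself accepts URLs with or without https://). The field names and the slow-down values (handleBatchCooldownMs 4000, handleRequestConcurrency 1) are the ones mentioned in this README.

```javascript
// Adds https:// when the scheme is missing and keeps only the origin.
function normalizeStoreUrl(raw) {
  const trimmed = String(raw).trim();
  const withScheme = /^https?:\/\//i.test(trimmed) ? trimmed : `https://${trimmed}`;
  return new URL(withScheme).origin;
}

// Builds an input object using the field names from the Input table.
// `gentle` is a hypothetical switch for large or strict storefronts.
function buildRunInput(storeUrls, { maxProducts = 0, gentle = false } = {}) {
  if (!Number.isInteger(maxProducts) || maxProducts < 0) {
    throw new RangeError('maxProducts must be a non-negative integer');
  }
  // De-duplicate stores that were entered in different spellings.
  const input = { startUrls: [...new Set(storeUrls.map(normalizeStoreUrl))] };
  // 0 means "full catalog", so the field is simply omitted.
  if (maxProducts > 0) input.maxProducts = maxProducts;
  if (gentle) {
    input.handleBatchCooldownMs = 4000;
    input.handleRequestConcurrency = 1;
  }
  return input;
}
```

For example, buildRunInput(['www.example.com', 'https://www.example.com/'], { maxProducts: 25 }) produces a single start URL with a 25-product cap.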

Output example

One row per variant (typical):

{
  "shopUrl": "https://www.example.com",
  "productId": 1234567890,
  "variantId": 9876543210,
  "title": "Example frame",
  "handle": "example-frame",
  "vendor": "Example Brand",
  "productType": "Eyewear",
  "tags": ["optical", "new"],
  "sku": "SKU-001",
  "price": "189.00",
  "currency": "EUR",
  "available": true,
  "option1": "Blue",
  "imageUrl": "https://cdn.shopify.com/.../image.jpg",
  "productUrl": "https://www.example.com/products/example-frame",
  "scrapedAt": "2026-04-09T12:00:00.000Z"
}

Download results from the Dataset tab in the format you need.
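
Since the export has no one-row-per-product mode, a per-product summary has to be rebuilt downstream. The following is a minimal sketch, assuming rows shaped like the output example above; groupRowsByProduct and the summary fields variantCount, minPrice, and anyAvailable are hypothetical names introduced here.

```javascript
// Collapses variant rows into one summary object per product.
// Rows are keyed by shop and product ID so multi-store datasets do not collide.
function groupRowsByProduct(rows) {
  const products = new Map();
  for (const row of rows) {
    const key = `${row.shopUrl}#${row.productId}`;
    if (!products.has(key)) {
      products.set(key, {
        shopUrl: row.shopUrl,
        productId: row.productId,
        title: row.title,
        handle: row.handle,
        variantCount: 0,
        minPrice: null,
        anyAvailable: false,
      });
    }
    const summary = products.get(key);
    summary.variantCount += 1;
    // Prices arrive as strings such as "189.00"; skip rows without a usable price.
    const price = Number.parseFloat(row.price);
    if (Number.isFinite(price) && (summary.minPrice === null || price < summary.minPrice)) {
      summary.minPrice = price;
    }
    if (row.available === true) summary.anyAvailable = true;
  }
  return [...products.values()];
}
```

The result keeps products in the order their first variant appears in the dataset.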

Local development

npm install
npm start

Optional: when not running on Apify, settings from an input.json file at the project root are merged into the run input.

If the CLI reports missing input after clearing storage:

npm run reset-apify-input

If a local run can’t connect: check network, VPN, and any system proxy settings. For deeper request detail, set RUN_VERBOSE_LOGS to true in src/main.js (developer / fork only).

If a large catalog slows down or stalls: pass handleBatchCooldownMs (e.g. 4000) and handleRequestConcurrency (1 or 2) in API input, or use Apify Residential proxy for strict storefronts.

On local runs, output.csv is also written at the project root (Excel-friendly, ; separator) so you can open results immediately.
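
If you want to read that semicolon-separated file in a script instead of Excel, a small parser is enough for simple cases. This is a sketch under stated assumptions: the README does not specify how output.csv quotes values, so the code assumes the common convention of double-quoted fields with "" as an escaped quote, and it does not handle line breaks inside a field. Both function names are hypothetical.

```javascript
// Splits one ';'-separated line, honouring "quoted" fields and "" escapes.
function parseSemicolonLine(line) {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i += 1) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // escaped quote inside a quoted field
        i += 1;
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ';') {
      fields.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Turns the whole file text into an array of objects keyed by the header row.
function parseSemicolonCsv(text) {
  const lines = text.replace(/^\uFEFF/, '').split(/\r?\n/).filter((l) => l.length > 0);
  if (lines.length === 0) return [];
  const [header, ...body] = lines.map(parseSemicolonLine);
  return body.map((cells) => Object.fromEntries(header.map((name, i) => [name, cells[i] ?? ''])));
}
```

Pass it the file contents as a string, for example from fs.readFileSync('output.csv', 'utf8').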

Tests: npm test

FAQ

Can I get stock levels or wholesale prices?
Only what the public storefront shows. True inventory and margin data live behind merchant tools.

Is this an official Shopify integration?
No—it reads the same public pages and feeds a shopper would see, assembled into a dataset for you.

What if I paste a non-Shopify URL?
The Actor runs a quick Shopify check (catalog JSON, theme assets, or cart API). If the site does not look like Shopify, the run stops with an error and no catalog is produced.
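
As an illustration of what such a check can look like, the sketch below classifies responses that have already been fetched from a site. It is not the Actor’s actual pre-check: shopifySignals and looksLikeShopify are hypothetical names, and the specific heuristics (a products array in the catalog JSON, cdn.shopify.com or Shopify.theme in the homepage HTML, a token and items array in the cart JSON) are assumptions about typical public Shopify responses.

```javascript
// Each property is optional and holds something already fetched from the site:
//   productsJson - parsed body of the public catalog JSON
//   homepageHtml - raw HTML of the storefront homepage
//   cartJson     - parsed body of the public cart endpoint
function shopifySignals({ productsJson, homepageHtml, cartJson } = {}) {
  const signals = [];
  if (productsJson && Array.isArray(productsJson.products)) {
    signals.push('catalog-json');
  }
  if (typeof homepageHtml === 'string' && /cdn\.shopify\.com|Shopify\.theme/i.test(homepageHtml)) {
    signals.push('theme-assets');
  }
  if (cartJson && typeof cartJson.token === 'string' && Array.isArray(cartJson.items)) {
    signals.push('cart-api');
  }
  return signals;
}

// A site is treated as Shopify when at least one signal is present.
function looksLikeShopify(evidence) {
  return shopifySignals(evidence).length > 0;
}
```

Returning the list of matched signals, rather than only a boolean, makes it easier to log why a URL was accepted or rejected.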

Why fewer rows than the website?
Draft items, hidden channels, or geo-gated catalogs may not appear in the public export.

How do I get help or a custom export?
See Support below.

Support

Contact corentin@outreacher.fr for custom scrapers, authenticated catalog access, or tailored fields.

Ready? Add your store URL, optionally cap Maximum products per store, and start a run.