Shopify Product Scraper · Full Catalog Export by Store URL
Pricing
$2.99 / 1,000 results
Turn any public Shopify storefront URL into a clean, export-ready product table: titles, handles, variants, SKUs, prices, availability, and image links. Ideal for merchandising teams, market research, and catalog backups. For custom pipelines or private catalog access, contact corentin@outreacher.fr
Developer: Corentin Robert
Last modified: 3 days ago
Shopify Store Product Scraper
Turn any public Shopify storefront into a clean product dataset you can open in Excel, plug into BI tools, or feed into your stack. Shopify only — use the shop URL your customers see (*.myshopify.com or a custom domain on Shopify). Other platforms are rejected before scraping with a clear error.
Built for: competitor benchmarking · marketplace research · catalog backups · merchandising analysis · enrichment pipelines
What it does
The Actor walks the public side of the store—the same information visible to shoppers—and assembles one row per variant (spreadsheet-friendly). It automatically tries the fastest path first and falls back when a store hides its default catalog view, so you get the broadest coverage possible without touching the merchant’s admin or private APIs.
By default there is no product limit per store (full public catalog). Use Maximum products per store only when you want a smaller sample.
What you get
| Category | Examples of what’s captured |
|---|---|
| Identity & links | Product and variant IDs, title, URL, handle |
| Merchandising | Brand/vendor, type, tags, options (size, color, …) |
| Pricing & stock signal | Price, compare-at price, currencies, availability when shown publicly |
| Logistics-friendly | SKU, barcode, weight, shipping/tax flags when exposed |
| Media | Main image, variant image, ordered gallery links |
| Timestamps | Created/updated where the storefront exposes them |
Fixed export shape (by design): no full body HTML on rows, no one-row-per-product mode, and no extra page-enrich columns—keeps runs fast and files easy to work with. For a custom build (fork), change the EXPORT_* and RUN_VERBOSE_LOGS constants at the top of src/main.js.
Not included: warehouse quantities, cost, private metafields, or anything that only exists in the merchant admin—that requires the merchant’s own APIs and permissions.
Important to know
- Public catalog only — no Shopify login, no Admin API, no secrets you don’t already have as a visitor.
- Some shops limit how much of the catalog is exposed automatically. The Actor uses fallback strategies; if a store is fully locked down, the run stops with a clear message. Running on Apify with Apify Proxy enabled can help when the store treats cloud traffic differently from a home browser.
- Terms and law — use the data in line with the store’s terms and your jurisdiction. You’re responsible for compliance.
Pricing (Apify Store)
On the Apify Store, this Actor uses pay-per-event pricing: you pay for what lands in your dataset.
| What you pay for | How it counts |
|---|---|
| Exported rows | $2.99 per 1,000 rows (billed proportionally). Each row is one dataset item—in practice usually one row per product variant, not per parent product. |
| Run start | A small default Actor start fee (Apify synthetic event—kept minimal). |
Quick examples (event charges only, rounded):
- 1,000 rows → about $2.99
- 5,000 rows → about $14.95
- 10,000 rows → about $29.90
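The arithmetic behind these examples can be sketched as a small helper. This is only an illustration: the function name `estimateRowCost` is hypothetical, and it models the per-row event alone, not the Actor start fee or any platform charges.

```javascript
// Assumption: only the per-row event is modelled; the Actor start fee,
// compute units, and proxy traffic are billed separately and not included.
const PRICE_CENTS_PER_1000_ROWS = 299; // $2.99 per 1,000 rows

function estimateRowCost(rowCount) {
  if (!Number.isInteger(rowCount) || rowCount < 0) {
    throw new RangeError('rowCount must be a non-negative integer');
  }
  // Work in cents to limit floating-point drift, then round to the nearest cent.
  const cents = Math.round((rowCount * PRICE_CENTS_PER_1000_ROWS) / 1000);
  return cents / 100;
}

// A store with 400 products averaging 6 variants each yields about 2,400 rows.
console.log(estimateRowCost(400 * 6).toFixed(2)); // 7.18
```

Because billing is per row and rows are usually variants, a rough variant count is a better basis for an estimate than the product count.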
Your Apify plan, compute (CU), and proxy usage may add separate platform charges—see Apify pricing and the run’s usage tab.
Control your bill: use Maximum products per store to cap how many products are processed (fewer products → fewer variants → fewer rows). In the Apify Console you can also set a maximum cost per run for pay-per-event Actors.
How to use (Apify Console)
- Open the Actor and add Shopify storefront URL(s) (field Shopify store URLs).
- Leave Maximum products per store empty (or 0) for the full catalog; enter a number (e.g. 25) only to cap how many products are processed per store.
- Turn on Apify Proxy if the default run can’t reach the site reliably from the cloud.
- Click Start, then download from the Dataset tab (JSON, CSV, Excel, or HTML export). Use the Catalog overview view for a slim column set, or All fields for every captured column (no duplicate “full export” view).
During the run, Key-Value Store → RUN_LOG shows a plain-language progress log.
Input
| Field | Type | Description |
|---|---|---|
| startUrls | string[] | Required. Shopify-only public storefront URLs (with or without https://). Non-Shopify sites fail the pre-check. |
| maxProducts | integer | Omit, empty, or 0 = full public catalog per store. Any positive integer caps products (not variants) per store. |
| proxyConfiguration | object | Apify Proxy settings. |
Optional fields (API or input.json only — not in the Console form): batchSize (default 5), handleRequestConcurrency (default 3), handleBatchCooldownMs (default 900), maxSitemapIndexes (default 10) for tuning large or strict storefronts. Omit for normal runs.
Full catalog (default)
{"startUrls": ["https://www.example.com/"]}
Sample only (25 products per store)
{"startUrls": ["https://www.example.com/"],"maxProducts": 25}
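The input rules above (URLs with or without https://, and an omitted, empty, or 0 maxProducts meaning no cap) could be interpreted roughly as follows. This is a minimal sketch: `normalizeInput` is a hypothetical helper, not the Actor's actual validation code, and reducing each URL to its origin is an assumption made for illustration.

```javascript
// Hypothetical helper: the Actor's real validation code is not shown in this README.
function normalizeInput(raw) {
  const urls = Array.isArray(raw.startUrls) ? raw.startUrls : [];
  if (urls.length === 0) {
    throw new Error('startUrls is required');
  }
  const startUrls = urls.map((value) => {
    const trimmed = String(value).trim();
    // Accept URLs with or without a scheme, as the field description allows.
    const withScheme = /^https?:\/\//i.test(trimmed) ? trimmed : `https://${trimmed}`;
    // Assumption: only the origin matters for identifying the store.
    return new URL(withScheme).origin;
  });
  // Omitted, empty, or 0 means "no cap"; represented here as null.
  const parsed = Number(raw.maxProducts);
  const maxProducts = Number.isInteger(parsed) && parsed > 0 ? parsed : null;
  return { startUrls, maxProducts };
}

console.log(normalizeInput({ startUrls: ['www.example.com'], maxProducts: 25 }));
```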
Output example
One row per variant (typical):
{"shopUrl": "https://www.example.com","productId": 1234567890,"variantId": 9876543210,"title": "Example frame","handle": "example-frame","vendor": "Example Brand","productType": "Eyewear","tags": ["optical", "new"],"sku": "SKU-001","price": "189.00","currency": "EUR","available": true,"option1": "Blue","imageUrl": "https://cdn.shopify.com/.../image.jpg","productUrl": "https://www.example.com/products/example-frame","scrapedAt": "2026-04-09T12:00:00.000Z"}
Download results from the Dataset tab in the format you need.
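To illustrate how one product becomes several rows, here is a minimal sketch that flattens a product into one row per variant using the field names from the example above. It assumes the product object resembles the shape of Shopify's public products JSON (`id`, `title`, `handle`, `vendor`, `product_type`, `tags`, `variants`, `images`); the function `toVariantRows` and its parameters are hypothetical and not taken from the Actor's source.

```javascript
// Assumption: `product` resembles an entry of a Shopify storefront's public
// products JSON. The Actor's actual mapping is not shown in this README.
function toVariantRows(shopUrl, product, currency, scrapedAt) {
  const images = product.images || [];
  const mainImage = images.length > 0 ? images[0].src : null;
  // One output row per variant; product-level fields repeat on every row.
  return (product.variants || []).map((variant) => ({
    shopUrl,
    productId: product.id,
    variantId: variant.id,
    title: product.title,
    handle: product.handle,
    vendor: product.vendor,
    productType: product.product_type,
    tags: product.tags || [],
    sku: variant.sku || null,
    price: variant.price,
    currency,
    available: variant.available,
    option1: variant.option1 || null,
    imageUrl: mainImage,
    productUrl: `${shopUrl}/products/${product.handle}`,
    scrapedAt,
  }));
}

const sampleProduct = {
  id: 1234567890,
  title: 'Example frame',
  handle: 'example-frame',
  vendor: 'Example Brand',
  product_type: 'Eyewear',
  tags: ['optical', 'new'],
  images: [{ src: 'https://cdn.example.com/image.jpg' }],
  variants: [
    { id: 1, sku: 'SKU-001', price: '189.00', available: true, option1: 'Blue' },
    { id: 2, sku: 'SKU-002', price: '189.00', available: false, option1: 'Black' },
  ],
};

const rows = toVariantRows('https://www.example.com', sampleProduct, 'EUR', new Date().toISOString());
console.log(rows.length); // 2
```

A product with two variants yields two rows, which is why row counts (and billing) track variants rather than parent products.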
Local development
npm install
npm start
Optional: when not running on Apify, settings from an input.json file at the project root are merged into the run input.
If the CLI reports missing input after clearing storage:
npm run reset-apify-input
If a local run can’t connect: check network, VPN, and any system proxy settings. For deeper request detail, set RUN_VERBOSE_LOGS to true in src/main.js (developer / fork only).
If a large catalog slows down or stalls: pass handleBatchCooldownMs (e.g. 4000) and handleRequestConcurrency (1 or 2) in API input, or use Apify Residential proxy for strict storefronts.
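As a sketch of the tuning described above, the snippet below layers the slower settings on top of an existing input object. The helper name `withStrictStoreTuning` is hypothetical, and the values are only the starting points suggested here, not measured optima.

```javascript
// Assumption: these values are suggested starting points, not measured optima.
function withStrictStoreTuning(input) {
  return {
    ...input,
    handleBatchCooldownMs: 4000, // default is 900
    handleRequestConcurrency: 1, // default is 3
  };
}

const tuned = withStrictStoreTuning({ startUrls: ['https://www.example.com/'] });
console.log(JSON.stringify(tuned, null, 2));
```

The resulting object can be passed as API input or saved as input.json for a local run.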
On local runs, output.csv is also written at the project root (Excel-friendly, ; separator) so you can open results immediately.
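For readers who want to reproduce a similar file from dataset rows, a semicolon-separated export might look like the sketch below. It is an illustration only: `toSemicolonCsv` is a hypothetical function, and the quoting rules and the comma-joined handling of array fields such as tags are assumptions, not the Actor's documented behavior.

```javascript
// Hypothetical sketch; the Actor's own CSV writer is not shown in this README.
function toSemicolonCsv(rows, columns) {
  const escapeCell = (value) => {
    if (value === null || value === undefined) return '';
    // Assumption: array fields (e.g. tags) are joined with commas.
    const text = Array.isArray(value) ? value.join(',') : String(value);
    // Quote cells containing the separator, quotes, or line breaks.
    return /[;"\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const lines = [columns.join(';')];
  for (const row of rows) {
    lines.push(columns.map((column) => escapeCell(row[column])).join(';'));
  }
  return lines.join('\r\n');
}

const csv = toSemicolonCsv(
  [{ title: 'Frame; blue', price: '189.00', tags: ['optical', 'new'] }],
  ['title', 'price', 'tags'],
);
console.log(csv);
```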
Tests: npm test
FAQ
Can I get stock levels or wholesale prices?
Only what the public storefront shows. True inventory and margin data live behind merchant tools.
Is this an official Shopify integration?
No—it reads the same public pages and feeds a shopper would see, assembled into a dataset for you.
What if I paste a non-Shopify URL?
The Actor runs a quick Shopify check (catalog JSON, theme assets, or cart API). If the site does not look like Shopify, the run stops with an error and no catalog is produced.
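The three signals mentioned above can be combined along these lines. This is a hedged sketch rather than the Actor's actual pre-check: the `evidence` object and its field names are hypothetical, and the caller is assumed to have already fetched the catalog JSON, the storefront HTML, and the cart response.

```javascript
// Assumption: `evidence` holds responses the caller already fetched from the site.
// Field names are hypothetical; the Actor's real pre-check is not documented here.
function looksLikeShopify(evidence) {
  const signals = [];
  const isObject = (value) => typeof value === 'object' && value !== null;

  // Catalog JSON: a response object carrying a `products` array.
  if (isObject(evidence.productsJson) && Array.isArray(evidence.productsJson.products)) {
    signals.push('catalog-json');
  }
  // Theme assets: storefront HTML that references Shopify's CDN.
  if (typeof evidence.html === 'string' && /cdn\.shopify\.com/i.test(evidence.html)) {
    signals.push('theme-assets');
  }
  // Cart API: a response object with a `token` and an `items` array.
  if (isObject(evidence.cartJson) && 'token' in evidence.cartJson && Array.isArray(evidence.cartJson.items)) {
    signals.push('cart-api');
  }
  return { isShopify: signals.length > 0, signals };
}

console.log(looksLikeShopify({ html: '<script src="https://cdn.shopify.com/s/theme.js"></script>' }));
```

Treating any single signal as sufficient is a design choice in this sketch; a stricter check could require two or more.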
Why does the export contain fewer items than the website shows?
Draft items, hidden channels, or geo-gated catalogs may not appear in the public export.
How do I get help or a custom export?
See Support below.
Support
Contact corentin@outreacher.fr for custom scrapers, authenticated catalog access, or tailored fields.
Ready? Add your store URL, optionally cap Maximum products per store, and start a run.